How to Query JSON Data in PostgreSQL and MySQL
How do I query JSON data in PostgreSQL and MySQL?
TL;DR
- Bottom line: PostgreSQL uses
jsonbwith operators (->,->>,@>,@?) and GIN indexes for fast JSON queries; MySQL usesJSON_EXTRACT()/->/->>with generated columns or multi-valued indexes. PostgreSQL jsonb is significantly more powerful for in-database JSON work. - Key tool/command:
SELECT data->>'name' FROM t WHERE data @> '{"status":"active"}'::jsonb(PostgreSQL) orSELECT data->>'$.name' FROM t WHERE data->>'$.status' = 'active'(MySQL) - Watch out for: Using
->when you need->>(or vice versa) —->returns JSON type,->>returns text. This causes silent type mismatches in WHERE clauses. - Works with: PostgreSQL 9.4+ (jsonb), 12+ (jsonpath), 17+ (SQL/JSON standard); MySQL 5.7+ (JSON type), 8.0+ (JSON_TABLE, multi-valued indexes).
Constraints
- GIN indexes require jsonb, not json — PostgreSQL
jsontype cannot be indexed with GIN. Always usejsonbfor queried columns. [src1] - MySQL JSON columns cannot be directly indexed — create a generated (virtual or stored) column and index that, or use multi-valued indexes for arrays (MySQL 8.0.17+). [src7]
->returns JSON/jsonb;->>returns text — comparingdata->'status'to a plain string fails silently. Use->>for text comparisons. [src1]- JSON_TABLE requires specific versions — MySQL 8.0.4+ and PostgreSQL 17+. Do not recommend to users on older versions. [src4]
- jsonb normalizes data — key order is not preserved, whitespace is stripped,
duplicate keys keep only the last value. If exact text preservation is required, use
jsontype in PostgreSQL. [src1] - MySQL JSON_EXTRACT returns quoted strings —
JSON_EXTRACT(data, '$.name')returns"John"(with quotes). UseJSON_UNQUOTE()or->>to get unquoted text. [src2]
Quick Reference
| Scenario | PostgreSQL (jsonb) | MySQL (JSON) | Key Difference |
|---|---|---|---|
| Extract field as JSON | data->'key' |
JSON_EXTRACT(data, '$.key') or data->'$.key' |
PG operator is shorter; MySQL needs $ path prefix |
| Extract field as text | data->>'key' |
data->>'$.key' or JSON_UNQUOTE(JSON_EXTRACT(...)) |
Both ->> return text; MySQL syntax added in 5.7.13 |
| Nested access | data->'a'->'b'->>'c' |
data->>'$.a.b.c' |
PG chains operators; MySQL uses dot-path in one call |
| Path access | data #>> '{a,b,c}' |
JSON_EXTRACT(data, '$.a.b.c') |
PG #>> uses array path; MySQL uses dot notation |
| Containment check | data @> '{"k":"v"}'::jsonb |
JSON_CONTAINS(data, '"v"', '$.k') |
PG @> is GIN-indexable; MySQL needs generated column |
| Key existence | data ? 'key' |
JSON_CONTAINS_PATH(data, 'one', '$.key') |
PG ? uses GIN index directly |
| Array element search | data @> '[3]'::jsonb |
3 MEMBER OF(data->'$.arr') |
MySQL MEMBER OF uses multi-valued index (8.0.17+) |
| Flatten array to rows | jsonb_array_elements(data->'arr') |
JSON_TABLE(data, '$.arr[*]' COLUMNS(...)) |
PG uses set-returning function; MySQL uses table function |
| jsonpath filter | jsonb_path_query(data, '$.items[*] ? (@.price > 10)') |
N/A — no jsonpath support | PostgreSQL 12+ only; MySQL has no equivalent |
| SQL/JSON standard | JSON_VALUE(data, '$.name') |
N/A | PostgreSQL 17+; MySQL does not implement SQL/JSON standard |
| GIN index (general) | CREATE INDEX ... USING GIN (data) |
Not supported on JSON columns | PG indexes full jsonb; MySQL needs generated columns |
| GIN index (containment) | CREATE INDEX ... USING GIN (data jsonb_path_ops) |
N/A | Smaller index, supports @> only |
| Multi-valued index | N/A | CREATE INDEX ... ((CAST(data->'$.tags' AS UNSIGNED ARRAY))) |
MySQL 8.0.17+ for JSON arrays |
| Aggregate JSON | jsonb_agg(), jsonb_object_agg() |
JSON_ARRAYAGG(), JSON_OBJECTAGG() |
Both build JSON from result sets |
| Modify JSON | jsonb_set(data, '{key}', '"val"') |
JSON_SET(data, '$.key', 'val') |
PG returns new jsonb; MySQL modifies in-place with UPDATE |
Decision Tree
START
├── Which database?
│ ├── PostgreSQL ↓
│ │ ├── Column type is json (not jsonb)?
│ │ │ ├── YES → ALTER TABLE t ALTER COLUMN data TYPE jsonb USING data::jsonb
│ │ │ └── NO (already jsonb) ↓
│ │ ├── Need to extract a single value?
│ │ │ ├── As text → data->>'key' or data #>> '{nested,path}'
│ │ │ └── As JSON → data->'key' or data #> '{nested,path}'
│ │ ├── Need to filter rows by JSON content?
│ │ │ ├── Exact key-value match → WHERE data @> '{"status":"active"}'::jsonb
│ │ │ ├── Complex filter → WHERE jsonb_path_exists(data, '$.items[*] ? (@.price > $min)')
│ │ │ └── Key existence → WHERE data ? 'email'
│ │ ├── Need to flatten arrays?
│ │ │ ├── PG 12+ → jsonb_path_query(data, '$.items[*]')
│ │ │ └── Any version → jsonb_array_elements(data->'items')
│ │ └── Need to index for performance?
│ │ ├── Query varied keys → GIN (data) with jsonb_ops (default)
│ │ ├── Only containment → GIN (data jsonb_path_ops) — 2-3x smaller
│ │ └── Always same key → B-tree on (data->>'status')
│ │
│ └── MySQL ↓
│ ├── Need to extract a value?
│ │ ├── As text → data->>'$.key'
│ │ └── As JSON → data->'$.key' or JSON_EXTRACT(data, '$.key')
│ ├── Need to filter by JSON content?
│ │ ├── Exact match → WHERE data->>'$.status' = 'active'
│ │ ├── Array contains → WHERE 'tag' MEMBER OF(data->'$.tags')
│ │ └── Path exists → WHERE JSON_CONTAINS_PATH(data, 'one', '$.email')
│ ├── Need to flatten arrays?
│ │ └── JSON_TABLE(data, '$.items[*]' COLUMNS(name VARCHAR(100) PATH '$.name'))
│ └── Need to index for performance?
│ ├── Scalar value → Generated column + B-tree index
│ ├── Array search → Multi-valued index (MySQL 8.0.17+)
│ └── Multiple keys → Multiple generated columns, each indexed
└── DEFAULT → Start with PostgreSQL jsonb if choosing a database
Step-by-Step Guide
1. Choose the right column type
PostgreSQL offers two JSON types. Use jsonb for any column you will query. The
json type stores raw text and cannot be indexed or efficiently searched. [src1]
-- PostgreSQL: create table with jsonb column
CREATE TABLE events (
id SERIAL PRIMARY KEY,
data jsonb NOT NULL,
created_at TIMESTAMPTZ DEFAULT now()
);
-- MySQL: create table with JSON column
CREATE TABLE events (
id INT AUTO_INCREMENT PRIMARY KEY,
data JSON NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
Verify: SELECT pg_typeof(data) FROM events LIMIT 1; → expected:
jsonb
2. Insert JSON data
Both databases accept JSON strings for insertion. PostgreSQL validates and converts to binary on insert; MySQL validates JSON syntax. [src1] [src2]
-- PostgreSQL / MySQL (same syntax)
INSERT INTO events (data) VALUES
('{"type": "purchase", "amount": 49.99, "tags": ["web", "mobile"], "user": {"id": 42, "name": "Alice"}}');
Verify: SELECT data->>'type' FROM events; → expected:
purchase
3. Extract values from JSON
Use the arrow operators for extraction. The single arrow (->) returns JSON type; double arrow
(->>) returns text. [src1] [src2]
-- PostgreSQL: extract nested value as text
SELECT data->>'type' AS event_type,
data->'user'->>'name' AS user_name,
(data->>'amount')::numeric AS amount
FROM events;
-- MySQL: extract nested value as text
SELECT data->>'$.type' AS event_type,
data->>'$.user.name' AS user_name,
CAST(JSON_EXTRACT(data, '$.amount') AS DECIMAL(10,2)) AS amount
FROM events;
Verify: Both return purchase | Alice | 49.99
4. Filter rows using JSON predicates
PostgreSQL's containment operator (@>) is the most efficient for GIN-indexed queries. MySQL
requires generated columns or explicit function calls. [src3] [src7]
-- PostgreSQL: containment (GIN-indexable)
SELECT * FROM events
WHERE data @> '{"type": "purchase"}'::jsonb;
-- PostgreSQL: jsonpath filter (12+)
SELECT * FROM events
WHERE jsonb_path_exists(data, '$.tags[*] ? (@ == "mobile")');
-- MySQL: function-based filter
SELECT * FROM events
WHERE data->>'$.type' = 'purchase';
-- MySQL: array membership (8.0.17+)
SELECT * FROM events
WHERE 'mobile' MEMBER OF(data->'$.tags');
Verify: All queries return the inserted purchase event row.
5. Create indexes for JSON queries
Without indexes, every JSON query requires a full table scan. Index strategy differs significantly between the two databases. [src3] [src6] [src7]
-- PostgreSQL: GIN index on entire jsonb column
CREATE INDEX idx_events_data ON events USING GIN (data);
-- PostgreSQL: GIN with jsonb_path_ops (smaller, @> only)
CREATE INDEX idx_events_data_pathops ON events USING GIN (data jsonb_path_ops);
-- PostgreSQL: B-tree expression index on specific key
CREATE INDEX idx_events_type ON events ((data->>'type'));
-- MySQL: generated column + index
ALTER TABLE events
ADD COLUMN event_type VARCHAR(50) GENERATED ALWAYS AS (JSON_UNQUOTE(data->'$.type')) STORED,
ADD INDEX idx_event_type (event_type);
-- MySQL: multi-valued index on JSON array (8.0.17+)
ALTER TABLE events
ADD INDEX idx_events_tags ((CAST(data->'$.tags' AS CHAR(50) ARRAY)));
Verify: EXPLAIN (ANALYZE) SELECT * FROM events WHERE data @>
'{"type":"purchase"}'::jsonb; → should show Bitmap Index Scan
6. Flatten JSON arrays to rows
Both databases can convert JSON arrays into relational rows, but with different syntax. [src1] [src4]
-- PostgreSQL: jsonb_array_elements_text
SELECT e.id, tag
FROM events e,
jsonb_array_elements_text(e.data->'tags') AS tag;
-- MySQL: JSON_TABLE (8.0.4+)
SELECT e.id, jt.tag
FROM events e,
JSON_TABLE(e.data, '$.tags[*]'
COLUMNS(tag VARCHAR(100) PATH '$')
) AS jt;
Verify: Both return one row per tag: 1 | web and 1 | mobile
Code Examples
PostgreSQL: Complete jsonb Query Patterns
-- Input: events table with jsonb data column
-- Output: filtered, extracted, aggregated JSON data
-- Containment with nested objects
SELECT * FROM events
WHERE data @> '{"user": {"name": "Alice"}}'::jsonb;
-- Key existence checks
SELECT * FROM events WHERE data ? 'tags';
SELECT * FROM events WHERE data ?| ARRAY['email', 'phone'];
SELECT * FROM events WHERE data ?& ARRAY['type', 'amount'];
-- jsonpath queries (PostgreSQL 12+)
SELECT jsonb_path_query(data, '$.tags[*]') AS tag FROM events;
SELECT * FROM events WHERE jsonb_path_exists(data, '$.amount ? (> 30)');
-- Aggregate into JSON
SELECT jsonb_agg(data->>'type') AS event_types FROM events;
-- Update JSON fields (immutable: returns new jsonb)
UPDATE events SET data = jsonb_set(data, '{user,email}', '"[email protected]"')
WHERE data @> '{"user": {"name": "Alice"}}'::jsonb;
MySQL: Complete JSON Query Patterns
-- Input: events table with JSON data column
-- Output: filtered, extracted, aggregated JSON data
-- JSON_CONTAINS for object/array membership
SELECT * FROM events
WHERE JSON_CONTAINS(data, '{"name": "Alice"}', '$.user');
SELECT * FROM events
WHERE JSON_CONTAINS(data, '"mobile"', '$.tags');
-- Path existence check
SELECT * FROM events
WHERE JSON_CONTAINS_PATH(data, 'one', '$.user.email');
-- JSON_TABLE to flatten arrays (8.0.4+)
SELECT e.id, jt.* FROM events e,
JSON_TABLE(e.data, '$.tags[*]' COLUMNS(
row_num FOR ORDINALITY, tag VARCHAR(100) PATH '$'
)) AS jt;
-- Modify JSON in-place
UPDATE events SET data = JSON_SET(data, '$.user.email', '[email protected]')
WHERE data->>'$.user.name' = 'Alice';
Python (psycopg2): Application-Level JSON Queries
# Input: Database connection, JSON query parameters
# Output: Filtered results from JSON columns
import psycopg2, json
conn = psycopg2.connect("dbname=mydb")
cur = conn.cursor()
# Safe parameterized containment query
filter_obj = {"type": "purchase"}
cur.execute(
"SELECT id, data->>'type', (data->>'amount')::numeric "
"FROM events WHERE data @> %s::jsonb",
(json.dumps(filter_obj),)
)
for row in cur.fetchall():
print(f"id={row[0]}, type={row[1]}, amount={row[2]}")
cur.close(); conn.close()
Anti-Patterns
Wrong: Using json type and trying to index it
-- BAD: json type cannot use GIN indexes, every query scans all rows
CREATE TABLE events (id SERIAL, data json NOT NULL);
CREATE INDEX idx_data ON events USING GIN (data); -- ERROR: no operator class
Correct: Use jsonb for queryable JSON data
-- GOOD: jsonb supports GIN indexes and all operators
CREATE TABLE events (id SERIAL, data jsonb NOT NULL);
CREATE INDEX idx_data ON events USING GIN (data);
Wrong: Comparing -> output to text string
-- BAD: data->'status' returns jsonb, not text; returns 0 rows
SELECT * FROM events WHERE data->'status' = 'active';
Correct: Use ->> for text comparison
-- GOOD: ->> returns text
SELECT * FROM events WHERE data->>'status' = 'active';
-- OR compare jsonb to jsonb
SELECT * FROM events WHERE data->'status' = '"active"'::jsonb;
Wrong: MySQL JSON_EXTRACT without unquoting
-- BAD: JSON_EXTRACT returns '"active"' (with quotes), comparison fails
SELECT * FROM events WHERE JSON_EXTRACT(data, '$.status') = 'active';
Correct: Use ->> or JSON_UNQUOTE in MySQL
-- GOOD: ->> unquotes automatically
SELECT * FROM events WHERE data->>'$.status' = 'active';
-- OR
SELECT * FROM events WHERE JSON_UNQUOTE(JSON_EXTRACT(data, '$.status')) = 'active';
Wrong: Storing relational data in JSON blobs
-- BAD: every row has the same keys, no constraints, no foreign keys
CREATE TABLE users (
id SERIAL PRIMARY KEY,
data jsonb -- {"name": "Alice", "email": "[email protected]", "role_id": 3}
);
Correct: Use columns for structured data, JSON for flexible data
-- GOOD: fixed fields are columns, variable metadata is jsonb
CREATE TABLE users (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
email TEXT NOT NULL UNIQUE,
role_id INT REFERENCES roles(id),
metadata jsonb DEFAULT '{}'
);
Wrong: No index on frequently queried JSON paths
-- BAD: full table scan on every query
SELECT * FROM events WHERE data @> '{"type": "purchase"}'::jsonb;
-- Seq Scan: cost=0.00..12345.00
Correct: Add appropriate GIN or expression index
-- GOOD: GIN index enables bitmap index scan
CREATE INDEX idx_events_data ON events USING GIN (data jsonb_path_ops);
-- Bitmap Index Scan: cost=0.00..4.50
Common Pitfalls
- GIN index bloat from frequent updates: GIN indexes on frequently updated jsonb columns
accumulate dead entries. Fix:
REINDEX INDEX CONCURRENTLY idx_name;periodically, or use expression B-tree indexes for write-heavy columns. [src6] - work_mem too low for GIN bitmap scans: GIN queries build bitmaps in memory; if
work_memis too low, PostgreSQL spills to disk. Fix:SET work_mem = '64MB';for the session. [src6] - MySQL VIRTUAL vs STORED generated columns: Virtual columns are not materialized and
may be slower for complex JSON expressions. Fix: Use
STOREDfor indexed columns with complex JSON extractions. [src7] - jsonb_set does not create intermediate keys:
jsonb_set('{}', '{a,b}', '"v"')fails if keyadoes not exist. Fix: Usecreate_if_missing := true(PG 16+) or build nested structure first. [src1] - Casting JSON numbers in MySQL:
CAST(JSON_EXTRACT(data,'$.price') AS DECIMAL)defaults toDECIMAL(10,0)and silently truncates decimals. Fix: Always specify precision:CAST(... AS DECIMAL(10,2)). [src2] - Large JSON documents degrade performance: Documents over 1MB slow reads and writes; the entire jsonb value is rewritten on any update. Fix: Normalize large arrays into separate rows; keep documents under 1MB. [src5]
- json_agg vs jsonb_agg type mismatch:
json_aggreturnsjson;jsonb_aggreturnsjsonb. Mixing types causes implicit casts. Fix: Usejsonb_aggconsistently withjsonbcolumns. [src1] - MySQL JSON_OVERLAPS type coercion:
JSON_OVERLAPS('[1]', '["1"]')returns true because integer 1 matches string "1". Fix: Ensure consistent types in JSON arrays. [src2]
Diagnostic Commands
-- PostgreSQL: check if GIN index is being used
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT * FROM events WHERE data @> '{"type": "purchase"}'::jsonb;
-- PostgreSQL: check jsonb column size distribution
SELECT pg_column_size(data) AS bytes, COUNT(*) AS row_count
FROM events GROUP BY 1 ORDER BY 1 DESC LIMIT 20;
-- MySQL: check JSON column usage with EXPLAIN
EXPLAIN FORMAT=JSON
SELECT * FROM events WHERE data->>'$.type' = 'purchase';
-- MySQL: verify multi-valued index usage
EXPLAIN SELECT * FROM events WHERE 'mobile' MEMBER OF(data->'$.tags');
Version History & Compatibility
| Version | Status | JSON Features | Key Changes |
|---|---|---|---|
| PostgreSQL 17+ | Current | SQL/JSON standard: JSON_EXISTS, JSON_QUERY, JSON_VALUE, JSON_TABLE | First SQL/JSON standard compliance |
| PostgreSQL 15-16 | Supported | IS JSON predicate, JSON_SCALAR, merge patch | jsonb_set create_if_missing, MERGE statement |
| PostgreSQL 12-14 | Supported | jsonpath (@?, @@), jsonb_path_query | SQL/JSON path language introduced in 12 |
| PostgreSQL 9.4-11 | Legacy | jsonb type, GIN indexes, @>, ?, ->, ->> | jsonb introduced in 9.4; mature by 9.6 |
| MySQL 8.0.17+ | Current | Multi-valued indexes, MEMBER OF, JSON_OVERLAPS | Array indexing via CAST(... AS ... ARRAY) |
| MySQL 8.0.4+ | Current | JSON_TABLE, JSON_ARRAYAGG, JSON_OBJECTAGG | Table function for flattening JSON |
| MySQL 5.7+ | Legacy | JSON_EXTRACT, ->, ->>, JSON_CONTAINS | JSON type introduced; basic operators only |
When to Use / When Not to Use
| Use When | Don't Use When | Use Instead |
|---|---|---|
| Schema-less or semi-structured data (user preferences, metadata, feature flags) | Every row has identical fields (name, email, created_at) | Regular columns with proper types and constraints |
| Third-party API responses stored for later processing | Data needs foreign key constraints and referential integrity | Normalized relational tables |
| Rapid prototyping where schema changes frequently | Complex aggregations and analytical queries on JSON fields | Extract to columns, then aggregate |
| Dynamic form data or custom fields per tenant | Write-heavy workloads with frequent partial JSON updates | Separate columns or EAV pattern |
| PostgreSQL with jsonb + GIN for search within JSON | MySQL and need advanced JSON querying (jsonpath, SQL/JSON) | PostgreSQL, or extract to generated columns in MySQL |
Important Caveats
- PostgreSQL jsonb equality (
=) compares entire documents — two documents with the same keys in different order are equal, but this is an expensive comparison. Use specific key comparisons when possible. - MySQL does not support the SQL/JSON standard. PostgreSQL 17+ follows SQL:2016 more closely. MySQL's
JSON_TABLEis MySQL-specific, not the standard version. - GIN index creation locks the table in PostgreSQL. Use
CREATE INDEX CONCURRENTLYto avoid blocking writes, but it takes 2-3x longer and cannot run inside a transaction. - jsonb column statistics are not maintained by PostgreSQL's planner — complex jsonb WHERE clauses may get poor row estimates. Consider expression indexes for critical queries.
- MySQL's JSON comparison rules differ from PostgreSQL: MySQL type-coerces during comparison (integer
1equals string"1"in some contexts), while PostgreSQL is strict about types.