SQL Upsert Patterns: ON CONFLICT, MERGE, and ON DUPLICATE KEY across Databases
How do I implement upsert (ON CONFLICT / MERGE) across databases?
TL;DR
- Bottom line: Every major RDBMS supports atomic upsert, but the syntax differs significantly -- use
INSERT ... ON CONFLICTfor PostgreSQL/SQLite,INSERT ... ON DUPLICATE KEY UPDATEfor MySQL, andMERGEfor SQL Server/Oracle (and PostgreSQL 15+). - Key tool/command:
INSERT INTO t (id, val) VALUES (1, 'x') ON CONFLICT (id) DO UPDATE SET val = EXCLUDED.val; - Watch out for: Race conditions in hand-rolled IF EXISTS/UPDATE/ELSE/INSERT patterns -- always use the database's native atomic upsert syntax.
- Works with: PostgreSQL 9.5+, MySQL 5.7+, SQL Server 2008+, SQLite 3.24+, Oracle 9i+.
Constraints
- Upsert requires a UNIQUE constraint or PRIMARY KEY on the conflict column(s) -- without one, duplicates will be inserted silently
- SQL Server MERGE has known bugs with indexed views and temporal tables -- avoid MERGE with DELETE actions or target temporal tables [src6]
- MySQL ON DUPLICATE KEY UPDATE checks all unique indexes, not just the one you intend -- tables with multiple unique indexes can produce unexpected updates [src2]
- All upsert patterns are subject to race conditions under concurrent writes without proper isolation level or locking [src7]
- PostgreSQL ON CONFLICT requires specifying the exact conflict target (column or constraint name) -- omitting it is a syntax error [src1]
Quick Reference
| Database | Syntax | Introduced | Atomic | Supports DELETE | Conflict Detection | Excluded Row Access | Bulk Support |
|---|---|---|---|---|---|---|---|
| PostgreSQL | INSERT ... ON CONFLICT DO UPDATE | 9.5 (2016) | Yes | No | Named constraint or column list | EXCLUDED.col | Yes (multi-row VALUES) |
| PostgreSQL | MERGE ... WHEN MATCHED/NOT MATCHED | 15 (2022) | Yes | Yes (v15+) | JOIN condition | Source table alias | Yes (FROM source) |
| MySQL | INSERT ... ON DUPLICATE KEY UPDATE | 4.1 (2004) | Yes | No | All unique indexes | VALUES(col) (deprecated) / alias (8.0.19+) | Yes (multi-row VALUES) |
| MariaDB | INSERT ... ON DUPLICATE KEY UPDATE | 5.1 | Yes | No | All unique indexes | VALUES(col) | Yes (multi-row VALUES) |
| SQL Server | MERGE ... WHEN MATCHED/NOT MATCHED | 2008 | Yes* | Yes | JOIN condition (ON clause) | Source table alias | Yes (FROM source) |
| Oracle | MERGE ... WHEN MATCHED/NOT MATCHED | 9i (2001) | Yes | Yes (10g+) | JOIN condition (ON clause) | Source table alias | Yes (FROM source) |
| SQLite | INSERT ... ON CONFLICT DO UPDATE | 3.24 (2018) | Yes | No | Column list or constraint | excluded.col | Yes (multi-row VALUES) |
| CockroachDB | INSERT ... ON CONFLICT DO UPDATE | 2.0 (2017) | Yes | No | Column list | excluded.col | Yes (multi-row VALUES) |
| Standard SQL | MERGE (SQL:2003/2008) | SQL:2003 | Yes | Yes (SQL:2008) | JOIN condition | Source reference | Yes |
*SQL Server MERGE is atomic per statement but has known concurrency bugs -- see Anti-Patterns section.
Decision Tree
START
+-- Which database engine?
| +-- PostgreSQL
| | +-- Version >= 15 and need DELETE capability?
| | | +-- YES --> Use MERGE (SQL:2003 standard syntax)
| | | +-- NO --> Use INSERT ... ON CONFLICT DO UPDATE
| | +-- Need to handle multiple conflict targets?
| | +-- YES --> Use ON CONFLICT ON CONSTRAINT constraint_name
| | +-- NO --> Use ON CONFLICT (column_list) DO UPDATE
| +-- MySQL / MariaDB
| | +-- Version >= 8.0.19?
| | | +-- YES --> Use INSERT ... ON DUPLICATE KEY UPDATE with AS alias
| | | +-- NO --> Use INSERT ... ON DUPLICATE KEY UPDATE with VALUES(col)
| | +-- Table has multiple unique indexes?
| | +-- YES --> CAUTION: all unique indexes are checked; consider restructuring
| | +-- NO --> Safe to use ON DUPLICATE KEY UPDATE
| +-- SQL Server
| | +-- Simple single-row upsert?
| | | +-- YES --> Consider IF EXISTS pattern with UPDLOCK, SERIALIZABLE hints
| | | +-- NO (bulk/multi-row) --> Use MERGE ... WHEN MATCHED/NOT MATCHED
| | +-- Target table has indexed views?
| | +-- YES --> DO NOT use MERGE -- use IF EXISTS pattern instead
| | +-- NO --> MERGE is safe (avoid DELETE clause if possible)
| +-- SQLite
| | +-- Version >= 3.24?
| | | +-- YES --> Use INSERT ... ON CONFLICT DO UPDATE (same as PostgreSQL)
| | | +-- NO --> Use INSERT OR REPLACE (loses non-specified column values)
| +-- Oracle
| +-- Use MERGE ... WHEN MATCHED/NOT MATCHED THEN
+-- END
Step-by-Step Guide
1. Identify the conflict key
Determine which column(s) define uniqueness for the upsert operation. This must be a PRIMARY KEY or have a UNIQUE constraint/index. [src1]
-- Ensure the conflict target has a unique constraint
ALTER TABLE products ADD CONSTRAINT uq_products_sku UNIQUE (sku);
Verify: SELECT constraint_name FROM information_schema.table_constraints WHERE table_name = 'products' AND constraint_type IN ('PRIMARY KEY', 'UNIQUE'); --> should list your constraint.
2. Write the upsert statement for your database
Choose the correct syntax based on your engine. Below uses PostgreSQL as the primary example. [src1]
-- PostgreSQL: Insert or update a product by SKU
INSERT INTO products (sku, name, price, updated_at)
VALUES ('ABC-123', 'Widget Pro', 29.99, NOW())
ON CONFLICT (sku)
DO UPDATE SET
name = EXCLUDED.name,
price = EXCLUDED.price,
updated_at = EXCLUDED.updated_at;
Verify: SELECT * FROM products WHERE sku = 'ABC-123'; --> should show the latest values regardless of whether it was an insert or update.
3. Handle conditional updates (update only if data changed)
Avoid unnecessary writes by adding a WHERE clause to the DO UPDATE. [src1]
-- PostgreSQL: Only update if price actually changed
INSERT INTO products (sku, name, price, updated_at)
VALUES ('ABC-123', 'Widget Pro', 29.99, NOW())
ON CONFLICT (sku)
DO UPDATE SET
name = EXCLUDED.name,
price = EXCLUDED.price,
updated_at = EXCLUDED.updated_at
WHERE products.price IS DISTINCT FROM EXCLUDED.price;
Verify: Check xmax system column -- if xmax = 0, the row was inserted; if xmax > 0, it was updated. SELECT xmax, * FROM products WHERE sku = 'ABC-123';
4. Implement bulk upsert
For inserting/updating many rows at once, use multi-row VALUES or a source subquery. [src1]
-- PostgreSQL: Bulk upsert from VALUES
INSERT INTO products (sku, name, price, updated_at)
VALUES
('ABC-123', 'Widget Pro', 29.99, NOW()),
('DEF-456', 'Gadget Plus', 49.99, NOW()),
('GHI-789', 'Tool Max', 19.99, NOW())
ON CONFLICT (sku)
DO UPDATE SET
name = EXCLUDED.name,
price = EXCLUDED.price,
updated_at = EXCLUDED.updated_at;
Verify: SELECT count(*) FROM products WHERE sku IN ('ABC-123', 'DEF-456', 'GHI-789'); --> should return 3.
5. Add RETURNING clause for feedback (PostgreSQL)
Use RETURNING to know which rows were affected and how. [src1]
-- PostgreSQL: Get affected rows back
INSERT INTO products (sku, name, price, updated_at)
VALUES ('ABC-123', 'Widget Pro', 34.99, NOW())
ON CONFLICT (sku)
DO UPDATE SET
price = EXCLUDED.price,
updated_at = EXCLUDED.updated_at
RETURNING sku, xmax = 0 AS was_inserted;
Verify: Result set shows was_inserted = true for new rows, false for updates.
Code Examples
PostgreSQL: Single-Row Upsert with ON CONFLICT
-- Input: A row to insert or update based on unique email
-- Output: The affected row with insert/update indicator
INSERT INTO users (email, display_name, login_count, last_login)
VALUES ('[email protected]', 'Alice', 1, NOW())
ON CONFLICT (email)
DO UPDATE SET
display_name = EXCLUDED.display_name,
login_count = users.login_count + 1, -- increment existing counter
last_login = EXCLUDED.last_login
RETURNING *;
MySQL: Upsert with ON DUPLICATE KEY UPDATE
-- Input: A row to insert or update based on unique email
-- Output: Affected rows count (1 = inserted, 2 = updated, 0 = unchanged)
-- MySQL 8.0.19+ syntax using alias:
INSERT INTO users (email, display_name, login_count, last_login)
VALUES ('[email protected]', 'Alice', 1, NOW())
AS new_row
ON DUPLICATE KEY UPDATE
display_name = new_row.display_name,
login_count = users.login_count + 1,
last_login = new_row.last_login;
-- MySQL < 8.0.19 syntax using VALUES():
INSERT INTO users (email, display_name, login_count, last_login)
VALUES ('[email protected]', 'Alice', 1, NOW())
ON DUPLICATE KEY UPDATE
display_name = VALUES(display_name),
login_count = login_count + 1,
last_login = VALUES(last_login);
SQL Server: MERGE for Upsert
-- Input: A row to insert or update based on unique email
-- Output: Affected rows (use OUTPUT clause for details)
MERGE INTO users AS target
USING (VALUES ('[email protected]', 'Alice', 1, GETDATE()))
AS source (email, display_name, login_count, last_login)
ON target.email = source.email
WHEN MATCHED THEN
UPDATE SET
display_name = source.display_name,
login_count = target.login_count + 1,
last_login = source.last_login
WHEN NOT MATCHED THEN
INSERT (email, display_name, login_count, last_login)
VALUES (source.email, source.display_name, source.login_count, source.last_login)
OUTPUT $action, inserted.*;
-- Note: MERGE statement MUST end with semicolon
SQLite: Upsert with ON CONFLICT
-- Input: A row to insert or update based on unique email
-- Output: Row is inserted or updated atomically
INSERT INTO users (email, display_name, login_count, last_login)
VALUES ('[email protected]', 'Alice', 1, datetime('now'))
ON CONFLICT (email)
DO UPDATE SET
display_name = excluded.display_name,
login_count = users.login_count + 1,
last_login = excluded.last_login;
-- Note: SQLite uses lowercase 'excluded' (case-insensitive but convention differs)
Python (psycopg2): Parameterized Bulk Upsert
# Input: List of (email, name, count) tuples
# Output: All rows upserted atomically with parameters
import psycopg2
from psycopg2.extras import execute_values
conn = psycopg2.connect("dbname=mydb")
cur = conn.cursor()
data = [
("[email protected]", "Alice", 1),
("[email protected]", "Bob", 1),
]
execute_values(
cur,
"""
INSERT INTO users (email, display_name, login_count, last_login)
VALUES %s
ON CONFLICT (email)
DO UPDATE SET
display_name = EXCLUDED.display_name,
login_count = users.login_count + 1,
last_login = NOW()
""",
[(e, n, c, ) for e, n, c in data],
template="(%s, %s, %s, NOW())",
)
conn.commit()
Node.js (Knex.js): Cross-Database Upsert Helper
// Input: Table name, row data, conflict columns
// Output: Upsert executed via Knex query builder
// knex v2.5+ supports .onConflict().merge()
const knex = require("knex")({ client: "pg", connection: process.env.DATABASE_URL });
await knex("users")
.insert({
email: "[email protected]",
display_name: "Alice",
login_count: 1,
last_login: knex.fn.now(),
})
.onConflict("email")
.merge({
display_name: "Alice",
login_count: knex.raw("users.login_count + 1"),
last_login: knex.fn.now(),
});
Anti-Patterns
Wrong: Check-then-act (SELECT then INSERT or UPDATE)
-- BAD -- race condition: two sessions can both see "not exists" and both INSERT
IF NOT EXISTS (SELECT 1 FROM users WHERE email = '[email protected]')
INSERT INTO users (email, display_name) VALUES ('[email protected]', 'Alice');
ELSE
UPDATE users SET display_name = 'Alice' WHERE email = '[email protected]';
Correct: Use native atomic upsert
-- GOOD -- atomic operation, no race window
INSERT INTO users (email, display_name)
VALUES ('[email protected]', 'Alice')
ON CONFLICT (email)
DO UPDATE SET display_name = EXCLUDED.display_name;
Wrong: INSERT OR REPLACE when you want upsert (SQLite)
-- BAD -- INSERT OR REPLACE deletes the old row and inserts a new one
-- This resets auto-increment IDs, triggers ON DELETE, and loses columns not in INSERT
INSERT OR REPLACE INTO users (email, display_name)
VALUES ('[email protected]', 'Alice');
-- login_count and last_login are now NULL/default!
Correct: Use ON CONFLICT DO UPDATE to preserve existing data
-- GOOD -- preserves all existing column values, only updates specified ones
INSERT INTO users (email, display_name)
VALUES ('[email protected]', 'Alice')
ON CONFLICT (email)
DO UPDATE SET display_name = excluded.display_name;
-- login_count and last_login remain unchanged
Wrong: MERGE without semicolon terminator (SQL Server)
-- BAD -- SQL Server requires MERGE to end with semicolon; omitting it causes
-- parse errors with subsequent statements
MERGE INTO users AS target
USING (VALUES ('[email protected]', 'Alice')) AS source (email, display_name)
ON target.email = source.email
WHEN MATCHED THEN UPDATE SET display_name = source.display_name
WHEN NOT MATCHED THEN INSERT (email, display_name) VALUES (source.email, source.display_name)
-- Missing semicolon! Next statement will fail with cryptic error
Correct: Always terminate MERGE with semicolon
-- GOOD -- semicolon terminates the MERGE statement
MERGE INTO users AS target
USING (VALUES ('[email protected]', 'Alice')) AS source (email, display_name)
ON target.email = source.email
WHEN MATCHED THEN UPDATE SET display_name = source.display_name
WHEN NOT MATCHED THEN INSERT (email, display_name) VALUES (source.email, source.display_name);
Wrong: Using MERGE on SQL Server tables with indexed views
-- BAD -- SQL Server MERGE has known bugs with indexed views
-- Can produce wrong results or errors (Microsoft Connect bugs #775684, #713699)
MERGE INTO orders AS target -- orders has an indexed view
USING staging AS source ON target.order_id = source.order_id
WHEN MATCHED THEN UPDATE SET amount = source.amount
WHEN NOT MATCHED THEN INSERT (order_id, amount) VALUES (source.order_id, source.amount);
Correct: Use explicit IF EXISTS pattern with locking hints
-- GOOD -- safe pattern for SQL Server when indexed views are involved
BEGIN TRANSACTION;
UPDATE orders WITH (UPDLOCK, SERIALIZABLE)
SET amount = @amount
WHERE order_id = @order_id;
IF @@ROWCOUNT = 0
BEGIN
INSERT INTO orders (order_id, amount) VALUES (@order_id, @amount);
END;
COMMIT;
Common Pitfalls
- Multiple unique indexes on MySQL: ON DUPLICATE KEY UPDATE fires on ANY unique constraint violation, not just the one you expect. If a table has both
UNIQUE(email)andUNIQUE(phone), inserting a row with existing phone but new email triggers the update on the wrong match. Fix: Restructure to a single composite unique index or use separate lookup tables. [src2] - EXCLUDED vs VALUES() confusion: PostgreSQL uses
EXCLUDED.column, MySQL < 8.0.19 usesVALUES(column), MySQL 8.0.19+ uses an alias. Mixing them up produces syntax errors. Fix: Check your database version and use the correct syntax. [src1] [src2] - Missing WHERE on DO UPDATE: Without a filter, every conflict triggers an update even when data hasn't changed, causing unnecessary WAL writes, trigger firings, and replication traffic. Fix: Add
WHERE t.col IS DISTINCT FROM EXCLUDED.col. [src1] - MERGE without HOLDLOCK/SERIALIZABLE (SQL Server): Under concurrency, MERGE can throw primary key violations because the check-and-insert are not truly serializable by default. Fix: Add
WITH (HOLDLOCK)to the target table or wrap in a SERIALIZABLE transaction. [src6] [src7] - Forgetting RETURNING (PostgreSQL): Without RETURNING, you cannot tell whether a row was inserted or updated. Fix: Append
RETURNING *, xmax = 0 AS was_insertedto distinguish inserts from updates. [src1] - UPSERT affecting auto-increment counters (MySQL): ON DUPLICATE KEY UPDATE still increments the auto_increment counter even when it performs an update, leading to gaps in IDs. Fix: Accept ID gaps as normal or use a natural key. [src2]
- SQLite excluded is case-insensitive but lowercase by convention: While
EXCLUDED.colworks, the SQLite documentation uses lowercaseexcluded.col. Fix: Use lowercase for consistency; both work. [src4] - Deadlocks in high-concurrency bulk upserts: Concurrent bulk upserts on overlapping key ranges can deadlock due to index locking order. Fix: Sort rows by primary key before upserting to ensure consistent lock acquisition order. [src6]
Version History & Compatibility
| Database | Feature | Version | Status | Notes |
|---|---|---|---|---|
| PostgreSQL | ON CONFLICT DO UPDATE | 9.5 (2016-01) | Stable | Primary upsert mechanism |
| PostgreSQL | ON CONFLICT DO NOTHING | 9.5 (2016-01) | Stable | Skip conflicting rows |
| PostgreSQL | MERGE | 15 (2022-10) | Stable | SQL:2003 standard compliance |
| PostgreSQL | MERGE RETURNING | 17 (2024-09) | Current | RETURNING clause support |
| MySQL | ON DUPLICATE KEY UPDATE | 4.1 (2004) | Stable | VALUES() syntax |
| MySQL | ON DUPLICATE KEY with alias | 8.0.19 (2020-01) | Current | AS new_row replaces deprecated VALUES() |
| SQL Server | MERGE | 2008 | Stable* | *Known bugs with indexed views |
| SQLite | ON CONFLICT (upsert) | 3.24 (2018-06) | Stable | Same syntax as PostgreSQL |
| Oracle | MERGE | 9i (2001) | Stable | DELETE clause added in 10g |
When to Use / When Not to Use
| Use When | Don't Use When | Use Instead |
|---|---|---|
| Idempotent writes (APIs that may retry) | Simple insert of always-new data | Plain INSERT |
| Syncing external data into a local table | Full bidirectional sync with deletes needed | ETL pipeline with MERGE DELETE or CDC |
| Configuration/settings tables (set key=value) | High-write OLTP with no natural conflict | INSERT (faster, no conflict check overhead) |
| Counters and accumulators (increment on conflict) | Complex multi-table transactional logic | Stored procedure with explicit transaction |
| Cache tables (insert-or-refresh pattern) | Table has no unique constraint on conflict columns | Add UNIQUE constraint first, then upsert |
| Staging data loads (deduplicate on load) | SQL Server table with indexed views | IF EXISTS/UPDATE/INSERT with locking hints |
Important Caveats
- PostgreSQL
ON CONFLICTandMERGEare NOT interchangeable: ON CONFLICT is triggered by constraint violations during INSERT; MERGE performs a JOIN-based match. ON CONFLICT cannot DELETE rows; MERGE (v15+) can. Use ON CONFLICT for simple upserts, MERGE for complex sync logic. - SQL Server MERGE has a long history of bugs documented by Aaron Bertrand (sqlblog.org/merge). While many have been fixed, avoid using MERGE with DELETE actions or against temporal tables. For simple upserts, the IF EXISTS pattern with UPDLOCK, SERIALIZABLE is often safer.
- MySQL's
VALUES()function in ON DUPLICATE KEY UPDATE is deprecated since 8.0.20 -- use theAS aliassyntax instead:INSERT INTO t ... AS new ON DUPLICATE KEY UPDATE col = new.col. - Under READ COMMITTED isolation (the default in PostgreSQL and SQL Server), concurrent upserts to the same key can both succeed as inserts if timed precisely. PostgreSQL ON CONFLICT handles this internally with speculative insertion and retry. Other databases may require explicit locking.
- Bulk upserts with thousands of rows should sort input by primary key to minimize deadlock risk from inconsistent lock ordering across concurrent transactions.