SQL MERGE


SQL
POSTGRES
INTERACTIVE

Complete SQL MERGE/UPSERT Technical Guide

1. Fundamental MERGE Syntax
-- Standard MERGE (ANSI SQL)
MERGE INTO target_table t
USING source_table s
ON (t.key = s.key)
WHEN MATCHED THEN
  UPDATE SET t.column1 = s.column1, t.column2 = s.column2
WHEN NOT MATCHED THEN
  INSERT (key, column1, column2) VALUES (s.key, s.column1, s.column2);

-- PostgreSQL UPSERT alternative
INSERT INTO target_table (key, column1, column2)
VALUES ('k1', 'v1', 'v2')
ON CONFLICT (key) DO UPDATE
SET column1 = EXCLUDED.column1, column2 = EXCLUDED.column2;
  1. MERGE: Single statement for INSERT, UPDATE, and DELETE operations
  2. UPSERT: INSERT with conflict resolution (PostgreSQL, SQLite)
  3. Atomic operation: Entire statement succeeds or fails as a unit
  4. Source flexibility: Can use tables, views, subqueries, or literal values
2. Practical Applications
Data Synchronization
-- Sync customer data from staging to production
MERGE INTO customers t
USING staging_customers s
ON (t.customer_id = s.customer_id)
WHEN MATCHED AND t.last_modified < s.last_modified THEN
  UPDATE SET 
    t.name = s.name,
    t.email = s.email,
    t.last_modified = CURRENT_TIMESTAMP
WHEN NOT MATCHED THEN
  INSERT (customer_id, name, email, last_modified)
  VALUES (s.customer_id, s.name, s.email, CURRENT_TIMESTAMP);
Conditional Logic
-- Only update active records, archive inactive
MERGE INTO employee_directory t
USING hr_updates s
ON (t.employee_id = s.employee_id)
WHEN MATCHED AND s.status = 'TERMINATED' THEN
  DELETE
WHEN MATCHED AND s.status = 'ACTIVE' THEN
  UPDATE SET 
    t.department = s.department,
    t.title = s.title
WHEN NOT MATCHED AND s.status = 'ACTIVE' THEN
  INSERT (employee_id, name, department, title)
  VALUES (s.employee_id, s.name, s.department, s.title);
3. Advanced MERGE Patterns
Multiple Match Conditions
-- Different update logic based on conditions
MERGE INTO product_inventory t
USING inventory_updates s
ON (t.product_id = s.product_id)
WHEN MATCHED AND s.quantity < 5 THEN
  UPDATE SET 
    t.quantity = s.quantity,
    t.status = 'CRITICAL'
WHEN MATCHED AND s.quantity < 20 THEN
  UPDATE SET 
    t.quantity = s.quantity,
    t.status = 'LOW'
WHEN MATCHED THEN
  UPDATE SET 
    t.quantity = s.quantity,
    t.status = 'OK'
WHEN NOT MATCHED THEN
  INSERT (product_id, quantity, status)
  VALUES (s.product_id, s.quantity, 'NEW');
Using Subqueries as Source
-- Complex source transformation
MERGE INTO monthly_sales t
USING (
  SELECT 
    product_id,
    DATE_TRUNC('month', sale_date) AS month,
    SUM(amount) AS total_sales,
    COUNT(*) AS transaction_count
  FROM sales
  GROUP BY product_id, DATE_TRUNC('month', sale_date)
) s
ON (t.product_id = s.product_id AND t.month = s.month)
WHEN MATCHED THEN
  UPDATE SET 
    t.total_sales = s.total_sales,
    t.transaction_count = s.transaction_count,
    t.last_updated = CURRENT_TIMESTAMP
WHEN NOT MATCHED THEN
  INSERT (product_id, month, total_sales, transaction_count, last_updated)
  VALUES (s.product_id, s.month, s.total_sales, s.transaction_count, CURRENT_TIMESTAMP);
4. Vendor-Specific Syntax
PostgreSQL UPSERT
-- Basic conflict resolution
INSERT INTO users (id, username, email)
VALUES (1, 'john_doe', 'john@example.com')
ON CONFLICT (id) DO UPDATE
SET username = EXCLUDED.username,
    email = EXCLUDED.email,
    updated_at = NOW();

-- Conditional update on conflict
INSERT INTO products (sku, name, price, stock)
VALUES ('ABC123', 'Widget', 19.99, 100)
ON CONFLICT (sku) DO UPDATE
SET price = EXCLUDED.price,
    stock = products.stock + EXCLUDED.stock
WHERE products.price != EXCLUDED.price;
MySQL INSERT...ON DUPLICATE KEY UPDATE
-- Basic upsert
INSERT INTO page_views (page_id, view_count)
VALUES (123, 1)
ON DUPLICATE KEY UPDATE
view_count = view_count + 1;

-- Multiple value upsert
INSERT INTO inventory (item_id, warehouse_id, quantity)
VALUES (1, 1, 10), (2, 1, 5), (3, 2, 8)
ON DUPLICATE KEY UPDATE
quantity = VALUES(quantity);
5. Performance & Best Practices
  1. Index join columns: Ensure the ON clause columns are indexed
  2. Batch operations: Process multiple rows in single MERGE when possible
  3. Filter early: Apply conditions in the USING clause rather than in WHEN clauses
  4. Transaction size: Break large MERGE operations into batches
  5. Error handling: Implement proper exception handling for constraint violations
Optimization Example
-- Efficient bulk merge with filtered source
MERGE INTO customer_profiles t
USING (
  SELECT * FROM customer_updates
  WHERE last_modified > (SELECT MAX(last_sync) FROM sync_log WHERE table_name = 'customer_profiles')
) s
ON (t.customer_id = s.customer_id)
WHEN MATCHED THEN
  UPDATE SET 
    t.preferences = s.preferences,
    t.segment = s.segment,
    t.version = t.version + 1
WHEN NOT MATCHED THEN
  INSERT (customer_id, preferences, segment, version)
  VALUES (s.customer_id, s.preferences, s.segment, 1);

Key Additional Insights:

Concurrency Control

  1. MERGE statements typically use row-level locking
  2. Consider isolation levels for high-contention scenarios
  3. Deadlocks can occur with multiple MERGE operations

Returning Modified Data

-- PostgreSQL example returning affected rows
INSERT INTO inventory (item_id, stock)
VALUES (1, 100)
ON CONFLICT (item_id) DO UPDATE
SET stock = inventory.stock + EXCLUDED.stock
RETURNING item_id, stock;

Alternatives for Non-Supported Databases

-- SQLite alternative with separate statements
BEGIN TRANSACTION;
UPDATE table SET col1 = 'value' WHERE key = 'k1';
INSERT INTO table (key, col1) SELECT 'k1', 'value' 
WHERE changes() = 0;
COMMIT;

Conditional Logic Depth

  1. Most databases support multiple WHEN MATCHED clauses
  2. Evaluation order matters - first matching condition executes
  3. Some databases limit total number of WHEN clauses

Error Handling Variations

  1. MySQL: ON DUPLICATE KEY UPDATE ignores most errors
  2. SQL Server: MERGE can fail on constraint violations
  3. Oracle: DML error logging clauses can capture errors

Audit Trail Considerations

  1. MERGE operations can complicate trigger behavior
  2. Consider adding modified_by/modified_at columns
  3. Some databases offer change data capture (CDC) for MERGE