pgloader mysql://user:pass@mysql-host/dbname postgresql://user:pass@pg-host/dbname

Watch out for: TINYINT(1) to PostgreSQL BOOLEAN conversion, AUTO_INCREMENT to SERIAL/IDENTITY mapping, zero-date values ('0000-00-00'), and case-sensitivity differences in string comparisons.

- Zero-date values ('0000-00-00' and '0000-00-00 00:00:00') must be cleaned or cast to NULL before/during migration — PostgreSQL rejects them outright. Use pgloader's zero-dates-to-null casting function. [src1]
- Never run include drop against a production PostgreSQL database that already contains data you want to keep — it drops and recreates all target tables. [src1]
- ENUM columns are migrated to VARCHAR by pgloader — create PostgreSQL native ENUM types manually and alter columns after migration if needed. [src1, src6]
- Reset sequences after the load (reset sequences in pgloader or manual setval()) — failing to do so causes duplicate key errors on the next INSERT without explicit ID. [src1]

| MySQL Syntax | PostgreSQL Equivalent | Example |
|---|---|---|
| AUTO_INCREMENT | SERIAL / GENERATED ALWAYS AS IDENTITY | id SERIAL PRIMARY KEY |
| TINYINT(1) | BOOLEAN | is_active BOOLEAN DEFAULT true |
| DATETIME | TIMESTAMP / TIMESTAMPTZ | created_at TIMESTAMPTZ DEFAULT NOW() |
| INT UNSIGNED | BIGINT (no unsigned in PG) | counter BIGINT CHECK (counter >= 0) |
| DOUBLE | DOUBLE PRECISION | price DOUBLE PRECISION |
| BLOB / MEDIUMBLOB / LONGBLOB | BYTEA | file_data BYTEA |
| TEXT / MEDIUMTEXT / LONGTEXT | TEXT (no size limit in PG) | content TEXT |
| ENUM('a','b','c') | Custom TYPE or CHECK | CREATE TYPE status AS ENUM ('a','b','c') |
| SET('a','b','c') | TEXT[] (array) or junction table | tags TEXT[] DEFAULT '{}' |
| LIMIT 5, 10 | LIMIT 10 OFFSET 5 | SELECT * FROM t LIMIT 10 OFFSET 5 |
| IFNULL(a, b) | COALESCE(a, b) | SELECT COALESCE(name, 'Unknown') |
| IF(cond, a, b) | CASE WHEN cond THEN a ELSE b END | SELECT CASE WHEN x>0 THEN 'pos' ELSE 'neg' END |
| NOW() | NOW() (identical) | SELECT NOW() |
| DATE_ADD(d, INTERVAL 1 DAY) | d + INTERVAL '1 day' | SELECT created_at + INTERVAL '7 days' |
| GROUP_CONCAT(col) | STRING_AGG(col, ',') | SELECT STRING_AGG(name, ', ') FROM t |
| SHOW TABLES | \dt or information_schema.tables | SELECT table_name FROM information_schema.tables WHERE table_schema='public' |
| backtick quoting `column` | double-quote quoting "column" | SELECT "Column" FROM "Table" |
| REGEXP '^pattern' | ~ '^pattern' | SELECT * FROM t WHERE col ~ '^A' |
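The mapping table above can be scripted for quick schema reviews. A minimal Python sketch; `MYSQL_TO_PG` and `map_type` are hypothetical helpers for illustration, not part of pgloader:

```python
import re

# Hypothetical lookup distilled from the mapping table above (illustration only)
MYSQL_TO_PG = {
    "tinyint(1)": "boolean",
    "datetime": "timestamptz",
    "double": "double precision",
    "blob": "bytea",
    "mediumblob": "bytea",
    "longblob": "bytea",
    "mediumtext": "text",
    "longtext": "text",
}

def map_type(mysql_type: str) -> str:
    """Best-effort translation of a MySQL column type to a PostgreSQL type."""
    t = mysql_type.strip().lower()
    if t in MYSQL_TO_PG:
        return MYSQL_TO_PG[t]
    if "unsigned" in t:                # PostgreSQL has no unsigned integers
        return "bigint"                # pair with CHECK (col >= 0)
    if t.startswith("enum"):           # pgloader emits VARCHAR for ENUM
        return "varchar"
    if t.startswith("set"):            # no SET type; use an array instead
        return "text[]"
    return re.sub(r"\(\d+\)$", "", t)  # strip display width, keep base name
```

Feeding it `column_type` values from `information_schema.columns` gives a first-pass target schema to compare against what pgloader actually produces.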
START
├── Database size < 10 GB and acceptable downtime window?
│ ├── YES → Use pgloader direct migration (single command)
│ └── NO ↓
├── Need near-zero downtime (CDC/replication)?
│ ├── YES → Use AWS DMS with CDC (or DMS Serverless for auto-scaling)
│ └── NO ↓
├── Complex stored procedures, triggers, or views?
│ ├── YES → Use AWS SCT (with AI-assisted conversion) for schema + pgloader for data
│ └── NO ↓
├── Running on AWS infrastructure?
│ ├── YES → Use AWS DMS Serverless + Schema Conversion Tool
│ └── NO ↓
├── Running on Azure infrastructure?
│ ├── YES → Use Azure Database Migration Service
│ └── NO ↓
├── Want a no-code/low-code approach?
│ ├── YES → Use Skyvia or similar SaaS migration tool
│ └── NO ↓
└── DEFAULT → pgloader with .load configuration file for custom type mappings
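The decision tree above can be encoded as a small first-match-wins function, e.g. to document the choice in a runbook. This is a sketch; `pick_migration_tool` and its flag names are hypothetical:

```python
def pick_migration_tool(
    size_gb: float,
    downtime_ok: bool,
    need_cdc: bool,
    complex_procs: bool,
    on_aws: bool,
    on_azure: bool,
    want_no_code: bool,
) -> str:
    """Walk the decision tree above, top to bottom; first match wins."""
    if size_gb < 10 and downtime_ok:
        return "pgloader direct migration"
    if need_cdc:
        return "AWS DMS with CDC"
    if complex_procs:
        return "AWS SCT + pgloader"
    if on_aws:
        return "AWS DMS Serverless + Schema Conversion Tool"
    if on_azure:
        return "Azure Database Migration Service"
    if want_no_code:
        return "Skyvia or similar SaaS tool"
    return "pgloader with .load file"
```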
Inventory your database schema, data volume, stored procedures, triggers, and views. This determines your migration approach and identifies potential issues early. [src3, src5]
-- Count tables, views, procedures, and triggers
SELECT 'Tables' AS object_type, COUNT(*) AS count
FROM information_schema.tables
WHERE table_schema = 'mydb' AND table_type = 'BASE TABLE'
UNION ALL
SELECT 'Views', COUNT(*) FROM information_schema.views WHERE table_schema = 'mydb'
UNION ALL
SELECT 'Procedures', COUNT(*) FROM information_schema.routines WHERE routine_schema = 'mydb'
UNION ALL
SELECT 'Triggers', COUNT(*) FROM information_schema.triggers WHERE trigger_schema = 'mydb';
-- Check total data size
SELECT table_schema AS db,
ROUND(SUM(data_length + index_length) / 1024 / 1024, 2) AS size_mb
FROM information_schema.tables
WHERE table_schema = 'mydb'
GROUP BY table_schema;
-- List columns using MySQL-specific types
SELECT table_name, column_name, column_type
FROM information_schema.columns
WHERE table_schema = 'mydb'
AND (column_type LIKE '%unsigned%'
OR column_type LIKE 'enum%'
OR column_type LIKE 'set%'
OR column_type = 'tinyint(1)');
Verify: You should have a clear inventory — number of tables, total data size, and a list of columns requiring special type handling.
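If you want to post-process the inventory results in code, here is a hypothetical sketch: `flag_special_columns` is illustrative and consumes rows shaped like the output of the last query above.

```python
def flag_special_columns(columns):
    """Given (table, column, column_type) rows, as returned by the last
    inventory query above, flag columns that need special type handling."""
    flagged = []
    for table, column, ctype in columns:
        t = ctype.lower()
        if t == "tinyint(1)":
            flagged.append((table, column, "boolean candidate"))
        elif "unsigned" in t:
            flagged.append((table, column, "no unsigned types in PostgreSQL"))
        elif t.startswith("enum"):
            flagged.append((table, column, "ENUM: varchar or native type"))
        elif t.startswith("set"):
            flagged.append((table, column, "SET: text[] or junction table"))
    return flagged
```

The resulting list maps directly onto the CAST rules you will write in the .load file later.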
Install pgloader and create an empty target database in PostgreSQL. [src1]
# Ubuntu/Debian
sudo apt-get install pgloader
# macOS
brew install pgloader
# Docker (recommended for consistent versioning)
docker pull dimitri/pgloader:latest
# Create the target PostgreSQL database (template0 is required when the
# collation differs from the template database's)
psql -U postgres -c "CREATE DATABASE mydb_pg ENCODING 'UTF8' LC_COLLATE 'en_US.UTF-8' TEMPLATE template0;"
Verify: pgloader --version returns 3.6.x and psql -U postgres -d mydb_pg -c "SELECT 1" connects successfully.
Start with a dry run to surface type and constraint mismatches before moving data. Iterate on CAST rules until the dry run completes without rejects. [src1, src3]
# Dry run — shows what pgloader would do without executing
pgloader --dry-run mysql://mysql_user:password@mysql-host:3306/mydb \
postgresql://pg_user:password@pg-host:5432/mydb_pg
# Simple one-line test migration
pgloader mysql://mysql_user:password@mysql-host:3306/mydb \
postgresql://pg_user:password@pg-host:5432/mydb_pg
# Or with Docker
docker run --rm -it dimitri/pgloader:latest \
pgloader mysql://mysql_user:password@mysql-host:3306/mydb \
postgresql://pg_user:password@pg-host:5432/mydb_pg
Verify: pgloader prints a summary table showing rows loaded per table. Check for errors: grep -E 'ERROR|FATAL' /tmp/pgloader/pgloader.log (pgloader's default log file; match upper-case severity tags only, since grep -i would also hit the lower-case "errors" column header in the summary table).
For production migrations, use a .load file with explicit CAST rules to control type mapping precisely. [src1]
-- migration.load
LOAD DATABASE
FROM mysql://mysql_user:password@mysql-host:3306/mydb
INTO postgresql://pg_user:password@pg-host:5432/mydb_pg
WITH include drop,
create tables,
create indexes,
reset sequences,
workers = 8,
concurrency = 4,
multiple readers per thread,
rows per range = 50000
SET PostgreSQL PARAMETERS
maintenance_work_mem to '512MB',
work_mem to '64MB'
CAST type tinyint to boolean using tinyint-to-boolean,
type int when unsigned to bigint drop typemod,
type bigint with extra auto_increment to bigserial,
type int with extra auto_increment to serial,
type datetime to timestamptz drop default using zero-dates-to-null,
type date drop default using zero-dates-to-null
BEFORE LOAD DO
$$ CREATE SCHEMA IF NOT EXISTS mydb; $$;
Verify: pgloader migration.load completes with zero errors. Compare row counts between source and target.
Compare row counts, check key data types, and verify constraints were created correctly. [src5, src6]
-- Check row counts per table
SELECT schemaname, relname, n_live_tup
FROM pg_stat_user_tables ORDER BY n_live_tup DESC;
-- Verify sequences are in sync
SELECT sequencename, last_value FROM pg_sequences WHERE schemaname = 'public';
-- Check indexes were created
SELECT indexname, tablename, indexdef
FROM pg_indexes WHERE schemaname = 'public' ORDER BY tablename;
-- Verify foreign keys
SELECT conname, conrelid::regclass, confrelid::regclass
FROM pg_constraint WHERE contype = 'f';
Verify: Row counts match between MySQL and PostgreSQL. Sequence last_value >= max ID in each table.
pgloader does not migrate stored procedures, triggers, or views. Rewrite them in PL/pgSQL. AWS DMS Schema Conversion Tool now offers AI-assisted conversion for complex stored procedures. [src2, src4, src8]
-- MySQL stored procedure
DELIMITER //
CREATE PROCEDURE get_active_users()
BEGIN
SELECT * FROM users WHERE is_active = 1;
END //
DELIMITER ;
-- PostgreSQL equivalent function
CREATE OR REPLACE FUNCTION get_active_users()
RETURNS SETOF users AS $$
BEGIN
RETURN QUERY SELECT * FROM users WHERE is_active = true;
END;
$$ LANGUAGE plpgsql;
-- MySQL trigger
CREATE TRIGGER before_user_update
BEFORE UPDATE ON users FOR EACH ROW
SET NEW.updated_at = NOW();
-- PostgreSQL equivalent (function + trigger)
CREATE OR REPLACE FUNCTION update_timestamp()
RETURNS TRIGGER AS $$
BEGIN
NEW.updated_at = NOW();
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER before_user_update
BEFORE UPDATE ON users
FOR EACH ROW EXECUTE FUNCTION update_timestamp();
Verify: SELECT proname FROM pg_proc WHERE pronamespace = 'public'::regnamespace; lists your functions. Test each procedure.
Switch your application to PostgreSQL-compatible SQL. Key syntax changes are in the Quick Reference table. [src5, src6]
# BEFORE: MySQL connection (Python)
import mysql.connector
conn = mysql.connector.connect(host='mysql-host', database='mydb', user='user', password='pass')
# AFTER: PostgreSQL connection (Python)
import psycopg2
conn = psycopg2.connect(host='pg-host', dbname='mydb_pg', user='user', password='pass')
# Key SQL changes:
# MySQL: SELECT * FROM users LIMIT 5, 10
# PG: SELECT * FROM users LIMIT 10 OFFSET 5
# MySQL: SELECT IFNULL(name, 'N/A') FROM users
# PG: SELECT COALESCE(name, 'N/A') FROM users
Verify: Run your application test suite. Check for SQL errors in logs.
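To get a head start on finding MySQL-only syntax in application code, the substitutions above can be turned into a rough lint pass. A sketch, not exhaustive and no substitute for the test suite; `lint_sql` and the pattern list are hypothetical:

```python
import re

# Hypothetical patterns for a quick pre-migration scan of application SQL
MYSQL_ONLY = [
    (re.compile(r"\bIFNULL\s*\(", re.I), "use COALESCE()"),
    (re.compile(r"\bGROUP_CONCAT\s*\(", re.I), "use STRING_AGG()"),
    (re.compile(r"\bLIMIT\s+\d+\s*,\s*\d+", re.I), "use LIMIT n OFFSET m"),
    (re.compile(r"\bREGEXP\b", re.I), "use the ~ operator"),
    (re.compile(r"`"), "use double quotes for identifiers"),
]

def lint_sql(query: str) -> list[str]:
    """Return hints for MySQL-only constructs found in a SQL string."""
    return [hint for pattern, hint in MYSQL_ONLY if pattern.search(query)]
```

Run it over extracted query strings (or grep your codebase with the same patterns) to build a worklist before switching the driver.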
Full script: pgloader-production-migration-with-custom-cast-rul.txt (34 lines)
-- Input: MySQL database with AUTO_INCREMENT, TINYINT(1) booleans, ENUMs, unsigned ints
-- Output: Fully migrated PostgreSQL database with correct types
LOAD DATABASE
FROM mysql://root:[email protected]:3306/production
INTO postgresql://pgadmin:[email protected]:5432/production_pg
WITH include drop,
create tables,
create indexes,
reset sequences,
workers = 8,
concurrency = 4,
multiple readers per thread,
rows per range = 50000
SET PostgreSQL PARAMETERS
maintenance_work_mem to '1GB',
work_mem to '128MB',
search_path to 'public'
CAST type tinyint to boolean using tinyint-to-boolean,
type int when unsigned to bigint drop typemod,
type smallint when unsigned to integer drop typemod,
type bigint when unsigned to numeric drop typemod,
type bigint with extra auto_increment to bigserial,
type int with extra auto_increment to serial,
type datetime to timestamptz drop default using zero-dates-to-null,
type date drop default using zero-dates-to-null,
type tinyblob to bytea using byte-vector-to-bytea,
type mediumblob to bytea using byte-vector-to-bytea,
type longblob to bytea using byte-vector-to-bytea
INCLUDING ONLY TABLE NAMES MATCHING ~/users/, ~/orders/, ~/products/
BEFORE LOAD DO
$$ DROP SCHEMA IF EXISTS public CASCADE; $$,
$$ CREATE SCHEMA public; $$;
Full script: python-automated-migration-validation-script.py (46 lines)
# Input: MySQL and PostgreSQL connection details
# Output: Comparison report of row counts and schema differences
import mysql.connector
import psycopg2
def validate_migration(mysql_config: dict, pg_config: dict) -> dict:
"""Compare row counts and detect schema mismatches after migration."""
mysql_conn = mysql.connector.connect(**mysql_config)
pg_conn = psycopg2.connect(**pg_config)
mysql_cur = mysql_conn.cursor()
pg_cur = pg_conn.cursor()
# Get MySQL tables and row counts (table_rows is an estimate for InnoDB;
# run COUNT(*) per table when you need exact figures)
mysql_cur.execute("""
SELECT table_name, table_rows
FROM information_schema.tables
WHERE table_schema = %s AND table_type = 'BASE TABLE'
""", (mysql_config['database'],))
mysql_tables = dict(mysql_cur.fetchall())
# Get PostgreSQL tables and row counts (n_live_tup is a statistics estimate;
# run ANALYZE first for fresher numbers)
pg_cur.execute("""
SELECT relname, n_live_tup
FROM pg_stat_user_tables
WHERE schemaname = 'public'
""")
pg_tables = dict(pg_cur.fetchall())
report = {'matched': [], 'mismatched': [], 'missing_in_pg': []}
for table, mysql_count in mysql_tables.items():
pg_count = pg_tables.get(table)
if pg_count is None:
report['missing_in_pg'].append(table)
elif abs(mysql_count - pg_count) <= 1:
report['matched'].append(table)
else:
report['mismatched'].append({
'table': table,
'mysql_rows': mysql_count,
'pg_rows': pg_count
})
mysql_conn.close()
pg_conn.close()
return report
result = validate_migration(
mysql_config={'host': 'mysql-host', 'database': 'mydb', 'user': 'root', 'password': 'pass'},
pg_config={'host': 'pg-host', 'dbname': 'mydb_pg', 'user': 'postgres', 'password': 'pass'}
)
print(f"Matched: {len(result['matched'])}, Mismatched: {len(result['mismatched'])}, Missing: {len(result['missing_in_pg'])}")
Full script: bash-end-to-end-migration-script-with-validation.sh (31 lines)
#!/usr/bin/env bash
# Input: MySQL source and PostgreSQL target connection details
# Output: Migrated and validated PostgreSQL database
set -euo pipefail
MYSQL_URI="mysql://root:password@mysql-host:3306/mydb"
PG_URI="postgresql://postgres:password@pg-host:5432/mydb_pg"
LOAD_FILE="migration.load"
echo "=== Step 1: Create target database ==="
psql "${PG_URI%/*}/postgres" -c "DROP DATABASE IF EXISTS mydb_pg;"
psql "${PG_URI%/*}/postgres" -c "CREATE DATABASE mydb_pg ENCODING 'UTF8';"
echo "=== Step 2: Run pgloader ==="
pgloader "$LOAD_FILE" 2>&1 | tee pgloader_output.log
echo "=== Step 3: Check for errors ==="
# Match upper-case severity tags only; grep -i would false-positive on the
# lower-case "errors" column header in pgloader's summary table
if grep -qE 'ERROR|FATAL' pgloader_output.log; then
echo "ERRORS DETECTED - review pgloader_output.log"
exit 1
fi
echo "=== Step 4: Validate row counts ==="
psql "$PG_URI" -c "
SELECT schemaname, relname AS table_name, n_live_tup AS row_count
FROM pg_stat_user_tables
WHERE schemaname = 'public'
ORDER BY n_live_tup DESC;"
echo "=== Step 5: Verify sequences ==="
psql "$PG_URI" -c "
SELECT sequencename, last_value
FROM pg_sequences
WHERE schemaname = 'public';"
echo "=== Step 6: Run ANALYZE ==="
psql "$PG_URI" -c "ANALYZE;"
echo "=== Migration complete ==="
-- BAD — MySQL allows '0000-00-00' dates, PostgreSQL rejects them
INSERT INTO events (event_date) VALUES ('0000-00-00');
-- ERROR: date/time field value out of range: "0000-00-00"
-- GOOD — Clean up in MySQL before migration
UPDATE events SET event_date = NULL WHERE event_date = '0000-00-00';
-- Or use pgloader CAST rule to handle automatically:
-- CAST type datetime to timestamptz using zero-dates-to-null
-- CAST type date drop default using zero-dates-to-null
-- BAD — Works in MySQL (case-insensitive by default) but fails in PostgreSQL
SELECT * FROM users WHERE username = 'JohnDoe';
-- PostgreSQL returns no rows if stored as 'johndoe'
-- GOOD — Explicit case-insensitive comparison in PostgreSQL
SELECT * FROM users WHERE username ILIKE 'JohnDoe';
-- Or use the citext extension for the column type
CREATE EXTENSION IF NOT EXISTS citext;
ALTER TABLE users ALTER COLUMN username TYPE citext;
-- BAD — MySQL-only syntax that breaks in PostgreSQL
SELECT * FROM orders LIMIT 10, 20; -- MySQL: LIMIT offset, row_count
INSERT INTO users SET name='Alice', age=30; -- MySQL SET syntax
SELECT IFNULL(email, 'none') FROM users; -- MySQL function
SELECT * FROM users WHERE name REGEXP '^A'; -- MySQL regex
-- GOOD — PostgreSQL-compatible syntax
SELECT * FROM orders LIMIT 20 OFFSET 10; -- Standard LIMIT/OFFSET
INSERT INTO users (name, age) VALUES ('Alice', 30); -- Standard INSERT
SELECT COALESCE(email, 'none') FROM users; -- Standard COALESCE
SELECT * FROM users WHERE name ~ '^A'; -- PostgreSQL regex (~)
-- BAD — Sequences not updated after bulk data load
INSERT INTO users (name) VALUES ('New User');
-- ERROR: duplicate key value violates unique constraint "users_pkey"
-- (sequence returns 1, which already exists)
Full script: correct-reset-sequences-to-match-max-id-values.sql (25 lines)
-- GOOD — Reset all sequences after migration
-- pgloader does this with `reset sequences` option
-- Manual reset for a specific table:
SELECT setval(
pg_get_serial_sequence('users', 'id'),
COALESCE(MAX(id), 1)
) FROM users;
-- Reset ALL sequences:
DO $$
DECLARE r RECORD;
BEGIN
FOR r IN
SELECT c.table_name, c.column_name,
pg_get_serial_sequence(c.table_name, c.column_name) AS seq
FROM information_schema.columns c
WHERE c.column_default LIKE 'nextval%'
AND c.table_schema = 'public'
LOOP
EXECUTE format(
'SELECT setval(%L, COALESCE(MAX(%I), 1)) FROM %I',
r.seq, r.column_name, r.table_name
);
END LOOP;
END $$;
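If you prefer generating the reset statements outside the database, the same logic can be produced in application code. A hypothetical sketch (`sequence_reset_sql` is illustrative, mirroring what pgloader's reset sequences option does):

```python
def sequence_reset_sql(serial_columns):
    """Generate setval() statements for (table, id_column) pairs,
    following the same pattern as the manual reset shown above."""
    stmts = []
    for table, column in serial_columns:
        stmts.append(
            f"SELECT setval(pg_get_serial_sequence('{table}', '{column}'), "
            f"COALESCE(MAX({column}), 1)) FROM {table};"
        )
    return stmts
```

Feed the output to psql after the bulk load; only use it with trusted table and column names, since the identifiers are interpolated directly into SQL.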
-- BAD — 'include drop' drops existing tables in the target database
-- If the target already has data you want to keep, it's gone
LOAD DATABASE
FROM mysql://user:pass@src/mydb
INTO postgresql://user:pass@dst/production_pg
WITH include drop -- THIS WILL DROP ALL MATCHING TABLES
-- GOOD — Use a fresh target database or omit 'include drop'
LOAD DATABASE
FROM mysql://user:pass@src/mydb
INTO postgresql://user:pass@dst/mydb_pg_staging
WITH create tables,
create indexes,
reset sequences
-- Migrate to a staging database, validate, then swap
- pgloader converts TINYINT(1) to BOOLEAN by default, but TINYINT(2), TINYINT(3), etc. become SMALLINT. If your app uses TINYINT(4) for booleans, add an explicit CAST rule. [src1]
- INT UNSIGNED (0 to 4,294,967,295) does not exist in PostgreSQL. Use BIGINT and add a CHECK (col >= 0) constraint. [src5, src6]
- MySQL's utf8 is actually 3-byte (utf8mb3). PostgreSQL uses proper UTF-8 (4-byte). Convert MySQL to utf8mb4 before migration. Fix: ALTER DATABASE mydb CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci; [src3]
- MySQL permits zero dates such as '0000-00-00'. PostgreSQL rejects them. Fix: Use pgloader's zero-dates-to-null casting function or clean data before migration. [src1]
- MySQL and PostgreSQL handle identifier case differently. Use pgloader's downcase identifiers option and avoid quoted identifiers where possible. [src2]
- Sequences are not advanced by bulk loads. Use reset sequences in pgloader or manually run setval(). [src1]
- MySQL's SET data type has no PostgreSQL equivalent. Fix: Use PostgreSQL arrays (TEXT[]) or a junction table. [src6]

# Check pgloader version
pgloader --version
# Dry run — show what pgloader would do without executing
pgloader --dry-run migration.load
# Verify PostgreSQL connection
psql -U postgres -h pg-host -d mydb_pg -c "SELECT version();"
# Compare table counts between MySQL and PostgreSQL
mysql -u root -p -e "SELECT table_name, table_rows FROM information_schema.tables WHERE table_schema='mydb' ORDER BY table_rows DESC;"
psql -U postgres -d mydb_pg -c "SELECT relname, n_live_tup FROM pg_stat_user_tables ORDER BY n_live_tup DESC;"
# Check for sequences out of sync
psql -U postgres -d mydb_pg -c "SELECT sequencename, last_value FROM pg_sequences WHERE schemaname='public';"
# Verify all constraints migrated
psql -U postgres -d mydb_pg -c "SELECT conname, contype, conrelid::regclass FROM pg_constraint WHERE connamespace='public'::regnamespace;"
# Check encoding
psql -U postgres -d mydb_pg -c "SHOW server_encoding;"
# Check PostgreSQL version
psql -U postgres -d mydb_pg -c "SELECT version();"
# Run ANALYZE after migration
psql -U postgres -d mydb_pg -c "ANALYZE VERBOSE;"
| Version | Status | Key Features | Migration Notes |
|---|---|---|---|
| PostgreSQL 17 (2024) | Current | JSON_TABLE, incremental backup, identity improvements | Preferred target — all pgloader features supported |
| PostgreSQL 16 (2023) | Stable | Logical replication improvements, pg_stat_io | Fully supported by pgloader 3.6.x |
| PostgreSQL 15 (2022) | Stable | MERGE command, JSON logging, public schema changes | Good target — MERGE simplifies app logic |
| PostgreSQL 14 (2021) | Maintenance | Multirange types, LZ4 compression | Supported, but consider upgrading target |
| PostgreSQL 13 (2020) | EOL Nov 2025 | Incremental sort, deduplication | Avoid as migration target — end of life |
| MySQL 8.4 LTS (2024) | Current LTS | Long-term support, deprecates mysql_native_password | Latest MySQL source — fully supported |
| MySQL 8.0 (2018) | Maintenance | Window functions, CTEs, JSON improvements | Most common source version |
| pgloader 3.6.9 | Current | MySQL 8.x support, improved CAST | Recommended version |
| AWS DMS 3.5.x | Current | CDC, multi-AZ, DMS Serverless, AI-assisted SCT | Use for managed cloud migrations |
| Azure DMS | Current | Online and offline migration modes | Use for Azure-hosted targets |
| Use When | Don't Use When | Use Instead |
|---|---|---|
| Need advanced features: JSONB, arrays, CTEs, window functions | Simple CRUD app working fine on MySQL | Stay on MySQL |
| Scaling beyond MySQL replication limits | Database < 1 GB with no complex queries | mysqldump + manual conversion |
| Moving to AWS Aurora PostgreSQL or Supabase | Need MySQL-specific features (spatial with MySQL GIS) | MySQL 8.x with InnoDB Cluster |
| Application already uses standard SQL | Team has deep MySQL expertise, no PG experience | Invest in MySQL tuning first |
| Cost optimization (PostgreSQL is fully open source) | Tight timeline with zero tolerance for downtime | Run both in parallel with CDC first |
| Need PostGIS for geospatial data | Only need basic geospatial queries | MySQL 8.x with spatial indexes |
- MySQL's utf8 charset is actually 3-byte (utf8mb3). PostgreSQL's UTF-8 is 4-byte. Ensure the MySQL source uses utf8mb4 before migration to avoid data loss with emoji and CJK characters.
- MySQL string comparisons are case-insensitive by default; PostgreSQL's are case-sensitive. For case-insensitive = comparisons, use ILIKE, the citext extension, or add COLLATE "und-x-icu" rules.
- AUTO_INCREMENT gaps in MySQL are preserved as-is. PostgreSQL sequences continue from the max existing value after reset sequences, not from the MySQL auto_increment counter value.
- ENUM columns are migrated to VARCHAR by pgloader. If you need PostgreSQL native ENUMs, create the types manually and alter columns after migration.