MSSQL to PostgreSQL: Common Pitfalls and How to Avoid Them
Migrating from Microsoft SQL Server (MSSQL) to PostgreSQL can reduce licensing costs and increase portability, but differences between the systems can cause subtle bugs, performance regressions, or data loss. Below are the most common pitfalls and concrete steps to avoid them.
1. Data type mismatches
- Issue: MSSQL and PostgreSQL use different type names, sizes, and behaviors (e.g., DATETIME vs TIMESTAMP, MONEY vs NUMERIC, BIT vs BOOLEAN, NVARCHAR vs TEXT).
- How to avoid:
- Inventory all column types and map them explicitly (e.g., DATETIME -> TIMESTAMP WITHOUT TIME ZONE or TIMESTAMP WITH TIME ZONE if timezone needed; MONEY -> NUMERIC(19,4); BIT -> BOOLEAN; NVARCHAR -> TEXT or VARCHAR(n)).
- Preserve precision/scale for DECIMAL/NUMERIC types.
- Test range and nullability edge cases with representative datasets.
2. Identity/auto-increment behavior
- Issue: MSSQL uses IDENTITY, PostgreSQL typically uses SERIAL/GENERATED or sequences. Differences in sequence ownership and value after bulk loads can cause duplicate key errors.
- How to avoid:
- Convert IDENTITY to GENERATED ALWAYS/ALWAYS AS IDENTITY or create sequences and set column defaults.
- After bulk inserts, set sequence values with setval() to the max existing id: SELECT setval(‘my_seq’, COALESCE((SELECT MAX(id) FROM my_table), 1));
- Use RETURNING with inserts to retrieve generated values reliably.
3. Transaction isolation and locking semantics
- Issue: Default isolation levels and locking behaviors differ; long-running operations can behave differently, causing deadlocks or blocking.
- How to avoid:
- Review transaction boundaries in application code and prefer short transactions.
- Use explicit FOR UPDATE / SELECT … FOR SHARE when needed.
- Test high-concurrency scenarios; consider tuning max_connections, statement_timeout, and vacuum settings.
4. SQL dialect and T-SQL features
- Issue: T-SQL procedural code (stored procedures, functions, TRY…CATCH, MERGE, proprietary functions) won’t run as-is in PostgreSQL.
- How to avoid:
- Inventory all stored procedures, triggers, and functions.
- Rewrite logic in PL/pgSQL or another supported language (PL/pgSQL, PL/Python).
- Replace TRY…CATCH with EXCEPTION blocks; implement equivalent MERGE logic using INSERT … ON CONFLICT or use upsert patterns.
- Use regexp_replace, string_agg, jsonb functions, and other PostgreSQL built-ins as replacements for MSSQL-specific functions.
5. Query plan and performance differences
- Issue: Execution plans and optimizers differ; indexes and queries tuned for MSSQL may perform poorly in PostgreSQL.
- How to avoid:
- Capture slow queries and analyze with EXPLAIN (ANALYZE).
- Reassess indexing strategy: consider expression indexes, partial indexes, and BRIN indexes for large, append-only tables.
- Tune configuration parameters (shared_buffers, work_mem, maintenance_work_mem, effective_cache_size).
- Use VACUUM and ANALYZE regularly; schedule autovacuum tuning for workload.
6. Collation and case-sensitivity
- Issue: Collation defaults and case-sensitivity differ; text comparisons, ordering, and unique constraints may behave