Skip to main content

Applying migrations safely

A migration that passes Squawk's lint is not automatically safe to run.

To safely apply a migration you must set a lock_timeout in Postgres. See below for more information.

safety requirements

  1. Use Squawk to lint your migrations. Follow the advice.
  2. Set a short lock_timeout (e.g. 2 seconds) within Postgres when running your migrations. If you hit the lock_timeout, great, retry your migration until it succeeds.

lock_timeout

You must configure a lock_timeout within Postgres while running your migrations. Without this your migration could block other writes while trying to grab a lock on the table.

You can set this timeout in a few ways:

  • globally in Postgres (see below)
  • for a specific database role
  • via the options connection parameter of your client
  • with the PGOPTIONS environment variable if your client supports it. Python's pyscogpg2 supports this option. (see below)
-- set the global Postgres lock timeout to 1 second
SET lock_timeout = '1s';
PGOPTIONS='-c lock_timeout=10s' ./migrate_db

With a short lock_timeout of 1 second, queries will be blocked for up to 1 second. If your migration hits the lock timeout, it will be cancelled and error, allowing the waiting queries to proceed. You should retry a migration that hits the lock timeout until it succeeds.

example

This statement requires an AccessExclusiveLock lock on "accounts". Reads and writes to "accounts" will be blocked while this statement is running.

ALTER TABLE "accounts" ADD CONSTRAINT "positive_balance" CHECK ("balance" >= 0) NOT VALID

If there is a long running query or open transaction, this ALTER TABLE statement could be blocked while it waits for an AccessExclusiveLock lock. While the statement waits for a lock, all other reads and writes sent after will wait for this statement too, bringing your application to a halt.

With a short lock_timeout of 1 second, queries will be blocked for up to 1 second. If your migration hits the lock timeout, it will be cancelled and error, allowing the waiting queries to proceed. You should retry a migration that hits the lock timeout until it succeeds.

further reading

Benchling's "Move fast and migrate things: how we automated migrations in Postgres" and GoCardless's "Zero-downtime Postgres migrations - the hard parts" provide more background on lock_timeout and statement_timeout in a production environment.