Migration Strategy Skill

When to Use This Skill

Use this skill when implementing migration strategy patterns in your codebase.

How to Use This Skill

  1. Review the patterns and examples below
  2. Apply the relevant patterns to your implementation
  3. Follow the best practices outlined in this skill

Database migration patterns for safe, zero-downtime schema changes with comprehensive rollback capabilities.

Migration Principles

Golden Rules

  1. Always backward compatible - Old code must work with new schema
  2. Deployable independently - Migrations deploy before code changes
  3. Reversible when possible - Include down migrations
  4. Tested before production - Run against production-like data
  5. Small and focused - One logical change per migration

Deployment Order

1. Deploy migration (expand)
2. Deploy new code
3. Verify functionality
4. Deploy cleanup migration (contract)
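As a hedged illustration, the expand/contract ordering above can be encoded as data so tooling can reject out-of-order deployment plans. The step names below are invented for this sketch:

```python
# Illustrative sketch: encode the expand/contract deployment order as data
# so tooling can enforce that the expand migration ships before new code.
EXPAND_CONTRACT_ORDER = [
    "deploy_expand_migration",    # additive schema change
    "deploy_new_code",            # code that works with old AND new schema
    "verify_functionality",       # smoke tests / monitoring checks
    "deploy_contract_migration",  # remove the now-unused schema
]

def is_valid_plan(steps):
    """A plan is valid if its steps appear in expand/contract order."""
    positions = [EXPAND_CONTRACT_ORDER.index(s) for s in steps]
    return positions == sorted(positions)
```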

Zero-Downtime Patterns

Adding Columns

-- Safe: Add nullable column
ALTER TABLE users ADD COLUMN phone VARCHAR(20);

-- Safe: Add column with default (PostgreSQL 11+)
ALTER TABLE users ADD COLUMN status VARCHAR(20) DEFAULT 'active';

-- Unsafe: Adding NOT NULL without default (table lock)
-- ALTER TABLE users ADD COLUMN phone VARCHAR(20) NOT NULL; -- DON'T DO THIS

-- Safe alternative: Multi-step approach
-- Step 1: Add nullable
ALTER TABLE users ADD COLUMN phone VARCHAR(20);
-- Step 2: Backfill data (batch this on large tables; see Data Migrations below)
UPDATE users SET phone = 'unknown' WHERE phone IS NULL;
-- Step 3: Add constraint
ALTER TABLE users ALTER COLUMN phone SET NOT NULL;
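The three-step rollout can be sanity-checked at the application level before step 3 runs: backfill, then verify no NULLs remain before adding the constraint. This is an illustrative in-memory simulation, not a client for any particular database:

```python
# Illustrative in-memory simulation of the three-step NOT NULL rollout.
def backfill_phone(rows, placeholder="unknown"):
    """Step 2: fill NULL phone values (mirrors the UPDATE above)."""
    for row in rows:
        if row.get("phone") is None:
            row["phone"] = placeholder
    return rows

def safe_to_set_not_null(rows, column):
    """Step 3 guard: only add NOT NULL once no NULLs remain."""
    return all(row.get(column) is not None for row in rows)
```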

Removing Columns

-- Step 1: Stop writing to column (code change)
-- Step 2: Deploy code that doesn't read column
-- Step 3: Drop column
ALTER TABLE users DROP COLUMN legacy_field;

Renaming Columns

-- DON'T: Direct rename causes downtime
-- ALTER TABLE users RENAME COLUMN name TO full_name;

-- DO: Multi-step approach
-- Step 1: Add new column
ALTER TABLE users ADD COLUMN full_name VARCHAR(255);

-- Step 2: Backfill and sync (trigger or application)
UPDATE users SET full_name = name WHERE full_name IS NULL;

CREATE OR REPLACE FUNCTION sync_name_columns()
RETURNS TRIGGER AS $$
BEGIN
    IF TG_OP = 'INSERT' OR TG_OP = 'UPDATE' THEN
        IF NEW.full_name IS NULL THEN
            NEW.full_name := NEW.name;
        END IF;
        IF NEW.name IS NULL THEN
            NEW.name := NEW.full_name;
        END IF;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER sync_names
BEFORE INSERT OR UPDATE ON users
FOR EACH ROW EXECUTE FUNCTION sync_name_columns();

-- Step 3: Update code to use new column
-- Step 4: Drop trigger and old column
DROP TRIGGER sync_names ON users;
ALTER TABLE users DROP COLUMN name;
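During the window where both columns exist, applications can dual-write instead of (or alongside) the trigger. A minimal sketch, with `prepare_user_write` as an invented helper name:

```python
# Illustrative application-level dual-write for the rename window:
# while both columns exist, writes populate both 'name' and 'full_name',
# mirroring what the database trigger does.
def prepare_user_write(values):
    values = dict(values)  # don't mutate the caller's dict
    if values.get("full_name") is None:
        values["full_name"] = values.get("name")
    if values.get("name") is None:
        values["name"] = values.get("full_name")
    return values
```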

Changing Column Types

-- Safe: Widening (VARCHAR(50) to VARCHAR(100))
ALTER TABLE users ALTER COLUMN name TYPE VARCHAR(100);

-- Unsafe: Narrowing or incompatible types
-- Use multi-step approach:

-- Step 1: Add new column
ALTER TABLE products ADD COLUMN price_cents BIGINT;

-- Step 2: Backfill
UPDATE products SET price_cents = (price * 100)::BIGINT;

-- Step 3: Sync with trigger (sync_price_columns() is a trigger function
-- analogous to sync_name_columns() above, copying price <-> price_cents)
CREATE TRIGGER sync_price
BEFORE INSERT OR UPDATE ON products
FOR EACH ROW EXECUTE FUNCTION sync_price_columns();

-- Step 4: Update code to use new column
-- Step 5: Drop old column and rename (then deploy code that reads 'price' again)
ALTER TABLE products DROP COLUMN price;
ALTER TABLE products RENAME COLUMN price_cents TO price;
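One subtle point in the backfill: naive float arithmetic can silently lose a cent. A small illustrative conversion helper using `Decimal`:

```python
from decimal import Decimal

# Illustrative backfill helper for the price -> price_cents conversion.
# Naive float math can truncate: int(19.99 * 100) == 1998. Converting via
# Decimal (or rounding explicitly) avoids corrupting values during backfill.
def price_to_cents(price):
    return int(Decimal(str(price)) * 100)
```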

Adding Indexes

-- Always use CONCURRENTLY to avoid table locks
CREATE INDEX CONCURRENTLY idx_users_email ON users (email);

-- If it fails, drop the leftover INVALID index and retry
DROP INDEX CONCURRENTLY IF EXISTS idx_users_email;
CREATE INDEX CONCURRENTLY idx_users_email ON users (email);
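The drop-and-retry loop can be automated. A hedged sketch where `execute` stands in for your database call (not a real driver API):

```python
# Illustrative retry wrapper for CREATE INDEX CONCURRENTLY. A failed
# concurrent build leaves an INVALID index behind, so it must be dropped
# before retrying. 'execute' is a placeholder for a real DB call.
def create_index_with_retry(execute, create_sql, drop_sql, retries=3):
    for _ in range(retries):
        try:
            execute(create_sql)
            return True
        except Exception:
            execute(drop_sql)  # clean up the INVALID index before retrying
    return False
```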

Adding Constraints

-- Adding NOT NULL with a default (safe in PostgreSQL 11+; no table rewrite)
ALTER TABLE users
ADD COLUMN status VARCHAR(20) DEFAULT 'active' NOT NULL;

-- Adding UNIQUE constraint
-- Step 1: Create index concurrently
CREATE UNIQUE INDEX CONCURRENTLY idx_users_email_unique ON users (email);

-- Step 2: Add constraint using index
ALTER TABLE users
ADD CONSTRAINT users_email_unique UNIQUE USING INDEX idx_users_email_unique;

-- Adding CHECK constraint without scanning the table (NOT VALID skips the scan)
ALTER TABLE orders
ADD CONSTRAINT orders_amount_positive CHECK (amount > 0) NOT VALID;

-- Validate separately (scans the table without blocking writes)
ALTER TABLE orders VALIDATE CONSTRAINT orders_amount_positive;
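Before running VALIDATE, it can help to find rows that would violate the constraint so they can be fixed first. An illustrative pure-Python pre-check:

```python
# Illustrative pre-check before validating a CHECK constraint: collect
# rows that would fail, so they can be repaired before VALIDATE runs.
def check_violations(rows, predicate):
    return [row for row in rows if not predicate(row)]
```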

Adding Foreign Keys

-- Step 1: Add constraint without validation
ALTER TABLE orders
ADD CONSTRAINT orders_customer_fk
FOREIGN KEY (customer_id) REFERENCES customers(id)
NOT VALID;

-- Step 2: Validate separately (no lock on parent table)
ALTER TABLE orders VALIDATE CONSTRAINT orders_customer_fk;

Rollback Strategies

Reversible Migrations

-- Migration: Add column
-- UP
ALTER TABLE users ADD COLUMN middle_name VARCHAR(100);

-- DOWN
ALTER TABLE users DROP COLUMN middle_name;

Data Preservation Rollback

-- UP: Rename column via copy
ALTER TABLE users ADD COLUMN full_name VARCHAR(255);
UPDATE users SET full_name = name;
-- Don't drop 'name' yet

-- DOWN: Just drop new column (data preserved in 'name')
ALTER TABLE users DROP COLUMN IF EXISTS full_name;

Feature Flags for Rollback

# Application-level rollback via feature flags
def get_user_name(user):
    if feature_enabled('use_full_name_column'):
        return user.full_name
    return user.name
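The `feature_enabled` helper is not defined in this skill; a minimal in-memory stand-in (real systems would read flags from config or a flag service) might look like:

```python
# Minimal illustrative flag store backing a feature_enabled() call.
# Real deployments would read flags from config or a flag service.
_FLAGS = {"use_full_name_column": False}

def feature_enabled(name):
    return _FLAGS.get(name, False)

def set_flag(name, value):
    _FLAGS[name] = value
```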

Migration Tools

Flyway (Java/Multi-language)

-- V1__create_users.sql
CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    email VARCHAR(255) NOT NULL UNIQUE,
    created_at TIMESTAMP DEFAULT NOW()
);

-- V2__add_user_status.sql
ALTER TABLE users ADD COLUMN status VARCHAR(20) DEFAULT 'active';

-- R__refresh_user_view.sql (repeatable)
CREATE OR REPLACE VIEW active_users AS
SELECT * FROM users WHERE status = 'active';

# Run migrations
flyway migrate

# Check status
flyway info

# Rollback (Teams edition)
flyway undo

Alembic (Python/SQLAlchemy)

# alembic/versions/001_create_users.py
from alembic import op
import sqlalchemy as sa

revision = '001'
down_revision = None

def upgrade():
    op.create_table(
        'users',
        sa.Column('id', sa.Integer(), primary_key=True),
        sa.Column('email', sa.String(255), nullable=False, unique=True),
        sa.Column('created_at', sa.DateTime(), server_default=sa.func.now())
    )

def downgrade():
    op.drop_table('users')

# Create migration
alembic revision -m "add_user_status"

# Auto-generate from models
alembic revision --autogenerate -m "add_user_status"

# Run migrations
alembic upgrade head

# Rollback one step
alembic downgrade -1

Prisma (TypeScript/JavaScript)

// schema.prisma
model User {
  id        Int      @id @default(autoincrement())
  email     String   @unique
  status    String   @default("active")
  createdAt DateTime @default(now())
}

# Create migration
npx prisma migrate dev --name add_user_status

# Apply to production
npx prisma migrate deploy

# Reset database
npx prisma migrate reset

Data Migrations

Large Table Updates

-- DON'T: Update all rows at once
-- UPDATE users SET status = 'active' WHERE status IS NULL;

-- DO: Batch updates (COMMIT inside a DO block requires PostgreSQL 11+
-- and must run outside an explicit transaction block)
DO $$
DECLARE
    batch_size INT := 10000;
    affected INT;
BEGIN
    LOOP
        UPDATE users
        SET status = 'active'
        WHERE id IN (
            SELECT id FROM users
            WHERE status IS NULL
            LIMIT batch_size
            FOR UPDATE SKIP LOCKED
        );

        GET DIAGNOSTICS affected = ROW_COUNT;
        EXIT WHEN affected = 0;

        COMMIT;
        PERFORM pg_sleep(0.1); -- Reduce load
    END LOOP;
END $$;

ETL Pattern

# Python batch migration ('db' and 'transaction' are stand-ins for your
# database client and transaction context manager)
import time

def migrate_user_status(batch_size=10000):
    while True:
        with transaction():
            users = db.execute("""
                SELECT id FROM users
                WHERE status IS NULL
                LIMIT %s
                FOR UPDATE SKIP LOCKED
            """, [batch_size])

            if not users:
                break

            ids = [u.id for u in users]
            db.execute("""
                UPDATE users SET status = 'active'
                WHERE id = ANY(%s)
            """, [ids])

        time.sleep(0.1)  # Backpressure
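The batch boundaries themselves are easy to get wrong. A small illustrative helper that splits ids into fixed-size batches, mirroring the SQL `LIMIT` loop above:

```python
# Illustrative chunking helper for batched backfills: split an id list
# into fixed-size batches, including the final short batch.
def batches(ids, batch_size):
    for i in range(0, len(ids), batch_size):
        yield ids[i:i + batch_size]
```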

Testing Migrations

Pre-Production Testing

# 1. Restore production backup to staging
pg_restore -d staging_db production_backup.dump

# 2. Run migrations
flyway migrate -url=jdbc:postgresql://staging/staging_db

# 3. Run application tests
pytest tests/integration/

# 4. Verify schema matches expected
pg_dump -s staging_db > actual_schema.sql
diff expected_schema.sql actual_schema.sql
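Step 4's raw `diff` can produce noisy failures from comments and whitespace in the dumps. An illustrative normalizer (helper names are assumptions) to compare schemas on substance:

```python
# Illustrative schema comparison: normalize two 'pg_dump -s' outputs by
# stripping SQL comments and blank lines before comparing, so cosmetic
# dump differences don't mask (or fake) real schema drift.
def normalize_schema(dump_text):
    lines = []
    for line in dump_text.splitlines():
        line = line.strip()
        if line and not line.startswith("--"):
            lines.append(line)
    return lines

def schemas_match(expected, actual):
    return normalize_schema(expected) == normalize_schema(actual)
```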

Migration Test Framework

# tests/test_migrations.py
import pytest
from alembic.config import Config
from alembic.script import ScriptDirectory
from alembic import command

@pytest.fixture
def alembic_config():
    return Config("alembic.ini")

def get_all_revisions(config):
    """All revision ids, oldest first (walk_revisions yields newest first)."""
    script = ScriptDirectory.from_config(config)
    return [rev.revision for rev in reversed(list(script.walk_revisions()))]

def test_upgrade_downgrade(alembic_config, test_db):
    """Test all migrations can upgrade and downgrade."""
    # Upgrade to head
    command.upgrade(alembic_config, "head")

    # Downgrade to base
    command.downgrade(alembic_config, "base")

    # Upgrade again
    command.upgrade(alembic_config, "head")

def test_migration_reversibility(alembic_config, test_db):
    """Test each migration is individually reversible."""
    for rev in get_all_revisions(alembic_config):
        command.upgrade(alembic_config, rev)
        command.downgrade(alembic_config, "-1")
        command.upgrade(alembic_config, rev)

CI/CD Integration

GitHub Actions

name: Database Migrations

on:
  pull_request:
    paths:
      - 'migrations/**'

jobs:
  test-migrations:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:15
        env:
          POSTGRES_PASSWORD: test
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5

    steps:
      - uses: actions/checkout@v4

      - name: Run migrations
        run: flyway migrate

      - name: Test rollback
        run: flyway undo

      - name: Re-apply migrations
        run: flyway migrate

      - name: Verify schema
        run: |
          pg_dump -s testdb > actual.sql
          diff expected_schema.sql actual.sql

Usage Examples

Plan Zero-Downtime Migration

Apply migration-strategy skill to plan zero-downtime column rename from 'name' to 'full_name' in users table

Large Data Migration

Apply migration-strategy skill to design batched migration for updating 50M rows with minimal performance impact

Setup Migration Pipeline

Apply migration-strategy skill to configure Alembic with automated testing and CI/CD integration

Success Output

When successful, this skill MUST output:

✅ SKILL COMPLETE: migration-strategy

Completed:
- [x] Migration plan created with step-by-step deployment order
- [x] Zero-downtime strategy validated (no table locks)
- [x] Rollback procedure documented and tested
- [x] Migration files generated (SQL, Alembic, Flyway, or Prisma)
- [x] Testing plan executed against production-like data
- [x] CI/CD pipeline configured for migration deployment

Outputs:
- migrations/ directory with versioned migration files
- MIGRATION-PLAN.md with deployment steps
- ROLLBACK-PLAN.md with reversal procedure
- CI/CD workflow configuration (.github/workflows/ or similar)
- Test results from staging environment

Completion Checklist

Before marking this skill as complete, verify:

  • Migration tested on production data snapshot in staging
  • All schema changes use CONCURRENTLY or non-locking equivalents
  • Backward compatibility verified (old code works with new schema)
  • Forward compatibility verified (new code works with old schema)
  • Rollback procedure documented with exact commands
  • Deployment order documented (migration → code → cleanup)
  • Data integrity constraints validated after migration
  • Performance impact measured (query time, table locks, replication lag)
  • Monitoring alerts configured for migration execution

Failure Indicators

This skill has FAILED if:

  • ❌ Migration causes table locks in production (missing CONCURRENTLY)
  • ❌ Old code breaks with new schema (backward incompatibility)
  • ❌ Rollback procedure not tested or doesn't work
  • ❌ Migration untested on production-like dataset
  • ❌ Data loss or corruption during migration
  • ❌ No deployment order documented (unclear how to deploy safely)
  • ❌ Migration timeout or performance degradation in production
  • ❌ Constraint violations discovered after migration completes
  • ❌ No monitoring for migration progress/failures

When NOT to Use

Do NOT use this skill when:

  • Working with development databases (just drop and recreate)
  • Prototyping without production data (schema evolution not critical)
  • Using schema-less databases (MongoDB, DynamoDB)
  • Migrating data between different database systems (use data-migration-patterns)
  • Migrating to cloud-native databases with different paradigms (use vendor-specific guides)
  • One-time data imports or exports (use ETL tools)
  • Schema changes in single-user applications (no zero-downtime requirements)

Anti-Patterns (Avoid)

| Anti-Pattern | Problem | Solution |
|---|---|---|
| Direct ALTER TABLE without CONCURRENTLY | Table locks block production | Use CONCURRENTLY for indexes/constraints |
| Renaming columns directly | Instant breaking change | Multi-step: add → sync → migrate code → drop old |
| Backfilling all rows at once | Long table locks | Batch updates with FOR UPDATE SKIP LOCKED |
| No rollback plan | Can't recover from failures | Document and test rollback before deploying |
| Testing only on empty database | Misses performance issues | Test on production-sized dataset |
| Ignoring replication lag | Replicas fall behind | Monitor lag, add backpressure |
| Running migrations manually | Human error, inconsistency | Automate via CI/CD with Flyway/Alembic |
| Skipping feature flag strategy | Can't roll back code changes | Use feature flags for gradual rollout |

Principles

This skill embodies:

  • #2 First Principles - Understand database locking and replication before migrating
  • #3 Keep It Simple - Small, incremental migrations over large, risky changes
  • #8 No Assumptions - Test migrations on production-like data
  • #10 Quality Over Speed - Zero-downtime migrations require careful planning
  • #11 Measure Twice, Cut Once - Test rollback before deploying forward migration
  • #13 Automation - Automate migration deployment to prevent human error

Full Principles: CODITECT-STANDARD-AUTOMATION.md