PostgreSQL Development - The Declarative Way
Introduction
It was a typical Friday afternoon deployment. Unfortunately, two developers had changed the same PostgreSQL function, each adding their change as a sequential migration file on main, and nobody had noticed. Now we had to manually reconcile what the correct merged version should be before we could proceed. Again. Something had to change: our team of four developers needed a better way to work together without stepping on each other’s toes.
At Infinitas Learning we create educational content in various European countries. This educational content is made available to teachers and learners via printed books as well as via our digital platforms.
As a company we are consolidating the digital platforms that we have into a couple of focussed ones. This also led us to build a new system that captures the outcomes of those digital interactions in one place. Given that we’re one of the largest educational publishers in Europe, we’re talking about a lot of outcomes, numbering in the millions per day.
In the Results & Insights team we are responsible for building and maintaining that system. We started out with a textbook implementation of a highly available, distributed, real-time, streaming data system using Apache Kafka, several topics, streaming Java apps, Redis and PostgreSQL. As our journey progressed we found that, although these systems are awesome, we could simplify the architecture and reduce cost significantly by using PostgreSQL only, putting us pretty much in camp Postgres for Everything.
This decision meant heavy database development. After six months of declarative schema management, we’ve cut schema change time from hours to minutes while eliminating 80% of merge conflicts. With nearly all our business logic now residing in PostgreSQL, we continuously make changes to custom functions, views and table schemas. Like many teams we started out with migrations, but at scale we faced some (probably familiar) challenges:
- Migration scripts piling up: Each change added another numbered file to review and merge
- Merge conflicts: Multiple developers working on schema changes led to conflicts in sequential migration files
- Difficult reviews: Reviewing incremental changes without seeing the before and after state was error-prone
- Testing complexity: Running the full migration chain was slow and fragile
- Production anxiety: Every deployment was a multi-step dance with potential for human error
This is how we transformed our workflow to solve these challenges: using declarative schema management, unit testing, and automated deployments.
Declarative Schema Management
What Is Declarative Schema Management?
Traditional database development uses a versioned workflow: you write sequential migration scripts (001-add-users-table.sql, 002-add-email-column.sql, etc.) that describe the steps to transform your database from one state to another.
We switched to a declarative workflow: we describe what the database should look like in its final state, and let tools generate the migration automatically by comparing states.
Here’s the key difference:
Traditional Approach:
-- migrations/001_create_users.sql
CREATE TABLE users (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL
);
-- migrations/002_add_email.sql
ALTER TABLE users ADD COLUMN email TEXT;
-- migrations/003_make_email_required.sql
ALTER TABLE users ALTER COLUMN email SET NOT NULL;
Declarative Approach:
-- schema/tables/users.sql
CREATE TABLE users (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
email TEXT NOT NULL
);
This target state is compared against the current production database when we want to deploy, and the migration is generated automatically.
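To make that concrete: if production were still on the two-column users table from the traditional example, a diff tool comparing it against the declarative file above would generate roughly this migration (illustrative output, not verbatim tool output):
-- Generated by diffing production against the target state
ALTER TABLE users ADD COLUMN email TEXT;
-- A reviewer would spot that existing rows need a backfill (or a default) before this can succeed
ALTER TABLE users ALTER COLUMN email SET NOT NULL;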
Our Schema Organization
Our PostgreSQL schema lives in .sql files organized by logical domain:
postgres/schema/
├── setup.sql # Main entry point (dependency-ordered)
├── extensions.sql # PostgreSQL extensions
├── roles.sql # Top level user roles and permissions
├── database_settings.sql # DB-level configuration
├── results/ # Results schema (our core domain)
│ ├── object_types/ # Custom types and enums
│ │ ├── learning_outcome_type.sql
│ │ ├── attempt_state.sql
│ ├── tables/ # Table definitions
│ │ ├── learning_outcomes.sql
│ │ ├── attempts.sql
│ │ ├── assignment_metadata.sql
│ ├── partitions/ # Partition configurations
│ │ ├── learning_outcomes_validated.sql
│ ├── functions/ # Internally used functions
│ │ ├── validate_learning_outcome.sql
│ │ ├── is_valid_learning_outcome_transition.sql
│ ├── views/ # Views
│ │ ├── assignment_outcomes.sql
├── public/ # Public API schema used by our platforms
│ └── functions/ # Public-facing functions
└── maintenance/ # Maintenance schema
Each file describes the complete, current definition of that object. Want to add a column? Just edit the table file directly. Need to change a function? Update the function file. No more hunting through numbered migrations to work out what happened where and what the current state is.
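For illustration, the dependency-ordered entry point can be little more than a list of includes. A minimal sketch, assuming the files are applied with psql (the real setup.sql covers many more objects):
-- setup.sql (sketch): apply declarative files in dependency order
\ir extensions.sql
\ir roles.sql
\ir database_settings.sql
\ir results/object_types/learning_outcome_type.sql
\ir results/object_types/attempt_state.sql
\ir results/tables/learning_outcomes.sql
\ir results/tables/attempts.sql
\ir results/functions/validate_learning_outcome.sql
\ir results/views/assignment_outcomes.sql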
Safety First
Of course, having your entire schema as editable .sql files creates a new risk: what if someone accidentally runs these files against production?
Every schema file is wrapped in a safety check to prevent execution on staging or production accidentally:
DO $SAFETY_CHECK$
BEGIN
    IF EXISTS (SELECT 1 FROM pg_namespace WHERE nspname = 'is_not_staging_or_production') THEN
        -- Table/function/view definition goes here
        -- The marker namespace exists only in development and CI containers
        -- On staging or production it is missing, so the definition is silently skipped
        -- This prevents accidentally running "CREATE TABLE" statements on live data
    END IF;
END
$SAFETY_CHECK$;
This namespace only exists in local development and CI containers, which prevents applying these declarative files directly to live databases. We learned this lesson the hard way when a developer accidentally ran a schema file against staging during debugging. No data was lost, but we spent an anxious hour verifying that everything was intact. The safety check ensures it can’t happen again.
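Putting it together, a declarative file such as the earlier users example ends up looking like the sketch below. The marker is simply a schema that only development and CI containers have; the CREATE SCHEMA line sketches that provisioning step and is not one of the declarative files themselves:
-- Provisioning a development or CI container (never run on staging/production)
CREATE SCHEMA IF NOT EXISTS is_not_staging_or_production;

-- schema/tables/users.sql (sketch)
DO $SAFETY_CHECK$
BEGIN
    IF EXISTS (SELECT 1 FROM pg_namespace WHERE nspname = 'is_not_staging_or_production') THEN
        CREATE TABLE IF NOT EXISTS users (
            id    SERIAL PRIMARY KEY,
            name  TEXT NOT NULL,
            email TEXT NOT NULL
        );
    END IF;
END
$SAFETY_CHECK$;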
The Benefits We’ve Seen
After half a year with this approach, the benefits are clear:
- Easier to understand: New team members can read the schema files and immediately understand what the database looks like
- Simpler reviews: PRs show the actual change to the schema, not a script that performs the change. Previously it was very difficult to focus on “what changed”: especially with functions, the migration contained the full code block with only a few lines changed, which made the actual changes very hard to see
- Fewer merge conflicts: Multiple developers can work on different tables without sequential migration conflicts, and if two of them do touch the same table/view/function, the merge is significantly easier (see also the previous bullet)
- Better tooling: Declarative schemas work beautifully with modern diff-based tools like pg-schema-diff
- Faster iteration: Want to test something out? Just edit the file and redeploy to your local container
Making the Transition
You might be wondering: how do you migrate from a migrations-based approach to declarative schemas?
For us, the transition was pretty straightforward:
- Snapshot current state: We generated the complete schema from our production database using pg_dump --schema-only
- File organization: We split the monolithic dump into our logical file structure (tables, functions, views, etc.)
- Add safety checks: Wrapped each file in the safety check mechanism
- Keep migration history: We archived our existing migrations for reference but stopped creating new ones
- One-time investment: The whole process took about a week of developer time
The key insight: you don’t need to convert your existing migration history. Because we diff against the current production database (not an empty one), you can simply continue from where you left off. Once you have a declarative snapshot of your current state, that state becomes your new baseline. Future changes are made declaratively, and migrations are generated automatically by comparing against production from that point forward.
Deployment
With schemas defined as .sql files on disk, eventually you need to ship them to your production database.
Automated Migration Generation
Here’s where declarative schema management really shines. We use GitHub Actions to automatically generate migrations:
Workflow: Create Release Script
- Spin up local environment: GitHub Actions uses our custom CLI to create a PostgreSQL container
- Deploy target schema: Apply all our declarative .sql files to the container
- Generate diff: Use pg-schema-diff to compare the local schema against staging/production
- Create Sqitch release: Generate a script that we can deploy using Sqitch
- Raise a PR: Automated PR with the generated migration for human review
The key command:
pg-schema-diff plan \
--from-dsn "postgresql://staging-database" \
--to-dsn "postgresql://local-container" \
--disable-plan-validation \
--include-schema=maintenance \
--include-schema=public \
--include-schema=results > schema_diff.sql
This generates a migration script that, when executed, will transform the staging/production database into our desired target state.
Example generated migration:
/*
Statement 0
- HAS_UNTRACKABLE_DEPENDENCIES: Dependencies, i.e. other functions used in the function
body, of non-sql functions cannot be tracked. As a result, we cannot guarantee that
function dependencies are ordered properly relative to this statement. For adds, this
means you need to ensure that all functions this function depends on are created/altered
before this statement.
*/
SET SESSION statement_timeout = 3000;
SET SESSION lock_timeout = 3000;
CREATE OR REPLACE FUNCTION results.deduplicate()
RETURNS void
LANGUAGE plpgsql
SECURITY DEFINER
AS $function$
...
$function$;
Validation Pipeline
Before any migration reaches production, it goes through multiple validation stages:
1. Manual Review of Generated Migration (release PR)
- Automated workflow creates PR with the generated migration script
- Team reviews the SQL statements for correctness and safety
- Check for breaking changes, performance implications, and edge cases
- This human check is crucial: automated tools generate the migration, but actual developers validate if it makes sense
2. PR Validation (on every pull request)
After the PR is created, automated unit tests run in parallel.
- name: Run tests with Odin
  run: |
    cd postgres
    ./odin setup ci
This runs full schema deployment + all pgTAP tests for every PR.
3. Automated Release Validation (on release PRs)
- Creates temporary Neon branch from production
- Applies the migration via Sqitch
- Reports success/failure in PR checks
- Automatically cleans up the branch
4. Staging Deployment
- Deploy via Sqitch
- Monitor for issues
5. Production Deployment
- Final review and approval
- Deploy via Sqitch during maintenance window
- Monitor and rollback if needed
Why Sqitch for Deployment?
Wait, declarative schemas but versioned deployments? Isn’t that contradictory?
Not quite. We use declarative schemas for development (how we write and maintain code) but Sqitch for deployment (how we track what’s actually running in production). Here’s why:
- Version tracking: It tracks what’s deployed where
- CI/CD integration: Works seamlessly with our pipeline
- Rollback capability: Provides revert scripts (though we rarely use them, more on that in a moment)
- Database state tracking: Stores deployment history in the database itself
Our sqitch.plan file tracks releases:
v2025-37.4/release-script 2025-09-12T14:10:46Z # Schema release from production comparison
@v2025-37.4 2025-09-12T14:10:46Z
v2025-39.1/release-script 2025-09-28T13:15:34Z # Schema release from production comparison
@v2025-39.1 2025-09-28T13:15:34Z
v2025-40.1/release-script 2025-10-03T14:23:15Z # Added outcome metadata tracking
@v2025-40.1 2025-10-03T14:23:15Z
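And because Sqitch keeps its registry in the target database itself (in a schema called sqitch by default), we can ask an environment directly what is deployed; a small sketch:
-- Most recent deployments on this database, according to the Sqitch registry
SELECT change, committed_at, committer_name
FROM sqitch.changes
ORDER BY committed_at DESC
LIMIT 5;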
About rollbacks: We rarely use Sqitch’s revert scripts because our philosophy is “fix forward.” Database rollbacks are risky as you might lose data written after the migration. Instead, if something goes wrong, we quickly deploy a fix. Our multi-stage validation (above) means production issues are extremely rare.
Key Principles That Guide Us
Through this journey, we’ve developed principles that guide our work:
1. Safety above all
“Prefer additive changes when possible. Avoid destructive actions (like dropping columns) unless required and pre-agreed.”
We bias toward safe changes:
- Add columns with defaults instead of requiring backfills
- Create new functions instead of modifying existing ones when breaking compatibility
- Use CREATE INDEX CONCURRENTLY for zero-downtime index creation (see the sketch after this list)
- Always have a rollback plan (even if we rarely use it)
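A small sketch of what this bias looks like in practice; the column and index names here are hypothetical, results.attempts is one of our tables:
-- Additive: a new column with a constant default instead of a backfill
-- (since PostgreSQL 11 this does not rewrite the table)
ALTER TABLE results.attempts ADD COLUMN reviewed BOOLEAN NOT NULL DEFAULT false;

-- Zero-downtime index creation; note that CONCURRENTLY cannot run inside a transaction block
CREATE INDEX CONCURRENTLY IF NOT EXISTS attempts_reviewed_idx
    ON results.attempts (reviewed);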
2. Testing Is Non-Negotiable
Every schema change requires corresponding tests. No exceptions. We’ve been saved multiple times by catching issues in tests before they reached production.
We use pgTAP for extensive database testing, and we’ve also added performance benchmarks that exercise queries on representative data volumes. For example, we verify that critical queries each complete in under 1ms when inserting millions of rows. This approach recently helped us partition our database, but more on that in a future blog post.
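To give a flavour of both kinds of tests, here is a minimal pgTAP sketch. The structural assertions mirror our schema layout, but the specific id value and the 1ms threshold are illustrative; performs_ok is pgTAP’s assertion that a statement finishes within a given number of milliseconds:
BEGIN;
SELECT plan(3);

-- Structure checks against the declarative schema
SELECT has_table('results', 'attempts', 'results.attempts exists');
SELECT has_function('results', 'validate_learning_outcome', 'validation function exists');

-- Performance check on representative data (assumed to be loaded beforehand)
SELECT performs_ok(
    $$ SELECT * FROM results.attempts WHERE id = 42 $$,
    1,
    'attempt lookup by id completes within 1 ms'
);

SELECT * FROM finish();
ROLLBACK;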
3. Document Decisions, Not Just Code
Our documentation includes:
- What we built (code and schema)
- Why we made specific choices (decision rationale)
- Alternatives we considered (trade-off analysis)
- Lessons learned (what went wrong and how we fixed it)
We strive to write detailed technical documentation that not just explains how to do something, but why we chose that approach. Future team members (and future us!) need to understand the reasoning, not just follow instructions blindly.
4. Invest in Developer Experience
Building our CLI tool took time, but it pays dividends every day:
- Faster local iteration
- Consistent environments
- Less cognitive load
- Nice terminal UI makes work more enjoyable
5. Continuous Improvement
From our documentation:
“We treat this workflow as a living way of working. We regularly review how it’s working in practice and evolve based on developer feedback, incidents, tooling improvements, and product needs. If something feels slow, confusing, or risky: speak up.”
This workflow isn’t perfect, and it won’t stay static. We improve it continuously as we learn.
Lessons Learned
After half a year with this approach, here is what we’ve learned:
What’s Working Well
Developer Productivity: Schema changes that used to take hours (write migration, test locally, handle conflicts) now take 5-10 minutes (edit file, run tests).
Code Review Quality: Reviewers can see the actual change to the schema, not just the migration script. This has caught bugs that would have slipped through before.
Merge Conflicts: Dramatically reduced. Multiple developers can work on different parts of the schema without touching the same migration files, and even when they do work on the same file, the merge is easy.
Testing Confidence: Extensive pgTAP tests have caught multiple issues before production. We can refactor with confidence.
Deployment Safety: Multi-stage validation (PR tests, release validation, staging deployment) gives us confidence in production deployments.
What We’d Do Differently
Start With Better Tooling Earlier
We spent months with no automation, and then with some complex bash scripts, before building our CLI. The investment in proper tooling paid off quickly; we should have made it sooner. The developer experience improvement alone justified the time spent.
Establish Performance Baselines
We only added performance testing after we noticed some regressions. Including performance benchmarks from day one would have helped us catch those issues much earlier.
Document As We Go
Some of our best documentation was written in hindsight after we reflected on decisions. Writing it during the investigation would have been more efficient and probably more detailed.
Build Test Coverage From Day One
While we had tests, we didn’t have great coverage initially. The tests we added later caught several bugs; having them from the start would have prevented some production issues. Our current policy: no schema change without a corresponding test.
When This Approach Works (And When It Doesn’t)
This Approach Works Well When:
- Multiple developers making frequent schema changes
- Teams that value PR review quality and code clarity
- Organizations willing to invest in tooling and detailed testing
- Databases where diffing is safe (read-only comparison against production)
- You have a regular release cadence that benefits from automation
This Approach May Not Fit When:
- A single developer or infrequent schema changes (the overhead usually isn’t worth it)
- Complex data migrations requiring custom procedural logic (we do those separately, not (fully) via declarative changes)
- Strict change control requiring explicit approval of every SQL statement
- Legacy constraints make switching too costly
- Your database structure changes rarely but data migrations are frequent
Conclusion: What We’d Tell Other Teams
If you’re considering a similar approach, here’s our advice:
Invest in Testing
Good tests are the foundation. Without tests, declarative schemas are risky because you’re trusting auto-generated migrations. With tests, you have more confidence.
Choose the Right Tools
You don’t have to build custom tooling to begin. Use what already exists:
- GitHub Actions: CI/CD for validations
- pgTAP: Battle-tested PostgreSQL unit testing framework
- pg-schema-diff: Our choice for generating migrations (by Stripe)
- Sqitch: For version-tracked deployments
We built our custom CLI for our specific needs (container management, performance testing, rich UI), but most teams can probably get started with just the existing tools.
Document Your Decisions
Future you will thank present you. Write down:
- Why you chose specific approaches
- Trade-offs you considered
- Experiments you ran
- Lessons you learned
Build for Your Context
Our solution works for us because:
- Our database is our team’s core product
- We have weekly releases (predictable cadence)
- We operate in the educational domain (school year cycles matter)
- We value developer experience highly
- Nearly all of our business logic lives in the database
Your context is different, so adapt accordingly.
Resources
Our Stack
- Database: PostgreSQL 17
- Container Runtime: Podman (preferred) or Docker
- Schema Diff: pg-schema-diff
- Deployment: Sqitch
- Testing: pgTAP
- Hosting: Neon
- CI/CD: GitHub Actions
Contact
Have questions or want to share your own declarative database journey? Reach out: we’d love to hear from other teams tackling similar challenges.