PostgreSQL Logical Replication, Step by Step
A practical, no-hand-waving walkthrough of setting up PostgreSQL logical replication — from a fresh source database to a working stream of changes — plus the gotchas nobody tells you about.
- postgresql
- replication
- operations
Logical replication is one of PostgreSQL’s most powerful — and most misunderstood — features. Get it right and you have the foundation for analytics replicas, multi-region disaster recovery, zero-downtime version upgrades, and live migrations. Get it wrong and you’ll be debugging WAL bloat at 2 AM.
This is a practical walkthrough of setting up native PostgreSQL logical replication between two databases. We’ll go from a fresh source to a working stream of changes flowing into a destination, then cover the gotchas nobody tells you about.
What logical replication actually is
PostgreSQL gives you two replication flavours:
- Physical (streaming) replication replicates the entire cluster at the byte level. Same major version on both sides, all-or-nothing, the standby is read-only and must mirror the primary exactly.
- Logical replication decodes the WAL into row-level changes and ships them to a subscriber. Per-table, per-database, can cross major versions, and the destination is a fully writable database in its own right.
Logical is what you want when:
- You need to replicate a subset of tables, not the whole cluster.
- The destination is on a different major version (e.g. PG 14 → PG 16 for a zero-downtime upgrade).
- The destination needs to be writable (analytics, multi-master, sharding).
- You want to transform or filter rows in flight (column lists and row filters in PG 15+).
It’s not what you want when you need exact byte-for-byte redundancy — use physical replication for that.
Prerequisites
Both source and destination must be PostgreSQL 10 or newer; PG 14+ is strongly recommended (subscription resilience and the binary copy path improved significantly through PG 16).
You’ll need:
- Source: superuser access (or a role that can create publications and a
REPLICATIONrole). - Destination: a database with the same schema for the tables you’re replicating.
- Network: the destination must reach the source on PG’s port (5432 by default).
- Disk headroom on the source: WAL is retained until the slot catches up. Plan for at least 10× your average WAL/hour rate as a buffer.
Step 1 — Configure the source
Edit postgresql.conf on the source:
wal_level = logical
max_replication_slots = 10
max_wal_senders = 10
wal_sender_timeout = 60s
Then restart the source — wal_level only takes effect on restart, not reload.
In pg_hba.conf, allow replication connections from the destination:
host all pgpipe_repl 10.0.0.0/8 scram-sha-256
Reload (SELECT pg_reload_conf();) and verify:
SHOW wal_level; -- should print 'logical'
SHOW max_replication_slots;
Step 2 — Create a replication role
Don’t replicate as a superuser. Create a dedicated role with the minimum privileges:
CREATE ROLE pgpipe_repl WITH LOGIN REPLICATION PASSWORD 'redacted';
GRANT USAGE ON SCHEMA public TO pgpipe_repl;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO pgpipe_repl;
-- Grants for tables created in the future:
ALTER DEFAULT PRIVILEGES IN SCHEMA public
GRANT SELECT ON TABLES TO pgpipe_repl;
REPLICATION is the privilege that authorises the role to open a replication connection. SELECT is needed because the initial table copy is done via COPY, not through WAL decoding.
Step 3 — Create the publication
A publication is the source-side declaration of “here are the tables I’m exposing for replication.”
-- Three named tables:
CREATE PUBLICATION analytics_pub FOR TABLE
public.orders,
public.customers,
public.line_items;
Or for an entire database (requires superuser):
CREATE PUBLICATION analytics_pub FOR ALL TABLES;
Publications can include row filters and column lists (PG 15+):
CREATE PUBLICATION fraud_pub FOR TABLE
public.transactions (id, amount, country, created_at)
WHERE (country IN ('US', 'GB', 'DE'));
Inspect what’s published:
SELECT * FROM pg_publication_tables;
Step 4 — Prepare the destination
The destination needs the same schema for the replicated tables (logical replication doesn’t replicate DDL — see “Gotchas” below). Easiest way is pg_dump --schema-only:
pg_dump -h source-host -U postgres -s \
-t public.orders -t public.customers -t public.line_items \
sourcedb | psql -h dest-host -U postgres destdb
After this, the destination has empty tables matching the source.
Step 5 — Create the subscription
On the destination:
CREATE SUBSCRIPTION analytics_sub
CONNECTION 'host=source-host dbname=sourcedb user=pgpipe_repl password=...'
PUBLICATION analytics_pub
WITH (
copy_data = true, -- snapshot existing rows on first run
streaming = on -- stream long-running transactions (PG 14+)
);
PostgreSQL will:
- Open a connection to the source.
- Create a logical replication slot named
analytics_subon the source. - Take a snapshot,
COPYall existing rows in the published tables. - Switch to streaming mode and apply changes as they arrive.
Step 6 — Verify the pipeline
On the source:
SELECT slot_name, plugin, slot_type, active,
confirmed_flush_lsn, restart_lsn,
pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(),
confirmed_flush_lsn)) AS lag_bytes
FROM pg_replication_slots;
active = true and lag_bytes close to zero means the pipeline is healthy.
On the destination:
SELECT subname, subenabled
FROM pg_subscription;
SELECT subname, received_lsn, latest_end_lsn, latest_end_time
FROM pg_stat_subscription;
Now insert a row on the source:
-- on source:
INSERT INTO public.customers (email) VALUES ('alice@example.com');
Within a few seconds, on the destination:
SELECT * FROM public.customers WHERE email = 'alice@example.com';
You should see Alice. Welcome to streaming.
Gotchas (the part nobody tells you)
1. DDL is not replicated
Run ALTER TABLE customers ADD COLUMN phone TEXT; on the source and the subscription will break the moment a row carrying that column flows through. You must apply DDL on the destination first, then on the source.
This is the single most common cause of “logical replication broke after a deploy” tickets.
2. Replica identity for UPDATE/DELETE
If a published table has no primary key, UPDATE and DELETE fail to replicate unless you set:
ALTER TABLE my_table REPLICA IDENTITY FULL;
FULL makes every UPDATE/DELETE log the full old row in the WAL — expensive on wide tables. Better: add a primary key.
3. Slot retention can fill your disk
If the destination is offline for hours, the source retains WAL until the slot catches up — there is no time-based eviction by default. Always monitor pg_replication_slots lag and alert.
To bound retention (PG 13+):
ALTER SYSTEM SET max_slot_wal_keep_size = '50GB';
SELECT pg_reload_conf();
The slot is dropped if the lag exceeds 50 GB — better to lose the slot than to fill the disk.
4. Disabling a subscription doesn’t free the slot
ALTER SUBSCRIPTION ... DISABLE; stops the destination from pulling, but the slot on the source keeps retaining WAL. To fully tear down:
-- destination:
ALTER SUBSCRIPTION analytics_sub DISABLE;
ALTER SUBSCRIPTION analytics_sub SET (slot_name = NONE);
DROP SUBSCRIPTION analytics_sub;
-- source (only if the subscription couldn't drop the slot):
SELECT pg_drop_replication_slot('analytics_sub');
5. Sequences don’t replicate
Logical replication ships row data, not sequence advances. After a cutover, you’ll need to manually setval() sequences on the new primary or your INSERTs will collide with existing IDs.
6. Large transactions can stall the apply worker
Before PG 14, a single multi-million-row transaction had to fully decode on the source before any of it was sent. PG 14 added streaming mode (streaming = on); PG 16 added parallel apply. Use the newest version you can.
Where native logical replication breaks down
Native logical replication is solid, but it has limits:
- No DDL replication. Every schema change requires manual coordination.
- No web UI. All operational state lives in
pg_*catalogue views. - No dead-letter queue. A bad row stops the pipeline and you debug from
pg_subscription_rel. - No schema remapping. The destination schema name must match the source.
- No first-class observability. You read lag from
pg_replication_slots; metrics for application teams require glue.
That’s why we built pgpipe. It uses the same logical decoding mechanism under the hood, but adds:
- A web dashboard with a setup wizard
- Automatic destination schema setup on first run (runtime DDL replication is on the roadmap, not yet shipped)
- A dead-letter queue with REST API for inspection and replay
- Schema remapping for multi-tenant and blue-green deployments
- 15 Prometheus metrics out of the box
- Apply throughput on par with native, with strict ordering preserved
Logical replication is the right primitive. pgpipe makes it production-grade without you writing the operational glue. Try it with Docker — under a minute to see replication working.
Need a hand setting up logical replication on a real production system? PG Horizon does this for clients every week — get in touch.
Need a hand with PostgreSQL?
We design, tune, and operate PostgreSQL for teams that can't afford downtime.
Talk to us