Scaling PostgreSQL for High-Traffic: Read Replicas, Sharding, and Connection Pooling Strategies
PostgreSQL scales vertically until hardware limits are reached. For high-traffic applications, horizontal scaling through read replicas and sharding distributes load across multiple servers. This guide covers implementation strategies for both approaches, plus the connection pooling and load balancing they depend on.
Read Replicas
Read replicas offload SELECT queries from the primary database using streaming replication. The primary ships Write-Ahead Log (WAL) records to replicas, maintaining near real-time synchronization.
Streaming Replication Setup
Configure the primary server in postgresql.conf:
wal_level = replica              # log enough WAL detail for physical replication
max_wal_senders = 10             # maximum concurrent WAL sender processes
wal_keep_size = 1GB              # WAL retained for replicas that fall behind
max_replication_slots = 10       # slots reservable by replicas
Create a replication user and replication slot:
-- Dedicated role for replication connections
CREATE USER replicator WITH REPLICATION ENCRYPTED PASSWORD 'secure_password';
-- Named slot so the primary retains WAL this replica still needs
SELECT pg_create_physical_replication_slot('replica1_slot');
Add to pg_hba.conf on the primary:
host replication replicator 192.168.1.0/24 scram-sha-256
Initialize the replica using pg_basebackup:
pg_basebackup -h primary_host -D /var/lib/postgresql/data -U replicator -P -v -R -S replica1_slot
The replica's standby.signal file and primary_conninfo in postgresql.conf maintain the connection. Replication slots prevent WAL retention issues during extended replica disconnections.
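To confirm the slot is attached and WAL is flowing, inspect pg_replication_slots on the primary (a quick sanity check, not a full monitoring setup):
SELECT slot_name, active, restart_lsn FROM pg_replication_slots;
An active = true row with an advancing restart_lsn indicates the replica is connected and consuming WAL.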
Replication Lag Considerations
Replicas may lag behind the primary due to network latency or heavy write workloads. Monitor lag from the primary via pg_stat_replication:
SELECT client_addr,
    pg_wal_lsn_diff(pg_current_wal_lsn(), sent_lsn) AS send_lag_bytes,
    pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes
FROM pg_stat_replication;
Note that pg_current_wal_lsn() is only valid on the primary; on a replica, compare pg_last_wal_receive_lsn() with pg_last_wal_replay_lsn() instead.
For applications requiring strong consistency, route writes and critical reads to the primary.
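A connection can verify at runtime whether it has landed on the primary or a standby:
SELECT pg_is_in_recovery();  -- false on the primary, true on a replica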
Sharding
Sharding distributes data across multiple physical servers based on a shard key. Each shard handles a subset of data, enabling horizontal scaling beyond single-server limits.
Native Partitioning
PostgreSQL declarative partitioning splits a table into smaller physical pieces on a single node:
CREATE TABLE events (
    id SERIAL,
    event_time TIMESTAMP,
    user_id INTEGER,
    data JSONB
) PARTITION BY RANGE (event_time);

CREATE TABLE events_2025_q1 PARTITION OF events
    FOR VALUES FROM ('2025-01-01') TO ('2025-04-01');
CREATE TABLE events_2025_q2 PARTITION OF events
    FOR VALUES FROM ('2025-04-01') TO ('2025-07-01');
Partition pruning optimizes queries by scanning only relevant partitions. Native partitioning operates on a single node and does not provide horizontal scaling across multiple servers.
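You can verify pruning with EXPLAIN; a query constrained to a single quarter should touch only that partition:
EXPLAIN SELECT count(*) FROM events
WHERE event_time >= '2025-02-01' AND event_time < '2025-03-01';
-- The plan scans only events_2025_q1; events_2025_q2 is pruned away.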
Distributed Sharding with Citus
Citus extends PostgreSQL with distributed query execution across multiple nodes:
-- Create distributed table
SELECT create_distributed_table('users', 'user_id');
-- Create reference table (replicated to all nodes)
SELECT create_reference_table('countries');
-- Query across shards transparently
SELECT u.name, c.name
FROM users u
JOIN countries c ON u.country_id = c.id
WHERE u.user_id = 12345;
Citus colocates related data on the same shard to minimize cross-node joins.
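For example, distributing a hypothetical orders table on the same key and colocating it with users keeps each user's rows on one shard (colocate_with is a standard create_distributed_table argument, though check your Citus version):
SELECT create_distributed_table('orders', 'user_id', colocate_with => 'users');
Joins between users and orders on user_id can then execute shard-locally instead of crossing nodes.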
Shard Key Selection
Choose shard keys that:
- Distribute data evenly across nodes
- Minimize cross-shard queries
- Align with query patterns (e.g., user_id for user-scoped data)
Avoid monotonically increasing keys like timestamps, which cause hotspots. Use UUIDs or hash-based distribution for even data spread.
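As a single-node illustration of hash-based spread, native hash partitioning assigns rows to partitions by hashing the key (table and column names here are illustrative):
CREATE TABLE sessions (
    session_id UUID,
    user_id INTEGER,
    payload JSONB
) PARTITION BY HASH (user_id);

CREATE TABLE sessions_p0 PARTITION OF sessions FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE sessions_p1 PARTITION OF sessions FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE sessions_p2 PARTITION OF sessions FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE sessions_p3 PARTITION OF sessions FOR VALUES WITH (MODULUS 4, REMAINDER 3);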
Connection Pooling
High-traffic applications require connection pooling to prevent connection exhaustion. PostgreSQL's process-based architecture creates a new process per connection, consuming significant memory.
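You can gauge how close you are to the connection limit directly:
SHOW max_connections;
SELECT count(*) AS open_connections FROM pg_stat_activity;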
PgBouncer Configuration
Install and configure PgBouncer in pgbouncer.ini:
[databases]
postgres = host=primary_host port=5432
[pgbouncer]
pool_mode = transaction
max_client_conn = 10000
default_pool_size = 100
reserve_pool_size = 10
reserve_pool_timeout = 3
Transaction pooling mode returns connections to the pool after each transaction, maximizing concurrency. Transaction pooling breaks prepared statements, advisory locks, and LISTEN/NOTIFY. Use session pooling mode if your application relies on these features.
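PgBouncer exposes an admin console on its virtual pgbouncer database; assuming your role is listed in stats_users (an assumption about your setup), you can watch pool usage live:
psql -h pgbouncer_host -p 6432 -U app_user pgbouncer -c 'SHOW POOLS;'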
PgCat for Sharded Environments
PgCat provides sharding-aware pooling and load balancing. Its pgcat.toml groups shards under a named pool; a minimal sketch follows (see pgcat.toml.example in the PgCat repository for the authoritative schema of your version):
[general]
host = "0.0.0.0"
port = 6432

[pools.sharded_db]
pool_mode = "transaction"

[pools.sharded_db.users.0]
username = "app_user"
password = "secure_password"
pool_size = 100

[pools.sharded_db.shards.0]
servers = [["shard1.example.com", 5432, "primary"]]
database = "app"

[pools.sharded_db.shards.1]
servers = [["shard2.example.com", 5432, "primary"]]
database = "app"
Load Balancing
Distribute read traffic across replicas using HAProxy:
frontend postgres_read
    bind *:5433
    mode tcp
    default_backend postgres_read_backends

backend postgres_read_backends
    mode tcp
    balance roundrobin
    option tcp-check
    server replica1 192.168.1.11:5432 check
    server replica2 192.168.1.12:5432 check
Route writes to the primary, reads to replicas. Use health checks to automatically remove failed nodes.
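Note that tcp-check only confirms the port accepts connections; a protocol-level probe such as pg_isready catches more failure modes (shown here as a manual check):
pg_isready -h 192.168.1.11 -p 5432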
Backup Strategy
Replicated and sharded environments require coordinated backup strategies. Use tools like pgBackRest, Barman, or WAL-G for physical backups with point-in-time recovery. Back up each shard independently, and remember that replication slots are not carried over by a physical restore and must be recreated. Test restore procedures regularly.
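As a minimal sketch of one option, a pgBackRest stanza for the primary might look like this (paths and retention are assumptions to adapt):
# /etc/pgbackrest/pgbackrest.conf
[global]
repo1-path=/var/lib/pgbackrest
repo1-retention-full=2

[main]
pg1-path=/var/lib/postgresql/data

# Initialize the stanza, then take a full backup
pgbackrest --stanza=main stanza-create
pgbackrest --stanza=main --type=full backup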
Getting Started
- Baseline optimization: Set shared_buffers to roughly 25% of system RAM, work_mem to 4-64MB per connection based on query complexity, and maintenance_work_mem to 512MB-2GB before scaling out (see the sketch after this list)
- Add read replicas: Start with 2-3 replicas for read-heavy workloads
- Implement connection pooling: Deploy PgBouncer or PgCat before traffic spikes
- Monitor replication lag: Set alerts for lag exceeding 1 second
- Plan sharding early: Design shard keys based on data access patterns
- Gradual migration: Move to sharding when single-node performance plateaus despite vertical scaling
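A starting point for the baseline settings above, assuming a dedicated 32GB server (tune against your own workload):
shared_buffers = 8GB             # ~25% of 32GB RAM
work_mem = 16MB                  # per sort/hash operation, per connection
maintenance_work_mem = 1GB       # VACUUM, CREATE INDEX, etc.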