Production Database Connection Monitoring Guide¶
TL;DR: Are we at risk?¶
No. The test database error ("database is being accessed by other users") will NOT occur in production because:
1. You never drop the production database (error only happens on DROP DATABASE)
2. Production uses connection pooling with health checks
3. No ThreadPoolExecutor-style thread fan-out in production code (only Gunicorn's 2 workers × 2 threads)
4. Current usage: 3/100 connections (~3% utilization)
Current Production Health: ✅ EXCELLENT¶
Database Connections: 3/100 (3% utilization)
├── Active: 1
└── Idle: 2
Gunicorn Workers: 2 workers × 2 threads = 4 max connections
Celery Services: ~3 connections (worker + beat + flower)
Connection Management: Fully configured with safeguards
What Makes Production Different from Tests¶
| Factor | Tests (Had Issues) | Production (Safe) |
|---|---|---|
| Threading | ThreadPoolExecutor, 10-20 threads | 2 workers × 2 threads only |
| Connection Pattern | Rapid create/destroy | Pooled, reused |
| Database Operations | DROP DATABASE on teardown | Normal CRUD only |
| Connection Lifetime | CONN_MAX_AGE = 0 | CONN_MAX_AGE = 60s |
| Health Checks | Disabled | Enabled ✅ |
| Worker Restarts | N/A | Every 500 requests ✅ |
Built-in Safeguards Already Configured¶
1. Connection Pooling (settings.py:338-339)¶
"CONN_MAX_AGE": 60, # Connections recycled every 60s
"CONN_HEALTH_CHECKS": True, # Auto-cleanup broken connections
2. Database Timeouts (settings.py:334-335)¶
"connect_timeout": 10, # Can't hang forever on connect
"statement_timeout": 30000, # Queries timeout after 30s
3. Worker Recycling (gunicorn_config.py:22-24)¶
max_requests = 500 # Workers restart after 500 requests
max_requests_jitter = 50 # Prevents all workers restarting at once
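For context, a sketch of how those lines fit into gunicorn_config.py, assuming the 2 workers × 2 threads layout described in this guide; only max_requests and max_requests_jitter are quoted from the file, the rest is illustrative.
# gunicorn_config.py (sketch) -- workers/threads reflect the 2 x 2 layout above
workers = 2                # 2 worker processes
threads = 2                # 2 threads each -> at most 4 concurrent DB connections
max_requests = 500         # recycle a worker after 500 requests (prevents slow leaks)
max_requests_jitter = 50   # stagger restarts so workers don't all recycle at once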
4. Resource Limits¶
PostgreSQL max_connections: 100
Gunicorn max concurrent: 4 (2 workers × 2 threads)
Current utilization: 3% ✅
Headroom: 97 connections available
Monitoring Commands¶
Check Current Connection Status¶
# Quick overview
docker compose exec db psql -U user -d piquetickets -c "
SELECT count(*) as total, state
FROM pg_stat_activity
WHERE datname = 'piquetickets'
GROUP BY state;"
# Detailed view
docker compose exec db psql -U user -d piquetickets -c "
SELECT
pid,
usename,
application_name,
client_addr,
state,
state_change,
query_start,
NOW() - query_start AS duration,
LEFT(query, 50) as query_preview
FROM pg_stat_activity
WHERE datname = 'piquetickets'
ORDER BY query_start DESC;"
Check for Connection Leaks¶
# Find idle connections older than 5 minutes
docker compose exec db psql -U user -d piquetickets -c "
SELECT
pid,
usename,
state,
NOW() - state_change AS idle_duration,
application_name
FROM pg_stat_activity
WHERE datname = 'piquetickets'
AND state = 'idle'
AND NOW() - state_change > interval '5 minutes';"
Check Connection Pool Utilization¶
# Calculate utilization percentage
docker compose exec db psql -U user -d postgres -c "
SELECT
(SELECT count(*) FROM pg_stat_activity WHERE datname = 'piquetickets') as current,
(SELECT setting::int FROM pg_settings WHERE name = 'max_connections') as max_connections,
ROUND(
100.0 * (SELECT count(*) FROM pg_stat_activity WHERE datname = 'piquetickets')::numeric /
(SELECT setting::int FROM pg_settings WHERE name = 'max_connections')::numeric,
2
) as utilization_percent;"
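If you prefer not to paste SQL by hand, the same utilization check can be scripted through Django's own connection. This is a hypothetical helper, not existing project code; it assumes the API container is named api, as in the restart commands later in this guide.
# Hypothetical helper -- prints the same figure as the utilization query above.
# Run it from a Django shell in the API container: docker compose exec api python manage.py shell
from django.db import connection
with connection.cursor() as cursor:
    cursor.execute("""
        SELECT
          (SELECT count(*) FROM pg_stat_activity WHERE datname = current_database()),
          (SELECT setting::int FROM pg_settings WHERE name = 'max_connections')
    """)
    current, max_conns = cursor.fetchone()
print(f"{current}/{max_conns} connections ({100.0 * current / max_conns:.1f}% utilization)")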
When to Be Concerned¶
🟢 Normal (Current State)¶
- ≤ 20 connections (< 20% utilization)
- All workers healthy
- No idle connections > 10 minutes
- Response times normal
🟡 Monitor Closely¶
- 20-50 connections (20-50% utilization)
- Some idle connections > 10 minutes
- Response times slightly elevated
- Action: Check for connection leaks, review slow queries
🔴 Alert / Investigate¶
- More than 50 connections (> 50% utilization)
- Many idle connections > 10 minutes
- Connection pool near exhaustion
- Response times degraded
- Action:
  - Identify the connection source:
    SELECT application_name, count(*) FROM pg_stat_activity GROUP BY application_name;
  - Check for runaway Celery tasks
  - Consider increasing max_connections temporarily
  - Review recent deployments
🚨 Critical¶
- More than 90 connections (> 90% utilization)
- New connections failing
- Service degradation
- Action:
  - Terminate idle connections (a wrapped one-liner follows this list):
    SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE state = 'idle' AND state_change < NOW() - interval '10 minutes';
  - Restart Celery workers:
    docker compose restart celery_worker
  - Restart the API:
    docker compose restart api
  - Scale up if needed
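For convenience, the terminate query above can be run non-interactively the same way as the monitoring commands; this variant adds a datname filter so it only touches this application's connections:
docker compose exec db psql -U user -d piquetickets -c "
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE datname = 'piquetickets'
  AND state = 'idle'
  AND state_change < NOW() - interval '10 minutes';"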
Connection Math (Expected Usage)¶
Normal Load¶
Gunicorn (API): 2-4 connections (workers × threads)
Celery Worker: 1-2 connections (per worker)
Celery Beat: 1 connection
Celery Flower: 1 connection
Django Admin: 0-2 connections (when in use)
Automated Tasks: 1-3 connections (periodic)
─────────────────
Total Expected: 6-13 connections ✅
Current Actual: 3 connections ✅
Safety Margin: 87 connections available
High Load (e.g., ticket sale rush)¶
Gunicorn (API): 4 connections (all threads active)
Celery Worker: 2-3 connections (processing tasks)
Celery Beat: 1 connection
Celery Flower: 1 connection
Concurrent Checkouts: 5-10 connections (temporary spikes)
─────────────────
Peak Expected: 13-19 connections ✅
Still well under limit of 100
Celery-Specific Considerations¶
Celery Connection Best Practices¶
Your Celery setup is already configured with the following safeguards (a sketch of the corresponding settings follows this list):
- Task timeouts configured: Prevents hung tasks from holding connections ✅
- Result backend = Redis: Celery results don't use DB connections ✅
- Beat scheduler = Database: Uses 1 persistent connection ✅
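A minimal sketch of the kind of settings those bullets refer to, assuming the standard CELERY_-prefixed Django settings convention; the concrete values, the Redis URL, and the django_celery_beat scheduler path are illustrative, not quoted from your configuration.
# Illustrative values only (CELERY_* settings namespace assumed)
CELERY_TASK_TIME_LIMIT = 300         # hard limit: a hung task is killed and its connection released
CELERY_TASK_SOFT_TIME_LIMIT = 270    # soft limit: the task gets a chance to clean up first
CELERY_RESULT_BACKEND = "redis://redis:6379/0"                              # results bypass the DB
CELERY_BEAT_SCHEDULER = "django_celery_beat.schedulers:DatabaseScheduler"   # beat's single DB connection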
Potential Celery Issues (To Watch)¶
- Long-running tasks: can hold DB connections longer than expected
  - Monitor: celery -A brktickets inspect active
- Task failures: failed tasks should release connections, but verify
  - Monitor: celery -A brktickets inspect stats
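Since Celery runs inside Compose (the celery_worker service referenced in the critical-action steps above), these inspect commands can be wrapped the same way as the other commands in this guide, for example:
docker compose exec celery_worker celery -A brktickets inspect active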
Logging Connection Issues¶
Add this to monitor connection pool in production:
# In settings.py, merge this into the existing LOGGING configuration
# (assumes a 'console' handler is already defined under LOGGING['handlers'])
'loggers': {
    'django.db.backends': {
        'level': 'WARNING',  # Set to 'DEBUG' temporarily to log every query
        'handlers': ['console'],
    },
},
If You Ever Need to Increase max_connections¶
# Connect to PostgreSQL
docker compose exec db psql -U user -d postgres
-- Inside psql: check the current setting
SHOW max_connections;
-- Increase it (the change takes effect only after a PostgreSQL restart)
ALTER SYSTEM SET max_connections = 200;
# Back in the shell: restart PostgreSQL
docker compose restart db
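After the restart, you can confirm the new value took effect with the same non-interactive pattern used elsewhere in this guide:
docker compose exec db psql -U user -d postgres -c "SHOW max_connections;"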
Note: Each connection uses ~10MB RAM. Current 100 limit = ~1GB RAM for connections.
Summary¶
✅ Your production setup is well-configured
- Connection pooling with health checks enabled
- Timeouts prevent runaway connections
- Workers restart regularly to prevent leaks
- Current utilization: 3% (extremely healthy)
⚠️ The test error will NOT happen in production
- The test error was specific to the DROP DATABASE operation
- Production never drops the database
- Different connection patterns (pooled vs. rapid create/destroy)
📊 Recommended monitoring frequency:
- Normal operations: check weekly
- After deployments: check daily for 3 days
- High-traffic events: check hourly during the event
- If seeing performance issues: check immediately
🔧 No immediate action required - Your configuration is production-ready!