Production Checklist
Complete this checklist before deploying OpenPrime to production.
Security
Authentication & Authorization
- Keycloak hardened
  - Admin console secured (IP allowlist or VPN)
  - Admin password changed from default
  - Brute-force protection enabled
  - Password policies configured
- HTTPS everywhere
  - Valid TLS certificates installed
  - HTTP redirects to HTTPS
  - HSTS headers enabled
- CORS configured
  - Only allowed origins specified
  - No wildcards in production
- Rate limiting enabled
  - API rate limits configured
  - Login attempt limits set
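The HSTS and CORS items above can be verified mechanically. The following is a minimal sketch, assuming you already have the response headers and the configured origin list in hand; the function names and example origins are hypothetical and not part of OpenPrime.

```python
import re
from typing import Optional


def hsts_max_age(headers: dict[str, str]) -> Optional[int]:
    """Return max-age from a Strict-Transport-Security header, or None if absent."""
    for name, value in headers.items():
        if name.lower() == "strict-transport-security":
            match = re.search(r'max-age\s*=\s*"?(\d+)"?', value, re.IGNORECASE)
            return int(match.group(1)) if match else None
    return None


def cors_origin_problems(allowed_origins: list[str]) -> list[str]:
    """List configured origins that are unsafe for production."""
    problems = []
    for origin in allowed_origins:
        if "*" in origin:
            problems.append(f"{origin}: wildcard origin")
        elif not origin.startswith("https://"):
            problems.append(f"{origin}: not HTTPS")
    return problems


# Hypothetical values for illustration only.
print(hsts_max_age({"Strict-Transport-Security": "max-age=31536000; includeSubDomains"}))
print(cors_origin_problems(["https://app.example.com", "*"]))
```

A check like this could run in CI against the staging configuration so that a wildcard origin never reaches production unnoticed.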
Secrets Management
- Encryption key secured
  - 32-byte random key generated
  - Stored in a secret manager (Vault, AWS Secrets Manager, etc.)
  - Key rotation procedure documented
- Database credentials
  - Strong passwords generated
  - Stored in a secret manager
  - Not stored in environment files
- No secrets in code
  - Git history reviewed
  - .env files gitignored
  - Secrets scanning enabled
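One way to produce the 32-byte random key is Python's `secrets` module, which draws from the operating system's cryptographic random source. This sketch assumes the key is stored base64-encoded; the encoding OpenPrime expects is not stated here, so check it before use. The helper names are hypothetical.

```python
import base64
import secrets

KEY_LENGTH_BYTES = 32


def generate_encryption_key() -> str:
    """Return a fresh 32-byte random key, base64-encoded (assumed storage format)."""
    return base64.b64encode(secrets.token_bytes(KEY_LENGTH_BYTES)).decode("ascii")


def is_valid_key(encoded_key: str) -> bool:
    """Check that a base64 string decodes to exactly 32 bytes."""
    try:
        raw = base64.b64decode(encoded_key, validate=True)
    except (ValueError, TypeError):
        return False
    return len(raw) == KEY_LENGTH_BYTES
```

Generate the key once, write it straight into the secret manager, and avoid printing it to a terminal or log where it could be captured.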
Network Security
- Database not publicly accessible
  - Private subnet only
  - Security groups configured
- Network policies
  - Pod-to-pod communication restricted
  - Egress rules defined
- WAF configured (if applicable)
  - SQL injection protection
  - XSS protection
  - Rate limiting
Infrastructure
High Availability
- Multiple replicas
  - Frontend: 2+ replicas
  - Backend: 3+ replicas
  - Database: Primary + replicas
- Multi-zone deployment
  - Pods spread across availability zones
  - Database in multi-AZ configuration
- Pod disruption budgets
  - Minimum available pods defined
  - Rolling update strategy configured
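A pod disruption budget for the backend might look like the manifest built below. This is a sketch, not OpenPrime's shipped configuration: the resource name and the `app` label are hypothetical, and the manifest is emitted as JSON, which `kubectl apply -f -` accepts alongside YAML.

```python
import json


def pod_disruption_budget(name: str, app_label: str, min_available: int) -> dict:
    """Build a policy/v1 PodDisruptionBudget manifest as a plain dict."""
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name},
        "spec": {
            "minAvailable": min_available,
            # The selector must match the labels on the backend pods (assumed label).
            "selector": {"matchLabels": {"app": app_label}},
        },
    }


# Hypothetical names; with 3 backend replicas, minAvailable=2 allows
# one pod to be evicted at a time during node drains.
manifest = pod_disruption_budget("openprime-backend-pdb", "openprime-backend", 2)
print(json.dumps(manifest, indent=2))
```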
Scalability
- Autoscaling configured
  - HPA for frontend/backend
  - CPU/memory thresholds defined
  - Max replicas set appropriately
- Resource limits
  - CPU requests/limits set
  - Memory requests/limits set
  - Tested under load
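When choosing thresholds and a maximum replica count, it helps to see how the Horizontal Pod Autoscaler arrives at a replica count. The sketch below follows the calculation described in the Kubernetes documentation (ratio of current to target metric, rounded up, with a default tolerance of 0.1), simplified to a single metric; the function name and the numbers in the example are illustrative.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int,
    max_replicas: int,
    tolerance: float = 0.1,
) -> int:
    """Simplified single-metric version of the HPA replica calculation."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: no scaling action.
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


# 3 replicas at 90% average CPU against a 60% target scale to 5.
print(desired_replicas(3, 90, 60, min_replicas=3, max_replicas=10))
```

If the peak-load result regularly hits `max_replicas`, the ceiling is probably too low for the traffic you tested.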
Database
- Backups configured
  - Automated daily backups
  - Point-in-time recovery enabled
  - Backup retention policy defined (30+ days)
  - Backup restoration tested
- Connection pooling
  - PgBouncer or similar configured
  - Max connections sized for the expected load
- Monitoring
  - Slow query logging enabled
  - Connection monitoring
  - Storage alerts
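Whether the maximum connection setting is appropriate comes down to simple arithmetic: the worst-case number of connections the application can open must fit under the database limit, with some headroom for admin and monitoring sessions. A rough sketch, with hypothetical parameter names and example figures:

```python
def pool_budget(
    max_backend_replicas: int,
    pool_size_per_replica: int,
    db_max_connections: int,
    reserved_connections: int = 10,
) -> int:
    """Return spare database connections; a negative value means over-subscription."""
    worst_case_demand = max_backend_replicas * pool_size_per_replica
    return db_max_connections - reserved_connections - worst_case_demand


# Example figures only: 10 replicas x 20 connections against a limit of 100.
print(pool_budget(10, 20, 100))
```

Use the autoscaler's maximum replica count rather than the current count, since the limit is most likely to be hit during a scale-up. A pooler such as PgBouncer changes the picture by multiplexing many client connections over fewer server connections.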
Monitoring & Observability
Logging
- Centralized logging
  - All services log to central location
  - Log retention policy defined
  - Sensitive data filtered from logs
- Log levels appropriate
  - Production: info level
  - Debug logs disabled
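Filtering sensitive data is easiest to enforce at the logger itself. The sketch below uses a standard-library `logging.Filter` to redact values that follow common secret-like keys and sets the level to info so debug output is dropped. The key list and logger name are assumptions; extend the pattern to match whatever your services actually log.

```python
import logging
import re

# Assumed list of sensitive keys; adjust for your own log formats.
SENSITIVE_PATTERN = re.compile(
    r"\b(password|token|secret|api[_-]?key)\b(\s*[=:]\s*)(\S+)",
    re.IGNORECASE,
)


class RedactSecretsFilter(logging.Filter):
    """Replace values of secret-like key=value pairs before the record is emitted."""

    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()  # Apply %-style args first.
        record.msg = SENSITIVE_PATTERN.sub(r"\1\2[REDACTED]", message)
        record.args = ()
        return True  # Keep the record, now redacted.


def build_logger(name: str = "openprime") -> logging.Logger:
    """Return a logger at INFO level with redaction attached (hypothetical name)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addFilter(RedactSecretsFilter())
    return logger
```

Pattern-based redaction is a safety net, not a guarantee: it only catches the formats it knows about, so avoid logging secrets in the first place.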
Metrics
- Application metrics
  - Request latency tracked
  - Error rates monitored
  - Custom business metrics tracked
- Infrastructure metrics
  - CPU/memory utilization
  - Disk usage
  - Network I/O
Alerting
- Critical alerts configured
  - Service down
  - High error rate (>1%)
  - High latency (p99 > 2s)
  - Database connection failures
- Alert routing
  - On-call schedule defined
  - Escalation policy configured
  - PagerDuty/Opsgenie integrated
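To make the error-rate and latency thresholds above concrete, the sketch below evaluates them over one window of request data. In practice these rules usually live in the monitoring system rather than application code; the function names and the nearest-rank percentile method are choices made for this example.

```python
import math

ERROR_RATE_THRESHOLD = 0.01      # Alert above 1% failed requests.
P99_LATENCY_THRESHOLD_S = 2.0    # Alert when p99 latency exceeds 2 seconds.


def percentile(samples: list[float], pct: int) -> float:
    """Nearest-rank percentile of a non-empty list of samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def firing_alerts(total_requests: int, failed_requests: int, latencies_s: list[float]) -> list[str]:
    """Return the names of alerts whose thresholds are exceeded in this window."""
    alerts = []
    if total_requests and failed_requests / total_requests > ERROR_RATE_THRESHOLD:
        alerts.append("high_error_rate")
    if latencies_s and percentile(latencies_s, 99) > P99_LATENCY_THRESHOLD_S:
        alerts.append("high_latency")
    return alerts
```

Note that both comparisons are strict, so a window sitting exactly at 1% errors does not fire.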
Health Checks
- Liveness probes
  - All services have liveness probes
  - Appropriate thresholds set
- Readiness probes
  - All services have readiness probes
  - Dependencies checked
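The difference between the two probes is what they inspect: liveness only reports that the process can respond, while readiness also checks dependencies. A framework-neutral sketch follows, where each dependency check is a hypothetical callable you would supply (for example, a database ping):

```python
from typing import Callable


def liveness() -> int:
    """Liveness: the process is up and able to answer. No dependency checks."""
    return 200


def readiness(checks: dict[str, Callable[[], bool]]) -> tuple[int, dict[str, str]]:
    """Readiness: return 200 only if every dependency check passes, else 503."""
    results: dict[str, str] = {}
    for name, check in checks.items():
        try:
            results[name] = "ok" if check() else "failed"
        except Exception as exc:  # A crashing check counts as not ready.
            results[name] = f"error: {exc}"
    status = 200 if all(value == "ok" for value in results.values()) else 503
    return status, results
```

Keeping dependency checks out of the liveness probe avoids restart loops when a downstream service such as the database is briefly unavailable.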
Performance
Load Testing
- Load tests completed
  - Expected peak load tested
  - 2x peak load tested
  - Response times acceptable
- Stress tests completed
  - Breaking point identified
  - Graceful degradation verified
Optimization
- Database queries optimized
  - Indexes in place
  - Slow queries identified and fixed
  - N+1 queries eliminated
- Caching configured
  - Static assets cached (CDN)
  - API responses cached where appropriate
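An N+1 query issues one query for a list of parent rows and then one more per parent. The example below shows the pattern and a single-query replacement using an in-memory SQLite database; the `projects` and `environments` tables are invented for illustration and are not OpenPrime's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE projects (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE environments (
    id INTEGER PRIMARY KEY,
    project_id INTEGER REFERENCES projects(id),
    name TEXT
);
-- Index the foreign key used by the join and the per-project lookup.
CREATE INDEX idx_environments_project_id ON environments(project_id);
INSERT INTO projects VALUES (1, 'alpha'), (2, 'beta');
INSERT INTO environments VALUES (1, 1, 'dev'), (2, 1, 'prod'), (3, 2, 'dev');
""")


def environments_n_plus_one(db: sqlite3.Connection) -> dict[str, list[str]]:
    """Anti-pattern: 1 query for projects plus 1 query per project."""
    result = {}
    for project_id, name in db.execute("SELECT id, name FROM projects ORDER BY id").fetchall():
        rows = db.execute(
            "SELECT name FROM environments WHERE project_id = ? ORDER BY id",
            (project_id,),
        )
        result[name] = [row[0] for row in rows]
    return result


def environments_single_query(db: sqlite3.Connection) -> dict[str, list[str]]:
    """Same result from one JOIN, regardless of how many projects exist."""
    query = """
        SELECT p.name, e.name
        FROM projects p
        LEFT JOIN environments e ON e.project_id = p.id
        ORDER BY p.id, e.id
    """
    result: dict[str, list[str]] = {}
    for project_name, env_name in db.execute(query):
        result.setdefault(project_name, [])
        if env_name is not None:
            result[project_name].append(env_name)
    return result
```

Both functions return the same mapping; the difference only shows up as query count, which is why N+1 patterns tend to surface in slow query logs under production-sized data.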
Operations
Deployment
- CI/CD pipeline
  - Automated tests run
  - Security scanning
  - Automated deployment
- Rollback procedure
  - Quick rollback tested
  - Database rollback plan
  - Documented procedure
- Blue-green or canary deployments
  - Zero-downtime deployments
  - Traffic shifting capability
Disaster Recovery
- Recovery plan documented
  - RTO defined (Recovery Time Objective)
  - RPO defined (Recovery Point Objective)
  - Step-by-step procedures
- DR tested
  - Backup restoration tested
  - Failover tested
  - Recovery time measured
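RPO bounds how much data you can afford to lose (the time between the last good backup and the failure), and RTO bounds how long recovery may take. A DR test can be scored against both with a few lines; the function names and timestamps below are illustrative only.

```python
from datetime import datetime, timedelta


def within_rpo(last_backup: datetime, failure_time: datetime, rpo: timedelta) -> bool:
    """True if the data lost since the last backup fits within the RPO."""
    return failure_time - last_backup <= rpo


def within_rto(failure_time: datetime, restored_time: datetime, rto: timedelta) -> bool:
    """True if the service was restored within the RTO."""
    return restored_time - failure_time <= rto


# Illustrative DR drill: backup 30 minutes before failure, restore took 2.5 hours.
failure = datetime(2024, 1, 1, 12, 0)
print(within_rpo(datetime(2024, 1, 1, 11, 30), failure, rpo=timedelta(hours=1)))
print(within_rto(failure, datetime(2024, 1, 1, 14, 30), rto=timedelta(hours=2)))
```

Recording these two results for every drill gives the measured recovery time the checklist asks for, and shows whether the objectives are realistic.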
Documentation
- Runbooks created
  - Common issues documented
  - Escalation procedures
  - Contact information
- Architecture documented
  - Network diagrams
  - Data flow diagrams
  - Integration points
Compliance
Data Protection
- Data classification
  - Sensitive data identified
  - Encryption requirements met
- Retention policies
  - Data retention defined
  - Deletion procedures documented
Audit
- Audit logging
  - User actions logged
  - Admin actions logged
  - Logs are tamper-proof
- Access reviews
  - Regular access reviews scheduled
  - Principle of least privilege applied
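One common technique for making audit logs tamper-evident is a hash chain, where each entry's hash covers the previous entry's hash. This is a sketch of that general technique, not a description of how OpenPrime stores audit logs; the entry layout and function names are assumptions.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # Placeholder "previous hash" for the first entry.


def _entry_hash(prev_hash: str, event: dict) -> str:
    """Hash the previous hash together with a canonical JSON form of the event."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(log: list[dict], event: dict) -> None:
    """Append an event, linking it to the entry before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    log.append({"event": event, "prev_hash": prev_hash, "hash": _entry_hash(prev_hash, event)})


def verify_log(log: list[dict]) -> bool:
    """Recompute the chain; any edited, removed, or reordered entry breaks it."""
    prev_hash = GENESIS_HASH
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _entry_hash(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

A chain like this detects modification but does not prevent it, and someone who can rewrite the entire log can rebuild the chain. Shipping the latest hash to a separate system or using write-once storage closes that gap.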
Final Verification
Pre-Launch
- All checklist items completed
- Security review completed
- Load testing passed
- DR test completed
- Team trained on operations
Launch Day
- Monitoring dashboards ready
- On-call schedule confirmed
- Rollback plan ready
- Communication plan ready
- Support channels ready