# Docker Compose Orchestration
## Overview
Changemaker Lite V2 uses Docker Compose to orchestrate 20+ microservices in a single unified stack. This approach simplifies deployment, provides service isolation, and ensures consistent environments across development and production.
**Key Benefits:**
- **Single Configuration File**: All services defined in `docker-compose.yml`
- **Automatic Networking**: All containers communicate via a shared bridge network
- **Health Checks**: 7 critical services have automated health monitoring
- **Volume Persistence**: Database, uploads, and configuration data persisted across restarts
- **Profile Support**: Optional monitoring stack behind `--profile monitoring` flag
- **Container Dependencies**: Services start in correct order via `depends_on` relationships
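The last two benefits map directly onto Compose keys. An illustrative fragment (service names match this stack; the grouping under `services:` is shown for context):

```yaml
services:
  grafana:
    profiles: ["monitoring"]        # started only with --profile monitoring
  api:
    depends_on:
      v2-postgres:
        condition: service_healthy  # wait for the DB health check to pass
```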
**Architecture:**
The V2 stack consolidates all services into a single Docker Compose file, replacing the fragmented V1 approach. Services are organized into logical groups: Core (API, database, admin), Supporting (NocoDB, Listmonk, Gitea), Media (media-api, public-media), and Monitoring (Prometheus, Grafana, exporters).
---
## Service Architecture
```mermaid
graph TB
    subgraph "Core Services"
        NGINX[nginx<br/>:80, :443]
        API[api<br/>Express :4000]
        MEDIA[media-api<br/>Fastify :4100]
        ADMIN[admin<br/>Vite :3000]
        PG[v2-postgres<br/>PostgreSQL 16]
        REDIS[redis<br/>:6379]
    end
    subgraph "Supporting Services"
        NOCODB[nocodb-v2<br/>:8091]
        LISTMONK[listmonk-app<br/>:9000]
        LISTMONK_DB[listmonk-db<br/>PostgreSQL 17]
        MAILHOG[mailhog<br/>:8025]
        GITEA[gitea-app<br/>:3000]
        GITEA_DB[gitea-db<br/>MySQL 8]
        N8N[n8n<br/>:5678]
        MKDOCS[mkdocs<br/>:8000]
        CODE[code-server<br/>:8080]
        HOMEPAGE[homepage<br/>:3000]
        MINIQR[mini-qr<br/>:8080]
    end
    subgraph "Media Services"
        PUBLIC_MEDIA[public-media<br/>:80]
    end
    subgraph "Tunnel Services"
        NEWT[newt<br/>Pangolin connector]
    end
    subgraph "Monitoring Services (profile: monitoring)"
        PROMETHEUS[prometheus<br/>:9090]
        GRAFANA[grafana<br/>:3000]
        CADVISOR[cadvisor<br/>:8080]
        NODE_EXPORTER[node-exporter<br/>:9100]
        REDIS_EXPORTER[redis-exporter<br/>:9121]
        ALERTMANAGER[alertmanager<br/>:9093]
        GOTIFY[gotify<br/>:80]
    end
    NGINX --> API
    NGINX --> MEDIA
    NGINX --> ADMIN
    NGINX --> NOCODB
    NGINX --> LISTMONK
    NGINX --> GITEA
    NGINX --> N8N
    NGINX --> MKDOCS
    NGINX --> CODE
    NGINX --> HOMEPAGE
    NGINX --> MINIQR
    NGINX --> MAILHOG
    NGINX --> PUBLIC_MEDIA
    API --> PG
    API --> REDIS
    MEDIA --> PG
    ADMIN --> API
    ADMIN --> MEDIA
    NOCODB --> PG
    LISTMONK --> LISTMONK_DB
    GITEA --> GITEA_DB
    NEWT --> NGINX
    PROMETHEUS --> API
    PROMETHEUS --> REDIS_EXPORTER
    PROMETHEUS --> CADVISOR
    PROMETHEUS --> NODE_EXPORTER
    GRAFANA --> PROMETHEUS
    ALERTMANAGER --> PROMETHEUS
```
---
## Core Services
### v2-postgres
**Purpose**: PostgreSQL 16 database for V2 platform (main app + NocoDB metadata)
**Configuration**:
```yaml
v2-postgres:
  image: postgres:16-alpine
  container_name: changemaker-v2-postgres
  restart: unless-stopped
  ports:
    - "127.0.0.1:5433:5432" # Localhost only
  environment:
    POSTGRES_USER: ${V2_POSTGRES_USER:-changemaker}
    POSTGRES_PASSWORD: ${V2_POSTGRES_PASSWORD}
    POSTGRES_DB: ${V2_POSTGRES_DB:-changemaker_v2}
  volumes:
    - v2-postgres-data:/var/lib/postgresql/data
    - ./api/prisma/init-nocodb-db.sh:/docker-entrypoint-initdb.d/init-nocodb-db.sh:ro
  healthcheck:
    test: ["CMD-SHELL", "pg_isready -U changemaker"]
    interval: 10s
    timeout: 5s
    retries: 5
```
**Key Features**:
- Alpine image for minimal footprint
- `init-nocodb-db.sh` creates separate `nocodb_meta` database on first startup
- Health check uses `pg_isready` for fast readiness detection
- Port bound to `127.0.0.1` to prevent external access
**Volumes**:
- `v2-postgres-data`: Persistent PostgreSQL data directory
**Dependencies**: None (starts first)
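The init script itself is not reproduced here; a hypothetical sketch of what such a script does (the real one lives at `./api/prisma/init-nocodb-db.sh`):

```shell
#!/bin/sh
# Hypothetical sketch of init-nocodb-db.sh (the real script runs once
# from /docker-entrypoint-initdb.d/ inside the postgres image)
set -eu
DB_NAME="nocodb_meta"
# \gexec executes the SELECTed string as a statement, so CREATE DATABASE
# only fires when the database does not already exist
SQL="SELECT 'CREATE DATABASE ${DB_NAME}' WHERE NOT EXISTS (SELECT FROM pg_database WHERE datname = '${DB_NAME}')\\gexec"
# psql is always present inside the postgres image; guarded here for illustration
if command -v psql >/dev/null 2>&1; then
  printf '%s\n' "${SQL}" | psql -v ON_ERROR_STOP=1 -U "${POSTGRES_USER:-postgres}" -d "${POSTGRES_DB:-postgres}" \
    || echo "psql could not connect (expected outside the container)"
fi
```

Scripts in `/docker-entrypoint-initdb.d/` run only when the data volume is empty, which is why `nocodb_meta` appears on first startup but is never touched on restarts.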
---
### redis
**Purpose**: Shared Redis instance for sessions, BullMQ job queues, rate limiting, and geocoding cache
**Configuration**:
```yaml
redis:
  image: redis:7-alpine
  container_name: redis-changemaker
  command: redis-server --appendonly yes --maxmemory 512mb --maxmemory-policy allkeys-lru --requirepass "${REDIS_PASSWORD}"
  ports:
    - "6379:6379"
  volumes:
    - redis-data:/data
  healthcheck:
    test: ["CMD", "redis-cli", "-a", "${REDIS_PASSWORD}", "ping"]
    interval: 10s
    timeout: 5s
    retries: 5
  deploy:
    resources:
      limits:
        cpus: '1'
        memory: 512M
      reservations:
        cpus: '0.25'
        memory: 256M
```
**Key Features**:
- **Authentication required**: `--requirepass` flag enforces password on all connections
- **AOF persistence**: `--appendonly yes` logs every write command to an append-only file so data survives restarts
- **Memory limits**: 512MB max with LRU eviction policy
- **Resource constraints**: Prevents Redis from consuming excessive host resources
**Volumes**:
- `redis-data`: Persistent AOF log and RDB snapshots
**Security Note**: As of Security Audit 2025-02-11, Redis authentication is **REQUIRED** in production. Set a strong `REDIS_PASSWORD` in `.env`.
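One way to generate a strong password and record it (assumes `openssl` is installed; `.env` is the project-root env file):

```shell
# Generate a 44-character random password and append it to .env
REDIS_PASSWORD_VALUE="$(openssl rand -base64 32)"
echo "REDIS_PASSWORD=${REDIS_PASSWORD_VALUE}" >> .env
```

Restart `redis` and `api` afterwards so both pick up the new value.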
---
### api
**Purpose**: Unified Express.js API (TypeScript, Prisma ORM)
**Configuration**:
```yaml
api:
  build:
    context: ./api
    target: development
  container_name: changemaker-v2-api
  restart: unless-stopped
  ports:
    - "${API_PORT:-4000}:4000"
    - "${LISTMONK_PROXY_PORT:-9002}:9002"
  healthcheck:
    test: ["CMD", "wget", "-q", "--spider", "http://localhost:4000/api/health"]
    interval: 15s
    timeout: 5s
    retries: 3
    start_period: 30s
  environment:
    - NODE_ENV=${NODE_ENV:-development}
    - PORT=4000
    - DATABASE_URL=postgresql://${V2_POSTGRES_USER}:${V2_POSTGRES_PASSWORD}@changemaker-v2-postgres:5432/${V2_POSTGRES_DB}
    - REDIS_URL=redis://:${REDIS_PASSWORD}@redis-changemaker:6379
    - JWT_ACCESS_SECRET=${JWT_ACCESS_SECRET}
    - JWT_REFRESH_SECRET=${JWT_REFRESH_SECRET}
    # ... 30+ additional env vars (see .env.example)
  volumes:
    - ./api:/app
    - /app/node_modules
    - ./assets/uploads:/app/uploads
    - ./mkdocs:/mkdocs:rw
    - ./data:/data:ro
    - /var/run/docker.sock:/var/run/docker.sock # For Docker service management
  depends_on:
    v2-postgres:
      condition: service_healthy
    redis:
      condition: service_healthy
```
**Key Features**:
- Waits for PostgreSQL + Redis to be healthy before starting
- Mounts source code for live reloading in development
- Docker socket access for managing MkDocs/Code Server containers
- Health check on `/api/health` endpoint with 30s startup grace period
- Exposes Listmonk proxy on port 9002 (OAuth integration)
**Volumes**:
- `./api:/app`: Live code reloading
- `/app/node_modules`: Prevents host node_modules conflicts
- `./assets/uploads:/app/uploads`: Shared upload directory
- `./mkdocs:/mkdocs:rw`: MkDocs export target
- `./data:/data:ro`: NAR import data (read-only)
- `/var/run/docker.sock`: Docker API access
**Environment Variables**: See [Environment Variables](environment-variables.md) for complete reference.
---
### media-api
**Purpose**: Fastify microservice for video library management (Drizzle ORM)
**Configuration**:
```yaml
media-api:
  build:
    context: ./api
    dockerfile: Dockerfile.media
    target: development
  container_name: changemaker-media-api
  restart: unless-stopped
  ports:
    - "${MEDIA_API_PORT:-4100}:4100"
  healthcheck:
    test: ["CMD", "wget", "-q", "--spider", "http://127.0.0.1:4100/health"]
    interval: 15s
    timeout: 5s
    retries: 3
    start_period: 30s
  environment:
    - NODE_ENV=${NODE_ENV:-development}
    - MEDIA_API_PORT=4100
    - DATABASE_URL=postgresql://... # Same DB as main API
    - ENABLE_MEDIA_FEATURES=${ENABLE_MEDIA_FEATURES:-true}
    - MAX_UPLOAD_SIZE_GB=${MAX_UPLOAD_SIZE_GB:-10}
  volumes:
    - ./api:/app
    - /app/node_modules
    - ${MEDIA_ROOT:-./media}:/media:ro
    - ${MEDIA_ROOT:-./media}/local/inbox:/media/local/inbox:rw # Upload inbox
  depends_on:
    v2-postgres:
      condition: service_healthy
```
**Key Features**:
- Separate Dockerfile (`Dockerfile.media`) with FFmpeg/FFprobe installed
- Shares PostgreSQL database with main API (different ORM)
- Media library mounted read-only, inbox writable for uploads
- 10GB upload size limit (configurable)
**Volumes**:
- `${MEDIA_ROOT}:/media:ro`: Read-only media library
- `${MEDIA_ROOT}/local/inbox:/media/local/inbox:rw`: **RW mount required** for video uploads
**Important**: The inbox directory **must** have `:rw` flag; main library stays `:ro` for security.
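When preparing a fresh host, the expected layout can be created up front (paths assume the default `MEDIA_ROOT=./media`):

```shell
# Create the library root and the writable upload inbox
mkdir -p ./media/local/inbox
# The container user must be able to write into the inbox
chmod 755 ./media/local/inbox
```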
---
### admin
**Purpose**: React admin GUI (Vite dev server in development, Nginx in production)
**Configuration**:
```yaml
admin:
  build:
    context: ./admin
    target: development
  container_name: changemaker-v2-admin
  restart: unless-stopped
  ports:
    - "${ADMIN_PORT:-3000}:3000"
  healthcheck:
    test: ["CMD", "wget", "-q", "--spider", "http://127.0.0.1:3000/"]
    interval: 30s
    timeout: 5s
    retries: 3
    start_period: 20s
  environment:
    - VITE_API_URL=http://changemaker-v2-api:4000
    - VITE_MEDIA_API_URL=http://changemaker-media-api:4100
    - VITE_MKDOCS_URL=http://mkdocs-changemaker:8000
  volumes:
    - ./admin:/app
    - /app/node_modules
  depends_on:
    - api
```
**Key Features**:
- Vite environment variables use **container hostnames** (not localhost)
- Health check on root path (Vite dev server responds with HTML)
- Live reloading via mounted source code
**Environment Variables**:
- `VITE_API_URL`: Points to API container (not localhost)
- `VITE_MEDIA_API_URL`: Points to media-api container
- `VITE_MKDOCS_URL`: Points to MkDocs container for iframe embed
**Production Build**: Swap `target: development` to `target: production` and serve static files via Nginx.
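One way to flip that switch without editing the base file is a Compose override (the filename `docker-compose.prod.yml` is an assumption, not a shipped file):

```yaml
# docker-compose.prod.yml — apply with:
#   docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d
services:
  admin:
    build:
      context: ./admin
      target: production
```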
---
### nginx
**Purpose**: Reverse proxy with subdomain routing, SSL termination, and iframe embedding support
**Configuration**:
```yaml
nginx:
  build:
    context: ./nginx
  container_name: changemaker-v2-nginx
  restart: unless-stopped
  ports:
    - "${NGINX_HTTP_PORT:-80}:80"
    - "${NGINX_HTTPS_PORT:-443}:443"
    - "8881:8881" # NocoDB embed proxy
    - "8882:8882" # n8n embed proxy
    - "8883:8883" # Gitea embed proxy
    - "8884:8884" # MailHog embed proxy
    - "8885:8885" # Mini QR embed proxy
  healthcheck:
    test: ["CMD", "sh", "-c", "wget -q --spider http://127.0.0.1:80/ && pgrep crond"]
    interval: 30s
    timeout: 5s
    retries: 3
  environment:
    - PANGOLIN_SITE_ID=${PANGOLIN_SITE_ID:-}
  volumes:
    - ./nginx/conf.d:/etc/nginx/conf.d:ro
    - ./public-web:/usr/share/nginx/public-web:ro
    - ./configs/pangolin:/etc/pangolin:ro
  depends_on:
    - api
    - admin
```
**Key Features**:
- **Subdomain routing**: `api.cmlite.org`, `app.cmlite.org`, `db.cmlite.org`, etc.
- **Embed proxy ports**: 888x ports strip `X-Frame-Options` for iframe embedding
- **Health check**: Validates both HTTP server + cron daemon (for cert renewal)
- **Read-only configs**: Prevents accidental modification
**Configuration Files**:
- `nginx.conf`: Global settings, gzip, security headers
- `conf.d/default.conf`: Localhost fallback + path-based routing
- `conf.d/api.conf`: API subdomain routing (**media endpoints must come before `/api/`**)
- `conf.d/services.conf`: All supporting services + CSP headers
See [Nginx Configuration](nginx.md) for complete routing details.
---
### nocodb-v2
**Purpose**: Read-only database browser for V2 schema
**Configuration**:
```yaml
nocodb-v2:
  image: nocodb/nocodb:latest
  container_name: changemaker-v2-nocodb
  restart: unless-stopped
  ports:
    - "${NOCODB_V2_PORT:-8091}:8080"
  healthcheck:
    test: ["CMD", "wget", "-q", "--spider", "http://localhost:8080/api/v1/health"]
    interval: 30s
    timeout: 5s
    retries: 3
    start_period: 30s
  environment:
    NC_DB: "pg://changemaker-v2-postgres:5432?u=${V2_POSTGRES_USER}&p=${V2_POSTGRES_PASSWORD}&d=nocodb_meta"
    NC_ADMIN_EMAIL: ${NC_ADMIN_EMAIL:-admin@cmlite.org}
    NC_ADMIN_PASSWORD: ${NC_ADMIN_PASSWORD}
    NC_PUBLIC_URL: ${NC_PUBLIC_URL:-http://localhost:8091}
  volumes:
    - nocodb-v2-data:/usr/app/data
  depends_on:
    v2-postgres:
      condition: service_healthy
```
**Key Features**:
- Uses separate `nocodb_meta` database (auto-created by `init-nocodb-db.sh`)
- Health check via NocoDB API endpoint
- Read-only access recommended (grant SELECT only in production)
**Volumes**:
- `nocodb-v2-data`: NocoDB's internal file storage
**Access**: http://localhost:8091 or http://db.cmlite.org (via subdomain routing)
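The read-only recommendation can be implemented with a dedicated PostgreSQL role. A sketch, assuming the stack is running; the role name `nocodb_ro` and its password are placeholders (point `NC_DB` at this role afterwards):

```shell
# Sketch: create a SELECT-only role for NocoDB (role name/password are placeholders)
set -eu
RO_SQL="
CREATE ROLE nocodb_ro LOGIN PASSWORD 'change-me';
GRANT CONNECT ON DATABASE changemaker_v2 TO nocodb_ro;
GRANT USAGE ON SCHEMA public TO nocodb_ro;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO nocodb_ro;
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO nocodb_ro;
"
# Apply only when the compose stack is actually up
if docker compose ps v2-postgres >/dev/null 2>&1; then
  printf '%s' "${RO_SQL}" | docker compose exec -T v2-postgres psql -U changemaker -d changemaker_v2
fi
```

The `ALTER DEFAULT PRIVILEGES` line matters: without it, tables created by future Prisma migrations would be invisible to the role.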
---
## Supporting Services
### listmonk-app
**Purpose**: Email marketing platform for newsletters (V2 syncs subscribers via REST API)
**Configuration**:
```yaml
listmonk-app:
  image: listmonk/listmonk:latest
  container_name: listmonk-app
  restart: unless-stopped
  ports:
    - "${LISTMONK_PORT:-9001}:9000"
  healthcheck:
    test: ["CMD", "wget", "-q", "--spider", "http://localhost:9000/"]
    interval: 30s
    timeout: 5s
    retries: 3
    start_period: 30s
  depends_on:
    - listmonk-db
  command: [sh, -c, "./listmonk --install --idempotent --yes --config '' && ./listmonk --upgrade --yes --config '' && ./listmonk --config ''"]
  environment:
    LISTMONK_app__address: 0.0.0.0:9000
    LISTMONK_db__host: listmonk-db
    LISTMONK_db__user: ${LISTMONK_DB_USER:-listmonk}
    LISTMONK_db__password: ${LISTMONK_DB_PASSWORD}
    LISTMONK_ADMIN_USER: ${LISTMONK_WEB_ADMIN_USER:-admin}
    LISTMONK_ADMIN_PASSWORD: ${LISTMONK_WEB_ADMIN_PASSWORD}
  volumes:
    - ./assets/uploads:/listmonk/uploads:rw
```
**Key Features**:
- **Idempotent init**: `--install --idempotent` runs migrations on every start (safe)
- **Auto-upgrade**: `--upgrade --yes` applies schema upgrades
- **Shared uploads**: Uses same upload directory as main API
**Database**: Uses separate PostgreSQL 17 instance (`listmonk-db`)
**API Integration**: V2 API syncs participants/locations to Listmonk lists via REST API (opt-in via `LISTMONK_SYNC_ENABLED=true`)
---
### listmonk-db
**Purpose**: PostgreSQL 17 database for Listmonk
**Configuration**:
```yaml
listmonk-db:
  image: postgres:17-alpine
  container_name: listmonk-db
  restart: unless-stopped
  ports:
    - "127.0.0.1:5432:5432" # Localhost only
  environment:
    POSTGRES_USER: ${LISTMONK_DB_USER:-listmonk}
    POSTGRES_PASSWORD: ${LISTMONK_DB_PASSWORD}
    POSTGRES_DB: ${LISTMONK_DB_NAME:-listmonk}
  healthcheck:
    test: ["CMD-SHELL", "pg_isready -U listmonk"]
    interval: 10s
    timeout: 5s
    retries: 6
  volumes:
    - listmonk-data:/var/lib/postgresql/data
```
**Key Features**:
- Separate PostgreSQL instance (not shared with V2 database)
- Port bound to `127.0.0.1` for security
**Volumes**:
- `listmonk-data`: Persistent Listmonk database
---
### listmonk-init
**Purpose**: One-shot container to create Listmonk API user for V2 integration
**Configuration**:
```yaml
listmonk-init:
  image: postgres:17-alpine
  container_name: listmonk-init
  depends_on:
    listmonk-app:
      condition: service_started
  restart: "no" # Runs once and exits
  environment:
    PGPASSWORD: ${LISTMONK_DB_PASSWORD}
    LISTMONK_API_USER: ${LISTMONK_API_USER:-v2-api}
    LISTMONK_API_TOKEN: ${LISTMONK_API_TOKEN}
  entrypoint: ["/bin/sh", "-c"]
  command:
    - |
      # Wait for Listmonk to create tables
      for i in $(seq 1 30); do
        if psql -h listmonk-db -U listmonk -d listmonk -c "SELECT 1 FROM users LIMIT 1" >/dev/null 2>&1; then
          break
        fi
        sleep 2
      done
      # Upsert API user
      psql -h listmonk-db -U listmonk -d listmonk -q <<'SQL'
      -- (API-user upsert SQL elided in source; see docker-compose.yml)
      SQL
```
**Key Features**:
- Polls until Listmonk's schema exists, then upserts the `v2-api` API user
- `restart: "no"` ensures the container runs once and exits
---
## Troubleshooting
### Port Conflicts
**Problem**: A service fails to start because its host port is already bound by another process
**Solution**:
```bash
# Find the process holding the port (example: 3000)
sudo lsof -i :3000
# Override the port in .env (example override)
echo "ADMIN_PORT=3005" >> .env
# Restart service
docker compose up -d api
```
**Common conflicts**:
- Port 3000: Homepage, Grafana, admin (set `ADMIN_PORT=3005`)
- Port 4000: API, MkDocs v1 (set `MKDOCS_PORT=4003`)
- Port 5432: Listmonk DB, system PostgreSQL (bind to 127.0.0.1 in compose file)
---
### Volume Permission Issues
**Problem**: `EACCES: permission denied` or `mkdir: cannot create directory`
**Cause**: Container user mismatch with host filesystem
**Solution**:
```bash
# Fix ownership (run on host)
sudo chown -R $USER:$USER ./api ./admin ./mkdocs ./assets
# Set USER_ID/GROUP_ID in .env
id -u # Get your UID
id -g # Get your GID
echo "USER_ID=$(id -u)" >> .env
echo "GROUP_ID=$(id -g)" >> .env
# Recreate containers
docker compose up -d --force-recreate
```
**Services using user mapping**:
- `mkdocs`: `user: "${USER_ID}:${GROUP_ID}"`
- `code-server`: `user: "${USER_ID}:${GROUP_ID}"`
- `homepage`: `PUID=${USER_ID}, PGID=${DOCKER_GROUP_ID}`
---
### Network Issues
**Problem**: Containers can't communicate (e.g., API can't reach Redis)
**Solution**:
```bash
# Verify network exists
docker network ls | grep changemaker-lite
# Inspect network
docker network inspect changemaker-lite
# Check container connectivity
docker compose exec api ping redis-changemaker
# Recreate network
docker compose down
docker compose up -d
```
**DNS resolution**: Containers use Docker's embedded DNS server (127.0.0.11). Reference services by container name:
- ✅ `redis-changemaker:6379`
- ❌ `localhost:6379` (inside a container, `localhost` is that container itself; it only reaches Redis from the host, and only if the port is published)
---
### Database Migration Failures
**Problem**: `prisma migrate deploy` fails with "relation already exists"
**Solution**:
```bash
# Reset database (⚠️ destroys data)
docker compose exec api npx prisma migrate reset --force
# Or: Fix manually
docker compose exec v2-postgres psql -U changemaker -d changemaker_v2
# Check migration status
docker compose exec api npx prisma migrate status
# Force resolve migration
docker compose exec api npx prisma migrate resolve --applied "20240101000000_init"
```
---
### Container Crashes / Restart Loops
**Problem**: Container repeatedly restarting
**Diagnosis**:
```bash
# Check logs for crash reason
docker compose logs --tail=100 api
# Check exit code
docker inspect changemaker-v2-api | jq '.[0].State'
# Check resource limits
docker stats changemaker-v2-api
```
**Common causes**:
- **Missing env vars**: Check `.env` file for required secrets
- **Health check failing**: Inspect health check logs
- **Out of memory**: Increase Docker memory limit or add resource constraints
- **Port binding failure**: Check for port conflicts
**Fix**:
```bash
# Restart with fresh logs
docker compose up -d --force-recreate api
# Check health
docker compose ps api
```
---
### Monitoring Stack Not Starting
**Problem**: Prometheus/Grafana containers missing
**Cause**: Monitoring services behind `profiles: [monitoring]`
**Solution**:
```bash
# Start with monitoring profile
docker compose --profile monitoring up -d
# Or: Explicitly start monitoring services
docker compose up -d prometheus grafana
```
---
### Media Upload Failures
**Problem**: Video uploads fail with `EACCES` or timeout
**Diagnosis**:
```bash
# Check media-api logs
docker compose logs -f media-api
# Verify inbox permissions
ls -la ./media/local/inbox
# Check disk space
df -h
```
**Solution**:
```bash
# Ensure inbox is writable
chmod 755 ./media/local/inbox
# Verify RW mount in docker-compose.yml
grep "inbox:rw" docker-compose.yml
# Recreate container
docker compose up -d --force-recreate media-api
```
**Important**: Inbox **must** have `:rw` flag; main library stays `:ro`.
---
## Production Deployment
### Resource Limits
**Production recommendations**:
```yaml
# Add to services in docker-compose.yml
deploy:
  resources:
    limits:
      cpus: '2'
      memory: 2G
    reservations:
      cpus: '0.5'
      memory: 512M
```
**Recommended limits**:
- `api`: 2 CPU, 2GB RAM
- `media-api`: 2 CPU, 2GB RAM (for FFprobe)
- `v2-postgres`: 2 CPU, 4GB RAM
- `redis`: 1 CPU, 512MB RAM (already set)
- `listmonk-app`: 1 CPU, 1GB RAM
- `grafana`: 1 CPU, 512MB RAM
---
### Healthcheck Tuning
**Production healthcheck configuration**:
```yaml
healthcheck:
  interval: 30s     # Check every 30s (default: 15s)
  timeout: 10s      # Allow 10s for response (default: 5s)
  retries: 5        # 5 failures before unhealthy (default: 3)
  start_period: 60s # 60s grace period on startup (default: 30s)
```
**Rationale**:
- Longer intervals reduce overhead
- Higher retries prevent false positives
- Longer start periods for slow database migrations
---
### Log Management
**Production logging configuration**:
```yaml
# Add to all services
logging:
driver: "json-file"
options:
max-size: "10m"
max-file: "5"
```
**Alternative**: Use centralized logging (e.g., Loki + Promtail):
```yaml
logging:
  driver: "loki"
  options:
    loki-url: "http://loki:3100/loki/api/v1/push"
```
---
### Restart Policies
**Production restart policies**:
- `restart: always` — For critical services (db, redis, api)
- `restart: unless-stopped` — For most services (respects manual stops)
- `restart: on-failure` — For optional services (monitoring)
**Current configuration**: Most services use `unless-stopped` (allows manual shutdown).
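Applied to this stack, the three policies might look like the following (illustrative fragment):

```yaml
services:
  v2-postgres:
    restart: always          # critical: come back even after a daemon restart
  api:
    restart: unless-stopped  # current default; respects manual stops
  grafana:
    restart: on-failure      # optional monitoring service
```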
---
### Backup Strategy
**Automated backups** (via cron):
```bash
# Add to crontab
0 2 * * * /home/user/changemaker.lite/scripts/backup.sh --s3 >> /var/log/changemaker-backup.log 2>&1
```
**What gets backed up**:
- V2 PostgreSQL database (pg_dump)
- Listmonk PostgreSQL database (pg_dump)
- Uploads directory (tar.gz)
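In outline, such a backup job does something like this (a sketch, not the shipped `scripts/backup.sh`; the `--s3` upload step is omitted):

```shell
# Sketch of a nightly backup: dump both databases and archive uploads
set -eu
STAMP="$(date +%Y%m%d-%H%M%S)"
BACKUP_DIR="backups/${STAMP}"
mkdir -p "${BACKUP_DIR}"
# Dump both PostgreSQL instances (only when the stack is running)
if docker compose ps v2-postgres >/dev/null 2>&1; then
  docker compose exec -T v2-postgres pg_dump -U changemaker changemaker_v2 > "${BACKUP_DIR}/changemaker_v2.sql"
  docker compose exec -T listmonk-db pg_dump -U listmonk listmonk > "${BACKUP_DIR}/listmonk.sql"
fi
# Archive the uploads directory
if [ -d assets/uploads ]; then
  tar -czf "${BACKUP_DIR}/uploads.tar.gz" -C assets uploads
fi
```

Using `exec -T` matters under cron: it disables TTY allocation, which would otherwise corrupt the dump streamed to stdout.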
See [Backup & Restore](backup-restore.md) for complete procedures.
---
### Security Hardening
**Production checklist**:
- [ ] Change all default passwords in `.env`
- [ ] Set strong `REDIS_PASSWORD` (required since Security Audit 2025-02-11)
- [ ] Bind PostgreSQL ports to `127.0.0.1` (not `0.0.0.0`)
- [ ] Enable SSL/TLS via Nginx (see [SSL/TLS](ssl-tls.md))
- [ ] Set `ENCRYPTION_KEY` (must differ from JWT secrets)
- [ ] Disable `EMAIL_TEST_MODE` (use real SMTP)
- [ ] Set `NODE_ENV=production`
- [ ] Review Nginx security headers (CSP, HSTS, Permissions-Policy)
- [ ] Restrict NocoDB to read-only access (revoke INSERT/UPDATE/DELETE)
- [ ] Enable Prometheus scraping authentication (basic auth)
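A quick self-check for the first few items (the variable list is illustrative; extend it from `.env.example`):

```shell
# Flag required secrets that are missing or empty in .env
MISSING=""
for var in REDIS_PASSWORD JWT_ACCESS_SECRET JWT_REFRESH_SECRET ENCRYPTION_KEY NC_ADMIN_PASSWORD; do
  grep -q "^${var}=..*" .env 2>/dev/null || MISSING="${MISSING} ${var}"
done
if [ -z "${MISSING}" ]; then
  echo "OK: all required secrets are set"
else
  echo "MISSING:${MISSING}"
fi
```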
---
## Related Documentation
- **[Environment Variables](environment-variables.md)** — Complete .env reference
- **[Nginx Configuration](nginx.md)** — Reverse proxy setup + subdomain routing
- **[SSL/TLS](ssl-tls.md)** — Certificate management + HTTPS setup
- **[Tunneling](tunneling.md)** — Pangolin tunnel deployment
- **[Monitoring Stack](monitoring-stack.md)** — Prometheus + Grafana configuration
- **[Backup & Restore](backup-restore.md)** — Database backup procedures
- **[Health Checks](healthchecks.md)** — Docker health check configuration
- **[Scaling](scaling.md)** — Horizontal scaling strategies