# Docker Compose Orchestration
## Overview
Changemaker Lite V2 uses Docker Compose to orchestrate 20+ microservices in a single unified stack. This approach simplifies deployment, provides service isolation, and ensures consistent environments across development and production.
**Key Benefits:**
- **Single Configuration File**: All services defined in `docker-compose.yml`
- **Automatic Networking**: All containers communicate via a shared bridge network
- **Health Checks**: Critical services (API, databases, Redis, Nginx, and others) have automated health monitoring
- **Volume Persistence**: Database, uploads, and configuration data persisted across restarts
- **Profile Support**: Optional monitoring stack behind `--profile monitoring` flag
- **Container Dependencies**: Services start in correct order via `depends_on` relationships
**Architecture:**
The V2 stack consolidates all services into a single Docker Compose file, replacing the fragmented V1 approach. Services are organized into logical groups: Core (API, database, admin), Supporting (NocoDB, Listmonk, Gitea), Media (media-api, public-media), and Monitoring (Prometheus, Grafana, exporters).
---
## Service Architecture
```mermaid
graph TB
    subgraph "Core Services"
        NGINX[nginx<br/>:80, :443]
        API[api<br/>Express :4000]
        MEDIA[media-api<br/>Fastify :4100]
        ADMIN[admin<br/>Vite :3000]
        PG[v2-postgres<br/>PostgreSQL 16]
        REDIS[redis<br/>:6379]
    end
    subgraph "Supporting Services"
        NOCODB[nocodb-v2<br/>:8091]
        LISTMONK[listmonk-app<br/>:9000]
        LISTMONK_DB[listmonk-db<br/>PostgreSQL 17]
        MAILHOG[mailhog<br/>:8025]
        GITEA[gitea-app<br/>:3000]
        GITEA_DB[gitea-db<br/>MySQL 8]
        N8N[n8n<br/>:5678]
        MKDOCS[mkdocs<br/>:8000]
        CODE[code-server<br/>:8080]
        HOMEPAGE[homepage<br/>:3000]
        MINIQR[mini-qr<br/>:8080]
    end
    subgraph "Media Services"
        PUBLIC_MEDIA[public-media<br/>:80]
    end
    subgraph "Tunnel Services"
        NEWT[newt<br/>Pangolin connector]
    end
    subgraph "Monitoring Services (profile: monitoring)"
        PROMETHEUS[prometheus<br/>:9090]
        GRAFANA[grafana<br/>:3000]
        CADVISOR[cadvisor<br/>:8080]
        NODE_EXPORTER[node-exporter<br/>:9100]
        REDIS_EXPORTER[redis-exporter<br/>:9121]
        ALERTMANAGER[alertmanager<br/>:9093]
        GOTIFY[gotify<br/>:80]
    end
    NGINX --> API
    NGINX --> MEDIA
    NGINX --> ADMIN
    NGINX --> NOCODB
    NGINX --> LISTMONK
    NGINX --> GITEA
    NGINX --> N8N
    NGINX --> MKDOCS
    NGINX --> CODE
    NGINX --> HOMEPAGE
    NGINX --> MINIQR
    NGINX --> MAILHOG
    NGINX --> PUBLIC_MEDIA
    API --> PG
    API --> REDIS
    MEDIA --> PG
    ADMIN --> API
    ADMIN --> MEDIA
    NOCODB --> PG
    LISTMONK --> LISTMONK_DB
    GITEA --> GITEA_DB
    NEWT --> NGINX
    PROMETHEUS --> API
    PROMETHEUS --> REDIS_EXPORTER
    PROMETHEUS --> CADVISOR
    PROMETHEUS --> NODE_EXPORTER
    GRAFANA --> PROMETHEUS
    ALERTMANAGER --> PROMETHEUS
```
---
## Core Services
### v2-postgres
**Purpose**: PostgreSQL 16 database for V2 platform (main app + NocoDB metadata)
**Configuration**:
```yaml
v2-postgres:
  image: postgres:16-alpine
  container_name: changemaker-v2-postgres
  restart: unless-stopped
  ports:
    - "127.0.0.1:5433:5432" # Localhost only
  environment:
    POSTGRES_USER: ${V2_POSTGRES_USER:-changemaker}
    POSTGRES_PASSWORD: ${V2_POSTGRES_PASSWORD}
    POSTGRES_DB: ${V2_POSTGRES_DB:-changemaker_v2}
  volumes:
    - v2-postgres-data:/var/lib/postgresql/data
    - ./api/prisma/init-nocodb-db.sh:/docker-entrypoint-initdb.d/init-nocodb-db.sh:ro
  healthcheck:
    test: ["CMD-SHELL", "pg_isready -U changemaker"]
    interval: 10s
    timeout: 5s
    retries: 5
```
**Key Features**:
- Alpine image for minimal footprint
- `init-nocodb-db.sh` creates separate `nocodb_meta` database on first startup
- Health check uses `pg_isready` for fast readiness detection
- Port bound to `127.0.0.1` to prevent external access
**Volumes**:
- `v2-postgres-data`: Persistent PostgreSQL data directory
**Dependencies**: None (starts first)
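To confirm the init script created the extra database, list the databases from inside the container. A minimal check, assuming the default `changemaker` user (adjust to your `V2_POSTGRES_USER`):
```bash
# Expect changemaker_v2 and nocodb_meta in the listing
docker compose exec v2-postgres psql -U changemaker -c "\l"
# The init script only runs on an empty data directory; if nocodb_meta is
# missing, create it manually:
docker compose exec v2-postgres psql -U changemaker -c "CREATE DATABASE nocodb_meta;"
```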
---
### redis
**Purpose**: Shared Redis instance for sessions, BullMQ job queues, rate limiting, and geocoding cache
**Configuration**:
```yaml
redis:
  image: redis:7-alpine
  container_name: redis-changemaker
  command: redis-server --appendonly yes --maxmemory 512mb --maxmemory-policy allkeys-lru --requirepass "${REDIS_PASSWORD}"
  ports:
    - "6379:6379"
  volumes:
    - redis-data:/data
  healthcheck:
    test: ["CMD", "redis-cli", "-a", "${REDIS_PASSWORD}", "ping"]
    interval: 10s
    timeout: 5s
    retries: 5
  deploy:
    resources:
      limits:
        cpus: '1'
        memory: 512M
      reservations:
        cpus: '0.25'
        memory: 256M
```
**Key Features**:
- **Authentication required**: `--requirepass` flag enforces password on all connections
- **AOF persistence**: `--appendonly yes` writes every command to disk
- **Memory limits**: 512MB max with LRU eviction policy
- **Resource constraints**: Prevents Redis from consuming excessive host resources
**Volumes**:
- `redis-data`: Persistent AOF log and RDB snapshots
**Security Note**: As of Security Audit 2025-02-11, Redis authentication is **REQUIRED** in production. Set a strong `REDIS_PASSWORD` in `.env`.
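A quick smoke test that authentication is actually enforced (assumes `REDIS_PASSWORD` is set in your host shell, e.g. sourced from `.env`):
```bash
# Unauthenticated ping should be rejected
docker compose exec redis redis-cli ping
# -> NOAUTH Authentication required.
# Authenticated ping should succeed ($REDIS_PASSWORD expands on the host)
docker compose exec redis redis-cli -a "$REDIS_PASSWORD" ping
# -> PONG
```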
---
### api
**Purpose**: Unified Express.js API (TypeScript, Prisma ORM)
**Configuration**:
```yaml
api:
  build:
    context: ./api
    target: development
  container_name: changemaker-v2-api
  restart: unless-stopped
  ports:
    - "${API_PORT:-4000}:4000"
    - "${LISTMONK_PROXY_PORT:-9002}:9002"
  healthcheck:
    test: ["CMD", "wget", "-q", "--spider", "http://localhost:4000/api/health"]
    interval: 15s
    timeout: 5s
    retries: 3
    start_period: 30s
  environment:
    - NODE_ENV=${NODE_ENV:-development}
    - PORT=4000
    - DATABASE_URL=postgresql://${V2_POSTGRES_USER}:${V2_POSTGRES_PASSWORD}@changemaker-v2-postgres:5432/${V2_POSTGRES_DB}
    - REDIS_URL=redis://:${REDIS_PASSWORD}@redis-changemaker:6379
    - JWT_ACCESS_SECRET=${JWT_ACCESS_SECRET}
    - JWT_REFRESH_SECRET=${JWT_REFRESH_SECRET}
    # ... 30+ additional env vars (see .env.example)
  volumes:
    - ./api:/app
    - /app/node_modules
    - ./assets/uploads:/app/uploads
    - ./mkdocs:/mkdocs:rw
    - ./data:/data:ro
    - /var/run/docker.sock:/var/run/docker.sock # For Docker service management
  depends_on:
    v2-postgres:
      condition: service_healthy
    redis:
      condition: service_healthy
```
**Key Features**:
- Waits for PostgreSQL + Redis to be healthy before starting
- Mounts source code for live reloading in development
- Docker socket access for managing MkDocs/Code Server containers
- Health check on `/api/health` endpoint with 30s startup grace period
- Exposes Listmonk proxy on port 9002 (OAuth integration)
**Volumes**:
- `./api:/app`: Live code reloading
- `/app/node_modules`: Prevents host node_modules conflicts
- `./assets/uploads:/app/uploads`: Shared upload directory
- `./mkdocs:/mkdocs:rw`: MkDocs export target
- `./data:/data:ro`: NAR import data (read-only)
- `/var/run/docker.sock`: Docker API access
**Environment Variables**: See [Environment Variables](environment-variables.md) for complete reference.
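To probe the same endpoint the health check uses, and to watch the container's health state during the 30s grace period:
```bash
# Hit the health endpoint from the host (response body shape is defined by the API)
curl -fsS http://localhost:4000/api/health
# Follow the Docker-level health state: starting -> healthy
docker inspect --format '{{.State.Health.Status}}' changemaker-v2-api
```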
---
### media-api
**Purpose**: Fastify microservice for video library management (Drizzle ORM)
**Configuration**:
```yaml
media-api:
  build:
    context: ./api
    dockerfile: Dockerfile.media
    target: development
  container_name: changemaker-media-api
  restart: unless-stopped
  ports:
    - "${MEDIA_API_PORT:-4100}:4100"
  healthcheck:
    test: ["CMD", "wget", "-q", "--spider", "http://127.0.0.1:4100/health"]
    interval: 15s
    timeout: 5s
    retries: 3
    start_period: 30s
  environment:
    - NODE_ENV=${NODE_ENV:-development}
    - MEDIA_API_PORT=4100
    - DATABASE_URL=postgresql://... # Same DB as main API
    - ENABLE_MEDIA_FEATURES=${ENABLE_MEDIA_FEATURES:-true}
    - MAX_UPLOAD_SIZE_GB=${MAX_UPLOAD_SIZE_GB:-10}
  volumes:
    - ./api:/app
    - /app/node_modules
    - ${MEDIA_ROOT:-./media}:/media:ro
    - ${MEDIA_ROOT:-./media}/local/inbox:/media/local/inbox:rw # Upload inbox
  depends_on:
    v2-postgres:
      condition: service_healthy
```
**Key Features**:
- Separate Dockerfile (`Dockerfile.media`) with FFmpeg/FFprobe installed
- Shares PostgreSQL database with main API (different ORM)
- Media library mounted read-only, inbox writable for uploads
- 10GB upload size limit (configurable)
**Volumes**:
- `${MEDIA_ROOT}:/media:ro`: Read-only media library
- `${MEDIA_ROOT}/local/inbox:/media/local/inbox:rw`: **RW mount required** for video uploads
**Important**: The inbox directory **must** have `:rw` flag; main library stays `:ro` for security.
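A quick sanity check of the mount flags from inside the container:
```bash
# Inbox should accept writes
docker compose exec media-api sh -c 'touch /media/local/inbox/.write-test && rm /media/local/inbox/.write-test && echo "inbox OK"'
# Main library should reject them
docker compose exec media-api sh -c 'touch /media/.write-test 2>/dev/null || echo "library is read-only (expected)"'
```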
---
### admin
**Purpose**: React admin GUI (Vite dev server in development, Nginx in production)
**Configuration**:
```yaml
admin:
  build:
    context: ./admin
    target: development
  container_name: changemaker-v2-admin
  restart: unless-stopped
  ports:
    - "${ADMIN_PORT:-3000}:3000"
  healthcheck:
    test: ["CMD", "wget", "-q", "--spider", "http://127.0.0.1:3000/"]
    interval: 30s
    timeout: 5s
    retries: 3
    start_period: 20s
  environment:
    - VITE_API_URL=http://changemaker-v2-api:4000
    - VITE_MEDIA_API_URL=http://changemaker-media-api:4100
    - VITE_MKDOCS_URL=http://mkdocs-changemaker:8000
  volumes:
    - ./admin:/app
    - /app/node_modules
  depends_on:
    - api
```
**Key Features**:
- Vite environment variables use **container hostnames** (not localhost)
- Health check on root path (Vite dev server responds with HTML)
- Live reloading via mounted source code
**Environment Variables**:
- `VITE_API_URL`: Points to API container (not localhost)
- `VITE_MEDIA_API_URL`: Points to media-api container
- `VITE_MKDOCS_URL`: Points to MkDocs container for iframe embed
**Production Build**: Swap `target: development` to `target: production` and serve static files via Nginx.
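To confirm the container-hostname wiring, resolve the API from inside the admin container rather than from the host (assumes the image ships BusyBox `wget`; substitute `curl` if not):
```bash
# localhost here would be the admin container itself, so use the container name
docker compose exec admin wget -qO- http://changemaker-v2-api:4000/api/health
```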
---
### nginx
**Purpose**: Reverse proxy with subdomain routing, SSL termination, and iframe embedding support
**Configuration**:
```yaml
nginx:
  build:
    context: ./nginx
  container_name: changemaker-v2-nginx
  restart: unless-stopped
  ports:
    - "${NGINX_HTTP_PORT:-80}:80"
    - "${NGINX_HTTPS_PORT:-443}:443"
    - "8881:8881" # NocoDB embed proxy
    - "8882:8882" # n8n embed proxy
    - "8883:8883" # Gitea embed proxy
    - "8884:8884" # MailHog embed proxy
    - "8885:8885" # Mini QR embed proxy
  healthcheck:
    test: ["CMD", "sh", "-c", "wget -q --spider http://127.0.0.1:80/ && pgrep crond"]
    interval: 30s
    timeout: 5s
    retries: 3
  environment:
    - PANGOLIN_SITE_ID=${PANGOLIN_SITE_ID:-}
  volumes:
    - ./nginx/conf.d:/etc/nginx/conf.d:ro
    - ./public-web:/usr/share/nginx/public-web:ro
    - ./configs/pangolin:/etc/pangolin:ro
  depends_on:
    - api
    - admin
```
**Key Features**:
- **Subdomain routing**: `api.cmlite.org`, `app.cmlite.org`, `db.cmlite.org`, etc.
- **Embed proxy ports**: 888x ports strip `X-Frame-Options` for iframe embedding
- **Health check**: Validates both HTTP server + cron daemon (for cert renewal)
- **Read-only configs**: Prevents accidental modification
**Configuration Files**:
- `nginx.conf`: Global settings, gzip, security headers
- `conf.d/default.conf`: Localhost fallback + path-based routing
- `conf.d/api.conf`: API subdomain routing (**media endpoints must come before `/api/`**)
- `conf.d/services.conf`: All supporting services + CSP headers
See [Nginx Configuration](nginx.md) for complete routing details.
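A quick way to confirm an embed proxy strips frame-blocking headers (the exact headers present depend on your `conf.d` files):
```bash
# On the 888x ports, X-Frame-Options should be absent
curl -sI http://localhost:8881/ | grep -iE 'x-frame-options|content-security-policy' \
  || echo "no frame-blocking headers (expected on embed ports)"
```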
---
### nocodb-v2
**Purpose**: Read-only database browser for V2 schema
**Configuration**:
```yaml
nocodb-v2:
  image: nocodb/nocodb:latest
  container_name: changemaker-v2-nocodb
  restart: unless-stopped
  ports:
    - "${NOCODB_V2_PORT:-8091}:8080"
  healthcheck:
    test: ["CMD", "wget", "-q", "--spider", "http://localhost:8080/api/v1/health"]
    interval: 30s
    timeout: 5s
    retries: 3
    start_period: 30s
  environment:
    NC_DB: "pg://changemaker-v2-postgres:5432?u=${V2_POSTGRES_USER}&p=${V2_POSTGRES_PASSWORD}&d=nocodb_meta"
    NC_ADMIN_EMAIL: ${NC_ADMIN_EMAIL:-admin@cmlite.org}
    NC_ADMIN_PASSWORD: ${NC_ADMIN_PASSWORD}
    NC_PUBLIC_URL: ${NC_PUBLIC_URL:-http://localhost:8091}
  volumes:
    - nocodb-v2-data:/usr/app/data
  depends_on:
    v2-postgres:
      condition: service_healthy
```
**Key Features**:
- Uses separate `nocodb_meta` database (auto-created by `init-nocodb-db.sh`)
- Health check via NocoDB API endpoint
- Read-only access recommended (grant SELECT only in production)
**Volumes**:
- `nocodb-v2-data`: NocoDB's internal file storage
**Access**: http://localhost:8091 or http://db.cmlite.org (via subdomain routing)
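For the read-only access recommended above, a minimal sketch of the grants, assuming a dedicated `nocodb_readonly` role (the role name and password are illustrative):
```bash
docker compose exec -T v2-postgres psql -U changemaker -d changemaker_v2 <<'SQL'
-- Create a read-only role and grant SELECT only
CREATE ROLE nocodb_readonly LOGIN PASSWORD 'change-me';
GRANT CONNECT ON DATABASE changemaker_v2 TO nocodb_readonly;
GRANT USAGE ON SCHEMA public TO nocodb_readonly;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO nocodb_readonly;
-- Cover tables created by future migrations too
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO nocodb_readonly;
SQL
```
Point NocoDB's data-source connection at this role instead of the main `changemaker` user.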
---
## Supporting Services
### listmonk-app
**Purpose**: Email marketing platform for newsletters (V2 syncs subscribers via REST API)
**Configuration**:
```yaml
listmonk-app:
  image: listmonk/listmonk:latest
  container_name: listmonk-app
  restart: unless-stopped
  ports:
    - "${LISTMONK_PORT:-9001}:9000"
  healthcheck:
    test: ["CMD", "wget", "-q", "--spider", "http://localhost:9000/"]
    interval: 30s
    timeout: 5s
    retries: 3
    start_period: 30s
  depends_on:
    - listmonk-db
  command: [sh, -c, "./listmonk --install --idempotent --yes --config '' && ./listmonk --upgrade --yes --config '' && ./listmonk --config ''"]
  environment:
    LISTMONK_app__address: 0.0.0.0:9000
    LISTMONK_db__host: listmonk-db
    LISTMONK_db__user: ${LISTMONK_DB_USER:-listmonk}
    LISTMONK_db__password: ${LISTMONK_DB_PASSWORD}
    LISTMONK_ADMIN_USER: ${LISTMONK_WEB_ADMIN_USER:-admin}
    LISTMONK_ADMIN_PASSWORD: ${LISTMONK_WEB_ADMIN_PASSWORD}
  volumes:
    - ./assets/uploads:/listmonk/uploads:rw
```
**Key Features**:
- **Idempotent init**: `--install --idempotent` runs migrations on every start (safe)
- **Auto-upgrade**: `--upgrade --yes` applies schema upgrades
- **Shared uploads**: Uses same upload directory as main API
**Database**: Uses separate PostgreSQL 17 instance (`listmonk-db`)
**API Integration**: V2 API syncs participants/locations to Listmonk lists via REST API (opt-in via `LISTMONK_SYNC_ENABLED=true`)
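A quick smoke test of the credentials that `listmonk-init` provisions, assuming BasicAuth with the API user and token and the default host port 9001 (with `LISTMONK_API_TOKEN` exported in your shell):
```bash
# Should return the JSON list of Listmonk lists if the API user works
curl -s -u "v2-api:$LISTMONK_API_TOKEN" http://localhost:9001/api/lists
```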
---
### listmonk-db
**Purpose**: PostgreSQL 17 database for Listmonk
**Configuration**:
```yaml
listmonk-db:
  image: postgres:17-alpine
  container_name: listmonk-db
  restart: unless-stopped
  ports:
    - "127.0.0.1:5432:5432" # Localhost only
  environment:
    POSTGRES_USER: ${LISTMONK_DB_USER:-listmonk}
    POSTGRES_PASSWORD: ${LISTMONK_DB_PASSWORD}
    POSTGRES_DB: ${LISTMONK_DB_NAME:-listmonk}
  healthcheck:
    test: ["CMD-SHELL", "pg_isready -U listmonk"]
    interval: 10s
    timeout: 5s
    retries: 6
  volumes:
    - listmonk-data:/var/lib/postgresql/data
```
**Key Features**:
- Separate PostgreSQL instance (not shared with V2 database)
- Port bound to `127.0.0.1` for security
**Volumes**:
- `listmonk-data`: Persistent Listmonk database
---
### listmonk-init
**Purpose**: One-shot container to create Listmonk API user for V2 integration
**Configuration**:
```yaml
listmonk-init:
  image: postgres:17-alpine
  container_name: listmonk-init
  depends_on:
    listmonk-app:
      condition: service_started
  restart: "no" # Runs once and exits
  environment:
    PGPASSWORD: ${LISTMONK_DB_PASSWORD}
    LISTMONK_API_USER: ${LISTMONK_API_USER:-v2-api}
    LISTMONK_API_TOKEN: ${LISTMONK_API_TOKEN}
  entrypoint: ["/bin/sh", "-c"]
  command:
    - |
      # Wait for Listmonk to create tables
      for i in $(seq 1 30); do
        if psql -h listmonk-db -U listmonk -d listmonk -c "SELECT 1 FROM users LIMIT 1" >/dev/null 2>&1; then
          break
        fi
        sleep 2
      done
      # Upsert API user
      psql -h listmonk-db -U listmonk -d listmonk -q <<SQL
      INSERT INTO users (username, password, password_login, email, name, type, user_role_id, status)
      VALUES ('$LISTMONK_API_USER', '$LISTMONK_API_TOKEN', true, '$LISTMONK_API_USER@api.internal', '$LISTMONK_API_USER', 'api', 1, 'enabled')
      ON CONFLICT (username) DO UPDATE SET password = EXCLUDED.password, status = 'enabled';
      SQL
```
**Key Features**:
- **Idempotent**: Safe to run multiple times (upserts API user)
- **Auto-configuration**: Also configures SMTP providers (MailHog + production)
- **Exit on completion**: `restart: "no"` prevents restart after success
**Important**: Listmonk API users store tokens as **plaintext** (not bcrypt), so direct SQL upsert works.
---
### gitea-app
**Purpose**: Self-hosted Git repository hosting
**Configuration**:
```yaml
gitea-app:
  image: gitea/gitea:1.23.7
  container_name: gitea-changemaker
  healthcheck:
    test: ["CMD", "curl", "-f", "http://localhost:3000/"]
    interval: 30s
    timeout: 5s
    retries: 3
    start_period: 30s
  environment:
    - GITEA__database__DB_TYPE=mysql
    - GITEA__database__HOST=gitea-db:3306
    - GITEA__server__ROOT_URL=${GITEA_ROOT_URL}
    - GITEA__server__X_FRAME_OPTIONS= # Allow iframe embedding
    - GITEA__server__LFS_MAX_FILE_SIZE=1024 # 1GB LFS
  ports:
    - "${GITEA_WEB_PORT:-3030}:3000"
    - "${GITEA_SSH_PORT:-2222}:22"
  volumes:
    - gitea-data:/data
  depends_on:
    - gitea-db
```
**Key Features**:
- **MySQL backend**: Uses separate MySQL 8 container
- **LFS support**: 1GB max file size for large binaries
- **SSH access**: Port 2222 for Git push/pull
- **Iframe embedding**: `X_FRAME_OPTIONS` disabled for admin iframe
**Health Check**: Uses `curl` rather than `wget` (the Gitea image ships `curl`)
**Volumes**:
- `gitea-data`: Git repositories + attachments
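Cloning over the mapped SSH port looks like this (the org/repo path is illustrative):
```bash
# Explicit SSH URL with the non-standard port
git clone ssh://git@localhost:2222/myorg/myrepo.git
# Or add a host alias in ~/.ssh/config to avoid typing the port:
#   Host gitea
#     HostName localhost
#     Port 2222
#     User git
```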
---
### n8n
**Purpose**: Workflow automation platform
**Configuration**:
```yaml
n8n:
  image: docker.n8n.io/n8nio/n8n
  container_name: n8n-changemaker
  restart: unless-stopped
  ports:
    - "${N8N_PORT:-5678}:5678"
  healthcheck:
    test: ["CMD", "wget", "-q", "--spider", "http://localhost:5678/healthz"]
    interval: 30s
    timeout: 5s
    retries: 3
    start_period: 30s
  environment:
    - N8N_HOST=${N8N_HOST:-n8n.cmlite.org}
    - N8N_PROTOCOL=https
    - WEBHOOK_URL=https://${N8N_HOST}/
    - N8N_ENCRYPTION_KEY=${N8N_ENCRYPTION_KEY}
    - N8N_DEFAULT_USER_EMAIL=${N8N_USER_EMAIL}
    - N8N_DEFAULT_USER_PASSWORD=${N8N_USER_PASSWORD}
  volumes:
    - n8n-data:/home/node/.n8n
    - ./local-files:/files
```
**Key Features**:
- **HTTPS required**: `N8N_PROTOCOL=https` for webhook security
- **User management**: Creates default admin user on first start
- **File access**: `/files` directory for workflow file operations
**Health Check**: `/healthz` endpoint (Alpine image uses `wget`)
**Volumes**:
- `n8n-data`: Workflow definitions + credentials
- `./local-files:/files`: Shared file directory for workflows
---
### mkdocs
**Purpose**: Live documentation preview server (Material theme)
**Configuration**:
```yaml
mkdocs:
  image: squidfunk/mkdocs-material
  container_name: mkdocs-changemaker
  volumes:
    - ./mkdocs:/docs:rw
    - ./assets/images:/docs/assets/images:rw
  user: "${USER_ID:-1000}:${GROUP_ID:-1000}"
  ports:
    - "${MKDOCS_PORT:-4003}:8000"
  environment:
    - SITE_URL=${BASE_DOMAIN:-https://cmlite.org}
  command: serve --dev-addr=0.0.0.0:8000 --watch-theme --livereload
  restart: unless-stopped
```
**Key Features**:
- **Live reloading**: `--livereload` watches for file changes
- **User mapping**: Runs as host user to prevent permission issues
- **Port 4003**: Changed from 4000 (conflicted with API in V1)
**Volumes**:
- `./mkdocs:/docs:rw`: Documentation source (writable for MkDocs export)
- `./assets/images:/docs/assets/images:rw`: Shared image directory
**Access**: http://localhost:4003 or http://docs.cmlite.org (via subdomain routing)
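For a one-off static build instead of the live server (assuming the image's entrypoint is the `mkdocs` CLI, as in `squidfunk/mkdocs-material`, so the args replace the `serve` command above):
```bash
# Build the site into ./mkdocs/site on the host via the bind mount
docker compose run --rm mkdocs build
```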
---
### code-server
**Purpose**: VS Code in the browser for documentation editing
**Configuration**:
```yaml
code-server:
  build:
    context: .
    dockerfile: Dockerfile.code-server
  container_name: code-server-changemaker
  command: /home/coder/project
  environment:
    - DOCKER_USER=${USER_NAME:-coder}
  user: "${USER_ID:-1000}:${GROUP_ID:-1000}"
  volumes:
    - ./configs/code-server/.config:/home/coder/.config
    - ./configs/code-server/.local:/home/coder/.local
    - .:/home/coder/project
  ports:
    - "${CODE_SERVER_PORT:-8888}:8080"
  restart: unless-stopped
```
**Key Features**:
- **User mapping**: Runs as host user (prevents permission conflicts)
- **Project mount**: Entire repository mounted at `/home/coder/project`
- **Persistent config**: `.config` and `.local` directories preserved
**Access**: http://localhost:8888 or http://code.cmlite.org (via subdomain routing)
---
### mailhog
**Purpose**: Email capture for development/testing
**Configuration**:
```yaml
mailhog:
  image: mailhog/mailhog:latest
  container_name: mailhog-changemaker
  ports:
    - "${MAILHOG_WEB_PORT:-8025}:8025"
    # SMTP port 1025 only exposed on Docker network
  restart: unless-stopped
  logging:
    driver: "json-file"
    options:
      max-size: "5m"
      max-file: "2"
```
**Key Features**:
- **SMTP on port 1025**: Accessible only from Docker network (not exposed to host)
- **Web UI on port 8025**: View captured emails
- **Log rotation**: 5MB max size, 2 files
**Usage**: Set `EMAIL_TEST_MODE=true` in `.env` to route all emails to MailHog
**Access**: http://localhost:8025 or http://mail.cmlite.org (via subdomain routing)
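MailHog also exposes a JSON API next to the web UI, which is handy for asserting in tests that a message actually arrived (assumes `jq` on the host):
```bash
# Count captured messages
curl -s http://localhost:8025/api/v2/messages | jq '.total'
```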
---
### mini-qr
**Purpose**: QR code generation service (used by walk sheets + cut exports)
**Configuration**:
```yaml
mini-qr:
  image: ghcr.io/lyqht/mini-qr:latest
  container_name: mini-qr
  ports:
    - "${MINI_QR_PORT:-8089}:8080"
  restart: unless-stopped
```
**Key Features**:
- **Stateless**: No volumes or persistent data
- **Lightweight**: Alpine-based image
**API Integration**: V2 API has dedicated `/api/qr` routes for direct PNG generation; mini-qr used for admin iframe
**Access**: http://localhost:8089 or http://qr.cmlite.org (via subdomain routing)
---
### homepage
**Purpose**: Service dashboard with container status
**Configuration**:
```yaml
homepage:
  image: ghcr.io/gethomepage/homepage:latest
  container_name: homepage-changemaker
  ports:
    - "${HOMEPAGE_PORT:-3010}:3000"
  volumes:
    - ./configs/homepage:/app/config
    - ./assets/icons:/app/public/icons
    - ./assets/images:/app/public/images
    - /var/run/docker.sock:/var/run/docker.sock
  environment:
    - PUID=${USER_ID:-1000}
    - PGID=${DOCKER_GROUP_ID:-984}
    - HOMEPAGE_ALLOWED_HOSTS=*
  restart: unless-stopped
```
**Key Features**:
- **Docker socket access**: Reads container status
- **User mapping**: Runs as host user with Docker group
- **Custom dashboard**: Configure in `configs/homepage/`
**Access**: http://localhost:3010 or http://home.cmlite.org (via subdomain routing)
---
## Media Services
### public-media
**Purpose**: Public video gallery frontend (React production build)
**Configuration**:
```yaml
public-media:
  build:
    context: ./public-media
  container_name: changemaker-public-media
  restart: unless-stopped
  ports:
    - "${PUBLIC_MEDIA_PORT:-3100}:80"
  healthcheck:
    test: ["CMD", "wget", "-q", "--spider", "http://127.0.0.1:80/"]
    interval: 30s
    timeout: 5s
    retries: 3
    start_period: 10s
  depends_on:
    - api
    - media-api
```
**Key Features**:
- **Static build**: React app served by Nginx (not Vite dev server)
- **Fast startup**: 10s start period (static files load quickly)
**Access**: http://localhost:3100 or `/gallery/` path via main Nginx
---
## Tunnel Services
### newt
**Purpose**: Pangolin tunnel connector (replaces Cloudflare Tunnel)
**Configuration**:
```yaml
newt:
  image: fosrl/newt
  container_name: newt-changemaker
  restart: unless-stopped
  environment:
    - PANGOLIN_ENDPOINT=${PANGOLIN_ENDPOINT}
    - NEWT_ID=${PANGOLIN_NEWT_ID}
    - NEWT_SECRET=${PANGOLIN_NEWT_SECRET}
  depends_on:
    - nginx
```
**Key Features**:
- **Self-hosted**: Connects to Pangolin server at `api.bnkserve.org`
- **Nginx dependency**: All traffic routes through nginx:80
- **Auto-reconnect**: `restart: unless-stopped` handles connection drops
**Setup**: Use admin PangolinPage.tsx wizard to configure org → site → endpoint → resource
See [Tunneling](tunneling.md) for complete setup guide.
---
## Monitoring Services (profile: monitoring)
### prometheus
**Purpose**: Metrics collection and alerting
**Configuration**:
```yaml
prometheus:
  image: prom/prometheus:latest
  container_name: prometheus-changemaker
  command:
    - '--config.file=/etc/prometheus/prometheus.yml'
    - '--storage.tsdb.path=/prometheus'
    - '--storage.tsdb.retention.time=30d'
  ports:
    - "${PROMETHEUS_PORT:-9090}:9090"
  volumes:
    - ./configs/prometheus:/etc/prometheus
    - prometheus-data:/prometheus
  restart: always
  profiles:
    - monitoring
```
**Key Features**:
- **30-day retention**: `--storage.tsdb.retention.time=30d`
- **Custom metrics**: 12 `cm_*` metrics from API
- **Alert rules**: `alerts.yml` defines 12+ alert conditions
**Scrape Targets**:
- `changemaker-v2-api:4000/api/metrics` (10s interval)
- `redis-exporter:9121` (15s interval)
- `cadvisor:8080` (15s interval)
- `node-exporter:9100` (15s interval)
**Access**: http://localhost:9090
See [Monitoring Stack](monitoring-stack.md) for complete configuration.
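Quick checks against the Prometheus HTTP API (the `jq` filters are illustrative):
```bash
# Summarize scrape targets and their health
curl -s http://localhost:9090/api/v1/targets \
  | jq '.data.activeTargets[] | {job: .labels.job, health: .health}'
# Run an instant query; swap `up` for any of the cm_* metrics
curl -s 'http://localhost:9090/api/v1/query?query=up' | jq '.data.result'
```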
---
### grafana
**Purpose**: Metrics visualization
**Configuration**:
```yaml
grafana:
  image: grafana/grafana:latest
  container_name: grafana-changemaker
  ports:
    - "${GRAFANA_PORT:-3001}:3000"
  environment:
    - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_ADMIN_PASSWORD:-admin}
    - GF_USERS_ALLOW_SIGN_UP=false
    - GF_SECURITY_ALLOW_EMBEDDING=true # For admin iframe
  volumes:
    - grafana-data:/var/lib/grafana
    - ./configs/grafana:/etc/grafana/provisioning
  restart: always
  depends_on:
    - prometheus
  profiles:
    - monitoring
```
**Key Features**:
- **Auto-provisioning**: Dashboards from `configs/grafana/` auto-load on startup
- **3 dashboards**: Application Overview, API Performance, System Health
- **Prometheus datasource**: Auto-configured via `datasources.yml`
**Access**: http://localhost:3001 (admin/admin default)
---
### cadvisor
**Purpose**: Container resource metrics
**Configuration**:
```yaml
cadvisor:
  image: gcr.io/cadvisor/cadvisor:latest
  container_name: cadvisor-changemaker
  ports:
    - "${CADVISOR_PORT:-8080}:8080"
  volumes:
    - /:/rootfs:ro
    - /var/run:/var/run:ro
    - /sys:/sys:ro
    - /var/lib/docker/:/var/lib/docker:ro
    - /dev/disk/:/dev/disk:ro
  privileged: true
  devices:
    - /dev/kmsg
  restart: always
  profiles:
    - monitoring
```
**Key Features**:
- **Privileged mode**: Required for full system access
- **Host filesystem**: Read-only mounts for metrics collection
**Access**: http://localhost:8080
---
### node-exporter
**Purpose**: Host system metrics (CPU, memory, disk, network)
**Configuration**:
```yaml
node-exporter:
  image: prom/node-exporter:latest
  container_name: node-exporter-changemaker
  ports:
    - "${NODE_EXPORTER_PORT:-9100}:9100"
  command:
    - '--path.rootfs=/host'
    - '--path.procfs=/host/proc'
    - '--path.sysfs=/host/sys'
    - '--collector.filesystem.mount-points-exclude=^/(sys|proc|dev|host|etc)($$|/)'
  volumes:
    - /proc:/host/proc:ro
    - /sys:/host/sys:ro
    - /:/rootfs:ro
  restart: always
  profiles:
    - monitoring
```
**Key Features**:
- **Host metrics**: CPU, memory, disk, network from host (not container)
- **Filesystem filters**: Excludes virtual filesystems
**Access**: http://localhost:9100/metrics
---
### redis-exporter
**Purpose**: Redis metrics (memory, commands, connections)
**Configuration**:
```yaml
redis-exporter:
  image: oliver006/redis_exporter:latest
  container_name: redis-exporter-changemaker
  ports:
    - "${REDIS_EXPORTER_PORT:-9121}:9121"
  environment:
    - REDIS_ADDR=redis:6379
    - REDIS_PASSWORD=${REDIS_PASSWORD} # Required for authenticated Redis
  restart: always
  depends_on:
    - redis
  profiles:
    - monitoring
```
**Key Features**:
- **Authenticated connection**: Uses `REDIS_PASSWORD` env var
- **Memory metrics**: Tracks Redis memory usage
**Access**: http://localhost:9121/metrics
---
### alertmanager
**Purpose**: Alert routing and notification
**Configuration**:
```yaml
alertmanager:
  image: prom/alertmanager:latest
  container_name: alertmanager-changemaker
  ports:
    - "${ALERTMANAGER_PORT:-9093}:9093"
  volumes:
    - ./configs/alertmanager:/etc/alertmanager
    - alertmanager-data:/alertmanager
  command:
    - '--config.file=/etc/alertmanager/alertmanager.yml'
    - '--storage.path=/alertmanager'
  restart: always
  profiles:
    - monitoring
```
**Key Features**:
- **Alert grouping**: Prevents notification spam
- **Multiple receivers**: Email, Slack, webhook, Gotify
**Configuration**: Edit `configs/alertmanager/alertmanager.yml`
**Access**: http://localhost:9093
---
### gotify
**Purpose**: Push notification server (optional alert receiver)
**Configuration**:
```yaml
gotify:
  image: gotify/server:latest
  container_name: gotify-changemaker
  ports:
    - "${GOTIFY_PORT:-8889}:80"
  environment:
    - GOTIFY_DEFAULTUSER_NAME=${GOTIFY_ADMIN_USER:-admin}
    - GOTIFY_DEFAULTUSER_PASS=${GOTIFY_ADMIN_PASSWORD:-admin}
  volumes:
    - gotify-data:/app/data
  restart: always
  profiles:
    - monitoring
```
**Key Features**:
- **Push notifications**: Mobile app support (iOS/Android)
- **Webhook receiver**: Integrates with Alertmanager
**Access**: http://localhost:8889
---
## Networks & Volumes
### Networks
**changemaker-lite**: Bridge network shared by all services
```yaml
networks:
  changemaker-lite:
    driver: bridge
```
**Features**:
- **Automatic DNS**: Containers resolve each other by name (e.g., `changemaker-v2-api:4000`)
- **Isolation**: No external network access unless ports explicitly exposed
- **Service discovery**: Docker's internal DNS server (127.0.0.11)
---
### Volumes
**Named volumes** (Docker-managed, persistent across container recreation):
| Volume | Purpose | Size Estimate |
|--------|---------|---------------|
| `v2-postgres-data` | V2 PostgreSQL database | 1-10GB (depends on data) |
| `nocodb-v2-data` | NocoDB metadata + uploads | 100MB-1GB |
| `redis-data` | Redis AOF log + RDB snapshots | 50-500MB |
| `listmonk-data` | Listmonk PostgreSQL database | 100MB-5GB |
| `n8n-data` | n8n workflows + credentials | 10-100MB |
| `gitea-data` | Git repositories + attachments | 1-50GB |
| `mysql-data` | Gitea MySQL database | 100MB-2GB |
| `prometheus-data` | Prometheus TSDB (30 days) | 1-5GB |
| `grafana-data` | Grafana dashboards + config | 10-100MB |
| `alertmanager-data` | Alert state + silences | 1-10MB |
| `gotify-data` | Gotify messages + apps | 10-100MB |
**Bind mounts** (host directories):
| Bind Mount | Container Path | Purpose | Permissions |
|------------|----------------|---------|-------------|
| `./api` | `/app` | API source code | rw |
| `./admin` | `/app` | Admin source code | rw |
| `./assets/uploads` | `/app/uploads`, `/listmonk/uploads` | Shared uploads | rw |
| `./mkdocs` | `/docs`, `/mkdocs` | Documentation source | rw |
| `./data` | `/data` | NAR import data | ro |
| `./nginx/conf.d` | `/etc/nginx/conf.d` | Nginx config | ro |
| `./configs/prometheus` | `/etc/prometheus` | Prometheus config | ro |
| `./configs/grafana` | `/etc/grafana/provisioning` | Grafana config | ro |
| `/var/run/docker.sock` | `/var/run/docker.sock` | Docker API | rw |
**Important**: Media library requires special mount:
```yaml
- ${MEDIA_ROOT:-./media}:/media:ro # Main library (read-only)
- ${MEDIA_ROOT:-./media}/local/inbox:/media/local/inbox:rw # Upload inbox (writable)
```
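Named volumes survive container recreation but not `docker compose down -v`. A sketch for archiving one with a throwaway container before destructive operations:
```bash
# Copy the volume contents into a tarball on the host (./backups must exist)
docker run --rm \
  -v v2-postgres-data:/from:ro \
  -v "$PWD/backups":/to \
  alpine tar czf /to/v2-postgres-data.tgz -C /from .
```
For databases, prefer `pg_dump` (see [Backup & Restore](backup-restore.md)); a raw volume copy of a running PostgreSQL can be inconsistent.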
---
## Starting Services
### Basic Commands
**Start all core services**:
```bash
docker compose up -d
```
**Start with monitoring stack**:
```bash
docker compose --profile monitoring up -d
```
**Start specific service**:
```bash
docker compose up -d api
```
**Start with rebuild**:
```bash
docker compose up -d --build api admin
```
**Stop all services**:
```bash
docker compose down
```
**Stop and remove volumes** (⚠️ **destroys all data**):
```bash
docker compose down -v
```
---
### Development Workflow
**1. Initial setup** (first time only):
```bash
# Start core services
docker compose up -d v2-postgres redis api admin
# Wait for API to be healthy
docker compose ps api # Check status
# Run migrations
docker compose exec api npx prisma migrate deploy
# Seed database
docker compose exec api npx prisma db seed
```
**2. Daily development**:
```bash
# Start services
docker compose up -d v2-postgres redis api admin
# View logs (live tail)
docker compose logs -f api
# Restart single service
docker compose restart api
# Check health status
docker compose ps
```
**3. Full stack with monitoring**:
```bash
# Start everything
docker compose --profile monitoring up -d
# Check Prometheus targets
curl http://localhost:9090/api/v1/targets
# View Grafana dashboards
open http://localhost:3001
```
---
### Log Management
**View logs**:
```bash
# All services (last 50 lines)
docker compose logs --tail=50
# Specific service (live tail)
docker compose logs -f api
# Multiple services
docker compose logs -f api media-api
# With timestamps
docker compose logs -f --timestamps api
# Since timestamp
docker compose logs --since 2024-01-01T00:00:00 api
```
**Log rotation**: Configured in `docker-compose.yml` for Redis + MailHog:
```yaml
logging:
  driver: "json-file"
  options:
    max-size: "5m"
    max-file: "2"
```
---
### Health Checks
**Check service health**:
```bash
# All services (shows health status)
docker compose ps
# Filter unhealthy services
docker compose ps | grep unhealthy
# Inspect health check details
docker inspect changemaker-v2-api | jq '.[0].State.Health'
```
**Key services with health checks** (nginx, nocodb-v2, listmonk-app, listmonk-db, and public-media also define checks; see their configs above):
- `api`: `wget http://localhost:4000/api/health` (30s start period)
- `media-api`: `wget http://127.0.0.1:4100/health` (30s start period)
- `admin`: `wget http://127.0.0.1:3000/` (20s start period)
- `v2-postgres`: `pg_isready -U changemaker` (5 retries)
- `redis`: `redis-cli -a ${REDIS_PASSWORD} ping` (5 retries)
- `gitea-app`: `curl http://localhost:3000/` (30s start period)
- `n8n`: `wget http://localhost:5678/healthz` (30s start period)
**Dependency chains** (via `depends_on` with `condition: service_healthy`):
- `api` waits for `v2-postgres` + `redis`
- `media-api` waits for `v2-postgres`
- `nocodb-v2` waits for `v2-postgres`
See [Health Checks](healthchecks.md) for detailed configuration.
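For scripting, a small wait loop against the Docker health state can gate follow-up steps such as migrations (using the API container name from above):
```bash
# Block until the API reports healthy
until [ "$(docker inspect --format '{{.State.Health.Status}}' changemaker-v2-api)" = "healthy" ]; do
  echo "waiting for api..."
  sleep 5
done
```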
---
## Troubleshooting
### Port Conflicts
**Problem**: `Error: bind: address already in use`
**Solution**:
```bash
# Find process using port
sudo lsof -i :4000
sudo netstat -tulpn | grep :4000
# Change port in .env
echo "API_PORT=4002" >> .env
# Restart service
docker compose up -d api
```
**Common conflicts**:
- Port 3000: Homepage, Grafana, admin (set `ADMIN_PORT=3005`)
- Port 4000: API, MkDocs v1 (set `MKDOCS_PORT=4003`)
- Port 5432: Listmonk DB, system PostgreSQL (bind to 127.0.0.1 in compose file)
---
### Volume Permission Issues
**Problem**: `EACCES: permission denied` or `mkdir: cannot create directory`
**Cause**: Container user mismatch with host filesystem
**Solution**:
```bash
# Fix ownership (run on host)
sudo chown -R $USER:$USER ./api ./admin ./mkdocs ./assets
# Set USER_ID/GROUP_ID in .env
id -u # Get your UID
id -g # Get your GID
echo "USER_ID=$(id -u)" >> .env
echo "GROUP_ID=$(id -g)" >> .env
# Recreate containers
docker compose up -d --force-recreate
```
**Services using user mapping**:
- `mkdocs`: `user: "${USER_ID}:${GROUP_ID}"`
- `code-server`: `user: "${USER_ID}:${GROUP_ID}"`
- `homepage`: `PUID=${USER_ID}, PGID=${DOCKER_GROUP_ID}`
---
### Network Issues
**Problem**: Containers can't communicate (e.g., API can't reach Redis)
**Solution**:
```bash
# Verify network exists
docker network ls | grep changemaker-lite
# Inspect network
docker network inspect changemaker-lite
# Check container connectivity
docker compose exec api ping redis-changemaker
# Recreate network
docker compose down
docker compose up -d
```
**DNS resolution**: Containers use Docker's internal DNS (127.0.0.11). Reference services by container name:
- ✅ `redis-changemaker:6379` (container name, resolves via Docker DNS)
- ❌ `localhost:6379` (inside a container, `localhost` is the container itself; this only works from the host, and only if the port is exposed)
---
### Database Migration Failures
**Problem**: `prisma migrate deploy` fails with "relation already exists"
**Solution**:
```bash
# Reset database (⚠️ destroys data)
docker compose exec api npx prisma migrate reset --force
# Or: Fix manually
docker compose exec v2-postgres psql -U changemaker -d changemaker_v2
# Check migration status
docker compose exec api npx prisma migrate status
# Force resolve migration
docker compose exec api npx prisma migrate resolve --applied "20240101000000_init"
```
---
### Container Crashes / Restart Loops
**Problem**: Container repeatedly restarting
**Diagnosis**:
```bash
# Check logs for crash reason
docker compose logs --tail=100 api
# Check exit code
docker inspect changemaker-v2-api | jq '.[0].State'
# Check resource limits
docker stats changemaker-v2-api
```
**Common causes**:
- **Missing env vars**: Check `.env` file for required secrets
- **Health check failing**: Inspect health check logs
- **Out of memory**: Increase Docker memory limit or add resource constraints
- **Port binding failure**: Check for port conflicts
**Fix**:
```bash
# Restart with fresh logs
docker compose up -d --force-recreate api
# Check health
docker compose ps api
```
---
### Monitoring Stack Not Starting
**Problem**: Prometheus/Grafana containers missing
**Cause**: Monitoring services behind `profiles: [monitoring]`
**Solution**:
```bash
# Start with monitoring profile
docker compose --profile monitoring up -d
# Or: Explicitly start monitoring services
docker compose up -d prometheus grafana
```
---
### Media Upload Failures
**Problem**: Video uploads fail with `EACCES` or timeout
**Diagnosis**:
```bash
# Check media-api logs
docker compose logs -f media-api
# Verify inbox permissions
ls -la ./media/local/inbox
# Check disk space
df -h
```
**Solution**:
```bash
# Ensure inbox is writable
chmod 755 ./media/local/inbox
# Verify RW mount in docker-compose.yml
grep "inbox:rw" docker-compose.yml
# Recreate container
docker compose up -d --force-recreate media-api
```
**Important**: Inbox **must** have `:rw` flag; main library stays `:ro`.
---
## Production Deployment
### Resource Limits
**Production recommendations**:
```yaml
# Add to services in docker-compose.yml
deploy:
  resources:
    limits:
      cpus: '2'
      memory: 2G
    reservations:
      cpus: '0.5'
      memory: 512M
```
**Recommended limits**:
- `api`: 2 CPU, 2GB RAM
- `media-api`: 2 CPU, 2GB RAM (for FFprobe)
- `v2-postgres`: 2 CPU, 4GB RAM
- `redis`: 1 CPU, 512MB RAM (already set)
- `listmonk-app`: 1 CPU, 1GB RAM
- `grafana`: 1 CPU, 512MB RAM
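Before pinning limits, compare them against observed usage:
```bash
# One-shot snapshot of per-container CPU and memory
docker stats --no-stream --format 'table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}'
```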
---
### Healthcheck Tuning
**Production healthcheck configuration**:
```yaml
healthcheck:
  interval: 30s     # Check every 30s (default: 15s)
  timeout: 10s      # Allow 10s for response (default: 5s)
  retries: 5        # 5 failures before unhealthy (default: 3)
  start_period: 60s # 60s grace period on startup (default: 30s)
```
**Rationale**:
- Longer intervals reduce overhead
- Higher retries prevent false positives
- Longer start periods for slow database migrations
---
### Log Management
**Production logging configuration**:
```yaml
# Add to all services
logging:
  driver: "json-file"
  options:
    max-size: "10m"
    max-file: "5"
```
**Alternative**: Use centralized logging (e.g., Loki + Promtail):
```yaml
logging:
  driver: "loki"
  options:
    loki-url: "http://loki:3100/loki/api/v1/push"
```
---
### Restart Policies
**Production restart policies**:
- `restart: always` — For critical services (db, redis, api)
- `restart: unless-stopped` — For most services (respects manual stops)
- `restart: on-failure` — For optional services (monitoring)
**Current configuration**: Most services use `unless-stopped` (allows manual shutdown).
---
### Backup Strategy
**Automated backups** (via cron):
```bash
# Add to crontab
0 2 * * * /home/user/changemaker.lite/scripts/backup.sh --s3 >> /var/log/changemaker-backup.log 2>&1
```
**What gets backed up**:
- V2 PostgreSQL database (pg_dump)
- Listmonk PostgreSQL database (pg_dump)
- Uploads directory (tar.gz)
See [Backup & Restore](backup-restore.md) for complete procedures.
---
### Security Hardening
**Production checklist**:
- [ ] Change all default passwords in `.env`
- [ ] Set strong `REDIS_PASSWORD` (required since Security Audit 2025-02-11)
- [ ] Bind PostgreSQL ports to `127.0.0.1` (not `0.0.0.0`)
- [ ] Enable SSL/TLS via Nginx (see [SSL/TLS](ssl-tls.md))
- [ ] Set `ENCRYPTION_KEY` (must differ from JWT secrets)
- [ ] Disable `EMAIL_TEST_MODE` (use real SMTP)
- [ ] Set `NODE_ENV=production`
- [ ] Review Nginx security headers (CSP, HSTS, Permissions-Policy)
- [ ] Restrict NocoDB to read-only access (revoke INSERT/UPDATE/DELETE)
- [ ] Enable Prometheus scraping authentication (basic auth)
---
## Related Documentation
- **[Environment Variables](environment-variables.md)** — Complete .env reference
- **[Nginx Configuration](nginx.md)** — Reverse proxy setup + subdomain routing
- **[SSL/TLS](ssl-tls.md)** — Certificate management + HTTPS setup
- **[Tunneling](tunneling.md)** — Pangolin tunnel deployment
- **[Monitoring Stack](monitoring-stack.md)** — Prometheus + Grafana configuration
- **[Backup & Restore](backup-restore.md)** — Database backup procedures
- **[Health Checks](healthchecks.md)** — Docker health check configuration
- **[Scaling](scaling.md)** — Horizontal scaling strategies