# Bunker Ops: How-To Guide

Operational handbook for managing Changemaker Lite instances with Ansible.

---

## Table of Contents

1. [Prerequisites](#1-prerequisites)
2. [Initial Setup (Control Machine)](#2-initial-setup-control-machine)
3. [Adding a New Instance](#3-adding-a-new-instance)
4. [Deploying an Instance](#4-deploying-an-instance)
5. [Day-to-Day Operations](#5-day-to-day-operations)
6. [Secret Management](#6-secret-management)
7. [Monitoring & Fleet Observability](#7-monitoring--fleet-observability)
8. [Troubleshooting](#8-troubleshooting)
9. [Variable Reference](#9-variable-reference)

---

## 1. Prerequisites

### Control Machine (your laptop / jump server)

- **Ansible 2.14+**: `pip install ansible` or `apt install ansible`
- **SSH access**: key-based auth to all target servers
- **OpenSSL**: for secret generation (`openssl rand`)

### Target Servers (each Changemaker instance)

- **Ubuntu 22.04 or 24.04** (Debian-based)
- **2+ GB RAM** (4 GB recommended; swap is auto-created on low-memory hosts)
- **20+ GB disk** (50 GB recommended for media features)
- **SSH access** for a `deploy` user with passwordless sudo
- **Outbound internet** (pulls Docker images, Git repo)
- Ports 80, 443, and SSH accessible

---

## 2. Initial Setup (Control Machine)

### 2.1 Clone the repository

```bash
git clone changemaker.lite
cd changemaker.lite/bunker-ops
```

### 2.2 Create a vault password

This single password encrypts all per-instance secrets. Store it securely (password manager, not Git).

```bash
# Generate a strong vault password
openssl rand -base64 32 > .vault_pass
chmod 600 .vault_pass
```

The `.vault_pass` file is in `.gitignore` and must never be committed.
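
As a quick sanity check on the vault password step, the sketch below verifies that a password file exists, is non-empty, and is readable only by its owner. The function name `check_vault_pass` is hypothetical and not part of the repository, and `stat -c` assumes GNU coreutils (Linux); treat it as an illustration, not a required step.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): sanity-check a vault password file.
# Assumes GNU coreutils `stat` (Linux); the flag differs on macOS/BSD.
check_vault_pass() {
  local file="${1:-.vault_pass}"

  # -s is true only if the file exists and is non-empty
  if [ ! -s "$file" ]; then
    echo "missing or empty: $file"
    return 1
  fi

  # Octal permission bits, e.g. 600
  local mode
  mode="$(stat -c '%a' "$file")"
  if [ "$mode" != "600" ]; then
    echo "permissions are $mode, expected 600: $file"
    return 1
  fi

  echo "ok: $file"
}
```

Run it as `check_vault_pass .vault_pass` from the `bunker-ops` directory. Inside the Git checkout, `git check-ignore -q .vault_pass` additionally confirms that the ignore rule applies (exit status 0 means the file is ignored).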
### 2.3 Verify Ansible can run

```bash
ansible --version
ansible-playbook playbooks/deploy.yml --syntax-check
```

### 2.4 Prepare SSH access

Ensure your SSH key can reach target servers:

```bash
# Test connectivity
ssh deploy@10.0.1.10 "hostname && docker --version"
```

If you use a non-default SSH key:

```yaml
# Per-host variable (inventory/hosts.yml or host_vars);
# in ansible.cfg the equivalent is `private_key_file` under [defaults]
ansible_ssh_private_key_file: ~/.ssh/bunker_ops_ed25519
```

---

## 3. Adding a New Instance

### 3.1 Quick method (recommended)

The `add-instance.sh` script scaffolds everything:

```bash
./scripts/add-instance.sh edmonton-prod betteredmonton.org 10.0.1.10

# With fleet observability (Tier 2):
./scripts/add-instance.sh edmonton-prod betteredmonton.org 10.0.1.10 --tier 2
```

This creates:

- `inventory/host_vars/edmonton-prod/main.yml`: instance configuration
- `inventory/host_vars/edmonton-prod/vault.yml`: 19+ generated secrets (encrypted)

### 3.2 Add to inventory

Edit `inventory/hosts.yml` and add the host:

```yaml
all:
  children:
    changemaker_instances:
      hosts:
        edmonton-prod:
          ansible_host: 10.0.1.10
          ansible_user: deploy
          cml_domain: betteredmonton.org
```

### 3.3 Customize configuration

Edit `inventory/host_vars/edmonton-prod/main.yml`:

```yaml
cml_domain: betteredmonton.org
cml_node_env: production

# Enable features
cml_enable_media: "true"
cml_listmonk_sync_enabled: "true"
cml_email_test_mode: "false"
cml_monitoring_enabled: true

# Production SMTP
cml_smtp_host: smtp.protonmail.ch
cml_smtp_port: 587
cml_smtp_user: "noreply@betteredmonton.org"

# Pangolin tunnel
cml_pangolin_api_url: "https://api.bnkserve.org/v1"
cml_pangolin_org_id: "org_abc123"
```

### 3.4 Edit secrets (if needed)

```bash
# Decrypt, edit, re-encrypt in one step
ansible-vault edit inventory/host_vars/edmonton-prod/vault.yml

# Or decrypt, edit by hand, and re-encrypt
ansible-vault decrypt inventory/host_vars/edmonton-prod/vault.yml
# ... edit ...
ansible-vault encrypt inventory/host_vars/edmonton-prod/vault.yml
```

### 3.5 Verify connectivity

```bash
ansible edmonton-prod -m ping
```

---

## 4.
Deploying an Instance

### 4.1 Full initial deploy

Installs Docker, configures the OS, clones the repo, generates `.env`, starts all containers, runs migrations, and sets up backup cron:

```bash
ansible-playbook playbooks/deploy.yml --limit edmonton-prod
```

What happens (in order):

1. **common** role: apt update, Docker install, UFW firewall, fail2ban, swap
2. **changemaker** role: git clone, create dirs, generate `.env`, `docker compose up`, Prisma migrations, seed, health checks, backup cron
3. **monitoring** role (if enabled): Prometheus config, `--profile monitoring up`

### 4.2 Deploy all instances

```bash
# One at a time (safe):
ansible-playbook playbooks/deploy.yml

# Show what would change (dry run):
ansible-playbook playbooks/deploy.yml --check --diff
```

### 4.3 Deploy with specific tags

```bash
# Only regenerate .env (no Docker restart):
ansible-playbook playbooks/deploy.yml --limit edmonton-prod --tags env

# Only clone + update code:
ansible-playbook playbooks/deploy.yml --limit edmonton-prod --tags clone

# Only run health checks:
ansible-playbook playbooks/deploy.yml --limit edmonton-prod --tags health
```

---

## 5. Day-to-Day Operations

### 5.1 Rolling upgrade (code + images)

Pulls the latest Git commits, rebuilds images, runs migrations, and restarts, in 25% batches:

```bash
# All instances:
ansible-playbook playbooks/upgrade.yml

# Single instance:
ansible-playbook playbooks/upgrade.yml --limit edmonton-prod
```

### 5.2 Configuration change (no rebuild)

Regenerates `.env` and restarts the API.
Use when changing feature flags, SMTP settings, CORS origins, etc.:

```bash
# Change a variable first:
#   Edit inventory/host_vars/edmonton-prod/main.yml
#   e.g., cml_enable_media: "true"

# Then apply:
ansible-playbook playbooks/configure.yml --limit edmonton-prod
```

### 5.3 Trigger backups

```bash
# All instances:
ansible-playbook playbooks/backup.yml

# Single instance:
ansible-playbook playbooks/backup.yml --limit edmonton-prod
```

### 5.4 Enable/reconfigure monitoring

```bash
ansible-playbook playbooks/monitoring.yml --limit edmonton-prod
```

### 5.5 Run ad-hoc commands

```bash
# chdir is an argument of the command module, so it goes inside -a (not -e).

# Check Docker status on all instances:
ansible changemaker_instances -m command \
  -a "chdir=/opt/changemaker-lite docker compose ps" --become

# View API logs on one instance:
ansible edmonton-prod -m command \
  -a "chdir=/opt/changemaker-lite docker compose logs api --tail 50" --become

# Restart a specific service:
ansible edmonton-prod -m command \
  -a "chdir=/opt/changemaker-lite docker compose restart api" --become

# Check disk space across fleet:
ansible changemaker_instances -m command -a "df -h /"
```

### 5.6 Rotate a secret

1. Generate a new value:

   ```bash
   openssl rand -hex 32
   ```

2. Update the vault:

   ```bash
   ansible-vault edit inventory/host_vars/edmonton-prod/vault.yml
   # Change vault_cml_jwt_access_secret (or whichever secret)
   ```

3. Apply and restart:

   ```bash
   ansible-playbook playbooks/configure.yml --limit edmonton-prod
   ```

---

## 6.
Secret Management

### Naming convention

| Prefix | Purpose | Example |
|--------|---------|---------|
| `cml_*` | Non-secret configuration | `cml_domain`, `cml_smtp_host` |
| `vault_cml_*` | Encrypted secrets | `vault_cml_v2_postgres_password` |
| `vault_bunker_*` | Bunker Ops shared secrets | `vault_bunker_ops_remote_write_token` |

### What gets encrypted

All 19+ secrets per instance:

- Database passwords (PostgreSQL, Redis, Listmonk DB, Gitea DB)
- JWT secrets (access + refresh) and encryption key
- Admin passwords (initial admin, NocoDB, n8n, Grafana, Gotify, Vaultwarden, Rocket.Chat, Gancio)
- API tokens (Listmonk API, Pangolin, Bunker Ops remote write)
- SMTP password

### Vault operations

```bash
# View encrypted file:
ansible-vault view inventory/host_vars/edmonton-prod/vault.yml

# Edit in-place (decrypts → opens $EDITOR → re-encrypts):
ansible-vault edit inventory/host_vars/edmonton-prod/vault.yml

# Re-key all vaults (change master password):
find inventory/host_vars -name vault.yml -exec ansible-vault rekey {} +

# Encrypt a new plaintext file:
ansible-vault encrypt inventory/host_vars/new-instance/vault.yml
```

### Vault password management

- The `.vault_pass` file is referenced in `ansible.cfg`
- For CI/CD, point Ansible at the password file via the environment: `ANSIBLE_VAULT_PASSWORD_FILE=/path/to/.vault_pass ansible-playbook ...`
- For teams, use `--vault-password-file` pointing to an executable script that fetches the password from a shared secrets manager

---

## 7. Monitoring & Fleet Observability

### Tier model

| Tier | What it means | How to set |
|------|---------------|------------|
| **0: Standalone** | No Ansible management (manual `config.sh` install) | N/A |
| **1: Managed** | Ansible deploys/updates, local monitoring only | `bunker_ops_enabled: false` |
| **2: Fleet** | Ansible + metrics pushed to central VictoriaMetrics | `bunker_ops_enabled: true` |

### Enabling Tier 2 on an instance

1.
   Set in `host_vars/<instance>/main.yml`:

   ```yaml
   bunker_ops_enabled: true
   bunker_ops_remote_write_url: "https://ops.bnkserve.org/api/v1/write"
   cml_monitoring_enabled: true
   ```

2. Set the write token in `host_vars/<instance>/vault.yml`:

   ```yaml
   vault_bunker_ops_remote_write_token: "your-token-here"
   ```

3. Apply:

   ```bash
   ansible-playbook playbooks/monitoring.yml --limit edmonton-prod
   ```

### What metrics are sent (Tier 2)

Only filtered, non-PII metrics leave the instance:

- `cm_*`: Application metrics (emails sent, canvass visits, queue sizes, login attempts)
- `node_*`: System metrics (CPU, memory, disk, network)
- `http_request*`: API latency and request counts
- `up`: Service availability

**Never sent:** Database content, user data, campaign text, participant records, cAdvisor container details.

### Backup metrics

When `BUNKER_OPS_ENABLED=true`, the backup script automatically pushes:

- `cm_backup_last_success_timestamp`: Unix timestamp of last successful backup
- `cm_backup_size_bytes`: Size of the backup archive

These enable "backup staleness" alerts on the central dashboard.

---

## 8. Troubleshooting

### Ansible can't connect

```
UNREACHABLE! => {"msg": "Failed to connect to the host via ssh"}
```

- Verify SSH: `ssh deploy@<host> hostname`
- Check that `ansible_user` in `hosts.yml` matches the SSH user
- Ensure the user has passwordless sudo (run as root on the target): `echo 'deploy ALL=(ALL) NOPASSWD: ALL' > /etc/sudoers.d/deploy`

### Vault password error

```
ERROR! Decryption failed on ...vault.yml
```

- Verify that the `.vault_pass` file exists and is correct
- Or pass it explicitly: `ansible-playbook ... --vault-password-file /path/to/.vault_pass`

### Deploy fails at "Wait for PostgreSQL"

PostgreSQL hasn't started yet.
Check:

```bash
ansible <host> -m command \
  -a "chdir=/opt/changemaker-lite docker compose logs v2-postgres --tail 30" --become
```

Common causes:

- Disk full (`df -h`)
- Wrong `V2_POSTGRES_PASSWORD` (check that vault.yml matches what's in the running DB)
- First deploy: PostgreSQL needs time to initialize

### Health check fails after deploy

API not responding on `/api/health`:

```bash
# Check if container is running:
ansible <host> -m command \
  -a "chdir=/opt/changemaker-lite docker compose ps api" --become

# Check API logs:
ansible <host> -m command \
  -a "chdir=/opt/changemaker-lite docker compose logs api --tail 50" --become
```

Common causes:

- Missing environment variable (check `.env` generation)
- Database migration failure (check Prisma output)
- Port conflict (another process on 4000)

### .env has wrong values

Compare generated `.env` with expected:

```bash
# Show diff of what Ansible would change:
ansible-playbook playbooks/configure.yml --limit <instance> --check --diff
```

### Remote write not working (Tier 2)

```bash
# Check Prometheus config on instance:
ansible <host> -m command \
  -a "cat /opt/changemaker-lite/configs/prometheus/prometheus.yml" --become

# Check Prometheus logs for remote write errors:
ansible <host> -m command \
  -a "chdir=/opt/changemaker-lite docker compose logs prometheus-changemaker --tail 30" --become
```

Common issues:

- `bunker_ops_enabled` not set to `true`
- Wrong `bunker_ops_remote_write_url`
- Invalid auth token
- Central VictoriaMetrics not reachable (firewall, DNS)

---

## 9. Variable Reference

### Configuration variables (`cml_*`)

Set these in `host_vars/<instance>/main.yml` or `group_vars/`.

| Variable | Default | Description |
|----------|---------|-------------|
| `cml_domain` | `cmlite.org` | Instance domain (drives CORS, SMTP, URLs) |
| `cml_node_env` | `production` | Node.js environment |
| `cml_api_port` | `4000` | Express API port |
| `cml_admin_port` | `3000` | React admin port |
| `cml_media_api_port` | `4100` | Fastify media API port |
| `cml_postgres_port` | `5433` | PostgreSQL host port |
| `cml_enable_media` | `"false"` | Enable video library |
| `cml_enable_payments` | `"false"` | Enable Stripe payments |
| `cml_enable_chat` | `"false"` | Enable Rocket.Chat |
| `cml_listmonk_sync_enabled` | `"false"` | Enable newsletter sync |
| `cml_gancio_sync_enabled` | `"false"` | Enable event sync |
| `cml_email_test_mode` | `"true"` | Use MailHog (`true`) or SMTP (`false`) |
| `cml_monitoring_enabled` | `false` | Enable Prometheus/Grafana stack |
| `cml_smtp_host` | `mailhog-changemaker` | SMTP server hostname |
| `cml_smtp_port` | `1025` | SMTP server port |
| `cml_smtp_user` | `""` | SMTP username |
| `cml_mapbox_api_key` | `""` | Mapbox geocoding key |
| `cml_google_maps_api_key` | `""` | Google Maps geocoding key |
| `cml_pangolin_api_url` | `""` | Pangolin tunnel API |
| `cml_pangolin_org_id` | `""` | Pangolin organization |
| `cml_backup_retention_days` | `30` | Days to keep local backups |
| `cml_backup_cron_hour` | `3` | Backup cron hour (UTC) |
| `cml_backup_s3_enabled` | `false` | Upload backups to S3 |
| `bunker_ops_enabled` | `false` | Enable fleet observability |
| `bunker_ops_instance_label` | `{{ cml_domain }}` | Label in central metrics |
| `bunker_ops_remote_write_url` | `""` | VictoriaMetrics write endpoint |

### Secret variables (`vault_cml_*`)

Set these in `host_vars/<instance>/vault.yml` (encrypted).

| Variable | Purpose |
|----------|---------|
| `vault_cml_v2_postgres_password` | PostgreSQL password |
| `vault_cml_redis_password` | Redis authentication |
| `vault_cml_jwt_access_secret` | JWT access token signing (64-char hex) |
| `vault_cml_jwt_refresh_secret` | JWT refresh token signing (64-char hex) |
| `vault_cml_encryption_key` | Database field encryption (64-char hex) |
| `vault_cml_initial_admin_email` | Initial admin email |
| `vault_cml_initial_admin_password` | Initial admin password (12+ chars, complexity) |
| `vault_cml_listmonk_db_password` | Listmonk PostgreSQL password |
| `vault_cml_listmonk_web_admin_password` | Listmonk web UI password |
| `vault_cml_listmonk_api_token` | Listmonk API token |
| `vault_cml_nocodb_admin_password` | NocoDB admin password |
| `vault_cml_gitea_db_passwd` | Gitea database password |
| `vault_cml_gitea_db_root_password` | Gitea DB root password |
| `vault_cml_n8n_encryption_key` | n8n encryption key |
| `vault_cml_n8n_user_password` | n8n admin password |
| `vault_cml_grafana_admin_password` | Grafana admin password |
| `vault_cml_gotify_admin_password` | Gotify admin password |
| `vault_cml_vaultwarden_admin_token` | Vaultwarden admin token (64-char hex) |
| `vault_cml_rocketchat_admin_password` | Rocket.Chat admin password |
| `vault_cml_gancio_admin_password` | Gancio admin password |
| `vault_cml_smtp_pass` | SMTP password |
| `vault_cml_pangolin_api_key` | Pangolin API key |
| `vault_cml_pangolin_newt_id` | Pangolin Newt container ID |
| `vault_cml_pangolin_newt_secret` | Pangolin Newt secret |
| `vault_cml_pangolin_site_id` | Pangolin site ID |
| `vault_cml_pangolin_endpoint` | Pangolin endpoint URL |
| `vault_bunker_ops_remote_write_token` | Central VM write auth token |
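
The entries marked "64-char hex" match the output shape of `openssl rand -hex 32`, the command used in section 5.6: 32 random bytes encode to 64 hex characters. A minimal sketch for generating and checking such a value follows; the helper names `gen_hex_secret` and `is_hex64` are hypothetical and not part of the repository.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of the repository).

# Print N random bytes as lowercase hex (default 32 bytes = 64 hex characters).
gen_hex_secret() {
  local bytes="${1:-32}"
  openssl rand -hex "$bytes"
}

# Succeed only if the argument is exactly 64 lowercase hex characters.
is_hex64() {
  printf '%s' "$1" | grep -Eq '^[0-9a-f]{64}$'
}
```

For example, `secret="$(gen_hex_secret)" && is_hex64 "$secret" && echo "$secret"` prints a value that can be pasted into the file opened by `ansible-vault edit`.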