2026-02-18 17:15:31 -07:00

Bunker Ops — How-To Guide

Operational handbook for managing Changemaker Lite instances with Ansible.


Table of Contents

  1. Prerequisites
  2. Initial Setup (Control Machine)
  3. Adding a New Instance
  4. Deploying an Instance
  5. Day-to-Day Operations
  6. Secret Management
  7. Monitoring & Fleet Observability
  8. Troubleshooting
  9. Variable Reference

1. Prerequisites

Control Machine (your laptop / jump server)

  • Ansible 2.14+ (pip install ansible or apt install ansible)
  • SSH access — key-based auth to all target servers
  • OpenSSL — for secret generation (openssl rand)

Target Servers (each Changemaker instance)

  • Ubuntu 22.04 or 24.04 (Debian-based)
  • 2+ GB RAM (4 GB recommended; swap is auto-created on low-memory hosts)
  • 20+ GB disk (50 GB recommended for media features)
  • SSH access for a deploy user with passwordless sudo
  • Outbound internet (pulls Docker images, Git repo)
  • Ports 80, 443, and SSH accessible
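
The RAM and disk minimums above can be checked before a first deploy. The following is a minimal sketch, not part of the repository: the function names are hypothetical, and the thresholds sit slightly below the nominal 2 GB and 20 GB because hosts usually report a little less than their advertised size.

```shell
# Preflight sketch (hypothetical helper, Linux targets only).

# meets_minimum VALUE MINIMUM -> exit 0 if VALUE >= MINIMUM (integers).
meets_minimum() {
  [ "$1" -ge "$2" ]
}

# Total RAM in MiB, read from /proc/meminfo.
ram_mib() {
  awk '/^MemTotal:/ { print int($2 / 1024) }' /proc/meminfo
}

# Size of the root filesystem in GiB (POSIX df output, 1 KiB blocks).
root_disk_gib() {
  df -Pk / | awk 'NR == 2 { print int($2 / 1024 / 1024) }'
}

# Print a message for each unmet minimum; exit status 1 if any check failed.
preflight() {
  local failed=0
  meets_minimum "$(ram_mib)" 1900 || { echo "RAM below 2 GB"; failed=1; }
  meets_minimum "$(root_disk_gib)" 19 || { echo "Disk below 20 GB"; failed=1; }
  return "$failed"
}
```

To use it, save the functions to a file (for example preflight.sh, a hypothetical name), add a final line that calls preflight, and run it on the target with ssh deploy@10.0.1.10 'bash -s' < preflight.sh.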

2. Initial Setup (Control Machine)

2.1 Clone the repository

git clone <repo-url> changemaker.lite
cd changemaker.lite/bunker-ops

2.2 Create a vault password

This single password encrypts all per-instance secrets. Store it securely (password manager, not Git).

# Generate a strong vault password
openssl rand -base64 32 > .vault_pass
chmod 600 .vault_pass

The .vault_pass file is in .gitignore and must never be committed.
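
A quick sanity check on the password file can catch a wrong mode or an empty file before Ansible does. This is a sketch under the assumption that the file lives at .vault_pass in the current directory; check_vault_pass is a hypothetical helper name.

```shell
# check_vault_pass [FILE] -> exit 0 only if FILE exists, is non-empty,
# and has mode 600 (read/write for the owner only).
check_vault_pass() {
  local file="${1:-.vault_pass}"
  local mode
  [ -s "$file" ] || { echo "missing or empty: $file" >&2; return 1; }
  # GNU stat takes -c, BSD/macOS stat takes -f.
  mode="$(stat -c '%a' "$file" 2>/dev/null || stat -f '%Lp' "$file")"
  [ "$mode" = "600" ] || { echo "mode is $mode, expected 600" >&2; return 1; }
}
```

Running git check-ignore -q .vault_pass && echo "ignored" afterwards confirms that the .gitignore rule actually matches the file.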

2.3 Verify Ansible can run

ansible --version
ansible-playbook playbooks/deploy.yml --syntax-check

2.4 Prepare SSH access

Ensure your SSH key can reach target servers:

# Test connectivity (docker --version fails until the first deploy installs Docker)
ssh deploy@10.0.1.10 "hostname && docker --version"

If you use a non-default SSH key:

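# Per host, in inventory/hosts.yml or host_vars:
ansible_ssh_private_key_file: ~/.ssh/bunker_ops_ed25519

# Or globally, under [defaults] in ansible.cfg:
# private_key_file = ~/.ssh/bunker_ops_ed25519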

3. Adding a New Instance

3.1 Scaffold the instance

The add-instance.sh script generates the host_vars files for a new instance:

./scripts/add-instance.sh edmonton-prod betteredmonton.org 10.0.1.10

# With fleet observability (Tier 2):
./scripts/add-instance.sh edmonton-prod betteredmonton.org 10.0.1.10 --tier 2

This creates:

  • inventory/host_vars/edmonton-prod/main.yml — instance configuration
  • inventory/host_vars/edmonton-prod/vault.yml — 19+ generated secrets (encrypted)
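
For orientation, secret generation of this kind can be sketched in a few lines of shell. This is an illustration only and not the contents of add-instance.sh: the function names are hypothetical, and the use of 64-character hex values is inferred from the Variable Reference.

```shell
# gen_secret -> print 32 random bytes as 64 hex characters.
# Falls back to /dev/urandom if openssl is not installed.
gen_secret() {
  openssl rand -hex 32 2>/dev/null ||
    head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
}

# gen_vault_yaml NAME... -> print one 'NAME: "value"' YAML line per name,
# each with its own freshly generated secret.
gen_vault_yaml() {
  local name
  for name in "$@"; do
    printf '%s: "%s"\n' "$name" "$(gen_secret)"
  done
}
```

A plaintext file produced this way, for example with gen_vault_yaml vault_cml_jwt_access_secret vault_cml_encryption_key > vault.yml, would still need ansible-vault encrypt vault.yml before it is safe to commit.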

3.2 Add to inventory

Edit inventory/hosts.yml and add the host:

all:
  children:
    changemaker_instances:
      hosts:
        edmonton-prod:
          ansible_host: 10.0.1.10
          ansible_user: deploy
          cml_domain: betteredmonton.org

3.3 Customize configuration

Edit inventory/host_vars/edmonton-prod/main.yml:

cml_domain: betteredmonton.org
cml_node_env: production

# Enable features
cml_enable_media: "true"
cml_listmonk_sync_enabled: "true"
cml_email_test_mode: "false"
cml_monitoring_enabled: true

# Production SMTP
cml_smtp_host: smtp.protonmail.ch
cml_smtp_port: 587
cml_smtp_user: "noreply@betteredmonton.org"

# Pangolin tunnel
cml_pangolin_api_url: "https://api.bnkserve.org/v1"
cml_pangolin_org_id: "org_abc123"

3.4 Edit secrets (if needed)

# Decrypt, edit, re-encrypt
ansible-vault edit inventory/host_vars/edmonton-prod/vault.yml

# Or decrypt to disk, edit, and re-encrypt (the file is plaintext in between; never commit it in that state)
ansible-vault decrypt inventory/host_vars/edmonton-prod/vault.yml
# ... edit ...
ansible-vault encrypt inventory/host_vars/edmonton-prod/vault.yml
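
Because the decrypt/encrypt route leaves plaintext on disk, it helps to confirm that every vault file is encrypted again before committing. Encrypted files start with an $ANSIBLE_VAULT; header line, so a small check is possible; the helper names below are hypothetical.

```shell
# is_vault_encrypted FILE -> exit 0 if FILE starts with the Ansible Vault header.
is_vault_encrypted() {
  head -n 1 "$1" | grep -q '^\$ANSIBLE_VAULT;'
}

# find_plaintext_vaults DIR -> print every vault.yml under DIR that is NOT encrypted.
find_plaintext_vaults() {
  find "$1" -name vault.yml -type f | while IFS= read -r file; do
    is_vault_encrypted "$file" || printf '%s\n' "$file"
  done
}
```

Running find_plaintext_vaults inventory/host_vars should print nothing when all secrets are encrypted; any path it prints needs ansible-vault encrypt.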

3.5 Verify connectivity

ansible edmonton-prod -m ping

4. Deploying an Instance

4.1 Full initial deploy

Installs Docker, configures the OS, clones the repo, generates .env, starts all containers, runs migrations, and sets up backup cron:

ansible-playbook playbooks/deploy.yml --limit edmonton-prod

What happens (in order):

  1. common role — apt update, Docker install, UFW firewall, fail2ban, swap
  2. changemaker role — git clone, create dirs, generate .env, docker compose up, Prisma migrations, seed, health checks, backup cron
  3. monitoring role (if enabled) — Prometheus config, --profile monitoring up

4.2 Deploy all instances

# One at a time (safe):
ansible-playbook playbooks/deploy.yml

# Show what would change (dry run):
ansible-playbook playbooks/deploy.yml --check --diff

4.3 Deploy with specific tags

# Only regenerate .env (no Docker restart):
ansible-playbook playbooks/deploy.yml --limit edmonton-prod --tags env

# Only clone + update code:
ansible-playbook playbooks/deploy.yml --limit edmonton-prod --tags clone

# Only run health checks:
ansible-playbook playbooks/deploy.yml --limit edmonton-prod --tags health

5. Day-to-Day Operations

5.1 Rolling upgrade (code + images)

Pulls latest Git commits, rebuilds images, runs migrations, restarts — in 25% batches:

# All instances:
ansible-playbook playbooks/upgrade.yml

# Single instance:
ansible-playbook playbooks/upgrade.yml --limit edmonton-prod
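
To estimate how many passes a fleet-wide upgrade takes, the batch arithmetic can be sketched as follows. This assumes upgrade.yml implements the batching with Ansible's serial keyword set to "25%", which this guide does not confirm, and that percentages are rounded down with a minimum of one host per batch. The function names are hypothetical.

```shell
# batch_size HOSTS PERCENT -> hosts per batch: floor(HOSTS * PERCENT / 100),
# but never fewer than 1.
batch_size() {
  local size=$(( $1 * $2 / 100 ))
  [ "$size" -ge 1 ] || size=1
  echo "$size"
}

# batch_count HOSTS PERCENT -> number of passes needed to cover all hosts
# (the last pass takes whatever remainder is left).
batch_count() {
  local size
  size="$(batch_size "$1" "$2")"
  echo $(( ($1 + size - 1) / size ))
}
```

Under these assumptions, a fleet of 10 instances is upgraded 2 at a time in 5 passes, and a fleet of 3 is upgraded one host at a time.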

5.2 Configuration change (no rebuild)

Regenerates .env and restarts the API. Use when changing feature flags, SMTP settings, CORS origins, etc.:

# Change a variable first:
# Edit inventory/host_vars/edmonton-prod/main.yml
# e.g., cml_enable_media: "true"

# Then apply:
ansible-playbook playbooks/configure.yml --limit edmonton-prod

5.3 Trigger backups

# All instances:
ansible-playbook playbooks/backup.yml

# Single instance:
ansible-playbook playbooks/backup.yml --limit edmonton-prod

5.4 Enable/reconfigure monitoring

ansible-playbook playbooks/monitoring.yml --limit edmonton-prod

5.5 Run ad-hoc commands

# Check Docker status on all instances:
ansible changemaker_instances -m command \
  -a "docker compose ps chdir=/opt/changemaker-lite" --become

# View API logs on one instance:
ansible edmonton-prod -m command \
  -a "docker compose logs api --tail 50 chdir=/opt/changemaker-lite" --become

# Restart a specific service:
ansible edmonton-prod -m command \
  -a "docker compose restart api chdir=/opt/changemaker-lite" --become

# Check disk space across fleet:
ansible changemaker_instances -m command -a "df -h /"

5.6 Rotate a secret

  1. Generate a new value:
    openssl rand -hex 32
    
  2. Update the vault:
    ansible-vault edit inventory/host_vars/edmonton-prod/vault.yml
    # Change vault_cml_jwt_access_secret (or whichever secret)
    
  3. Apply and restart:
    ansible-playbook playbooks/configure.yml --limit edmonton-prod
    

6. Secret Management

Naming convention

Prefix          Purpose                    Example
cml_*           Non-secret configuration   cml_domain, cml_smtp_host
vault_cml_*     Encrypted secrets          vault_cml_v2_postgres_password
vault_bunker_*  Bunker Ops shared secrets  vault_bunker_ops_remote_write_token

What gets encrypted

All 19+ secrets per instance:

  • Database passwords (PostgreSQL, Redis, Listmonk DB, Gitea DB)
  • JWT secrets (access + refresh) and encryption key
  • Admin passwords (initial admin, NocoDB, n8n, Grafana, Gotify, Vaultwarden, Rocket.Chat, Gancio)
  • API tokens (Listmonk API, Pangolin, Bunker Ops remote write)
  • SMTP password

Vault operations

# View encrypted file:
ansible-vault view inventory/host_vars/edmonton-prod/vault.yml

# Edit in-place (decrypts → opens $EDITOR → re-encrypts):
ansible-vault edit inventory/host_vars/edmonton-prod/vault.yml

# Re-key all vaults (change master password):
find inventory/host_vars -name vault.yml -exec ansible-vault rekey {} +

# Encrypt a new plaintext file:
ansible-vault encrypt inventory/host_vars/new-instance/vault.yml

Vault password management

  • The .vault_pass file is referenced in ansible.cfg
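  • For CI/CD, set ANSIBLE_VAULT_PASSWORD_FILE to a file (or executable script) that supplies the password: ANSIBLE_VAULT_PASSWORD_FILE=/path/to/.vault_pass ansible-playbook ...
  • For teams, point --vault-password-file at an executable script that fetches the password from a shared secrets manager and prints it to stdout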
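
When the vault password file is executable, Ansible runs it and uses its standard output as the password. A minimal sketch of such a script follows. BUNKER_VAULT_PASSWORD is a hypothetical variable name chosen for this example; a real team setup would replace the body with a call to its secrets manager.

```shell
# print_vault_password -> write the vault password to stdout, or fail loudly.
# BUNKER_VAULT_PASSWORD is a hypothetical variable, e.g. injected by CI.
print_vault_password() {
  if [ -z "${BUNKER_VAULT_PASSWORD:-}" ]; then
    echo "BUNKER_VAULT_PASSWORD is not set" >&2
    return 1
  fi
  printf '%s\n' "$BUNKER_VAULT_PASSWORD"
}
```

Saved as an executable script that ends with a print_vault_password call (for example scripts/vault-pass.sh, a hypothetical path), it can be passed as ansible-playbook playbooks/deploy.yml --vault-password-file scripts/vault-pass.sh.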

7. Monitoring & Fleet Observability

Tier model

Tier           What it means                                        How to set
0: Standalone  No Ansible management (manual config.sh install)     N/A
1: Managed     Ansible deploys/updates, local monitoring only       bunker_ops_enabled: false
2: Fleet       Ansible + metrics pushed to central VictoriaMetrics  bunker_ops_enabled: true

Enabling Tier 2 on an instance

  1. Set in host_vars/<hostname>/main.yml:
    bunker_ops_enabled: true
    bunker_ops_remote_write_url: "https://ops.bnkserve.org/api/v1/write"
    cml_monitoring_enabled: true
    
  2. Set the write token in host_vars/<hostname>/vault.yml:
    vault_bunker_ops_remote_write_token: "your-token-here"
    
  3. Apply:
    ansible-playbook playbooks/monitoring.yml --limit edmonton-prod
    

What metrics are sent (Tier 2)

Only filtered, non-PII metrics leave the instance:

  • cm_* — Application metrics (emails sent, canvass visits, queue sizes, login attempts)
  • node_* — System metrics (CPU, memory, disk, network)
  • http_request* — API latency and request counts
  • up — Service availability

Never sent: Database content, user data, campaign text, participant records, cAdvisor container details.
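
The allow-list above can be mirrored in a small helper for quickly checking whether a given metric name would leave the instance. This is only an illustration of the name patterns; the real filtering is presumably done in the Prometheus remote-write configuration, and the function name is hypothetical.

```shell
# is_forwarded_metric NAME -> exit 0 if NAME belongs to a forwarded family:
# cm_*, node_*, http_request*, or exactly "up".
is_forwarded_metric() {
  case "$1" in
    cm_*|node_*|http_request*|up) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, is_forwarded_metric node_cpu_seconds_total succeeds, while is_forwarded_metric container_cpu_usage_seconds_total (a cAdvisor metric) fails.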

Backup metrics

When BUNKER_OPS_ENABLED=true, the backup script automatically pushes:

  • cm_backup_last_success_timestamp — Unix timestamp of last successful backup
  • cm_backup_size_bytes — Size of the backup archive

These enable "backup staleness" alerts on the central dashboard.
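
A staleness check reduces to comparing the current time with cm_backup_last_success_timestamp. The sketch below shows the arithmetic. The 26-hour threshold (93600 seconds) is an assumption chosen to leave slack around a daily backup, not a value taken from the central dashboard, and the function names are hypothetical.

```shell
# backup_age_seconds NOW LAST_SUCCESS -> seconds since the last successful backup.
backup_age_seconds() {
  echo $(( $1 - $2 ))
}

# is_backup_stale NOW LAST_SUCCESS MAX_AGE -> exit 0 if the backup is older
# than MAX_AGE seconds. All arguments are Unix timestamps or second counts.
is_backup_stale() {
  [ "$(backup_age_seconds "$1" "$2")" -gt "$3" ]
}

# Roughly equivalent PromQL for an alert rule, under the same assumption:
#   time() - cm_backup_last_success_timestamp > 93600
```

A manual check would look like is_backup_stale "$(date +%s)" "$last_success" 93600 && echo "backup is stale", where last_success holds the metric's value.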


8. Troubleshooting

Ansible can't connect

UNREACHABLE! => {"msg": "Failed to connect to the host via ssh"}
  • Verify SSH: ssh deploy@<host> hostname
  • Check ansible_user in hosts.yml matches the SSH user
  • Ensure the user has passwordless sudo (run as root on the target): echo 'deploy ALL=(ALL) NOPASSWD: ALL' > /etc/sudoers.d/deploy && chmod 440 /etc/sudoers.d/deploy

Vault password error

ERROR! Decryption failed on ...vault.yml
  • Verify .vault_pass file exists and is correct
  • Or pass explicitly: ansible-playbook ... --vault-password-file /path/to/.vault_pass

Deploy fails at "Wait for PostgreSQL"

PostgreSQL did not become ready before the wait task timed out. Check its logs:

ansible <host> -m command \
  -a "docker compose logs v2-postgres --tail 30 chdir=/opt/changemaker-lite" --become

Common causes:

  • Disk full (df -h)
  • Wrong V2_POSTGRES_PASSWORD (check vault.yml matches what's in the running DB)
  • First deploy: PostgreSQL needs time to initialize

Health check fails after deploy

API not responding on /api/health:

# Check if container is running:
ansible <host> -m command \
  -a "docker compose ps api chdir=/opt/changemaker-lite" --become

# Check API logs:
ansible <host> -m command \
  -a "docker compose logs api --tail 50 chdir=/opt/changemaker-lite" --become

Common causes:

  • Missing environment variable (check .env generation)
  • Database migration failure (check Prisma output)
  • Port conflict (another process on 4000)

.env has wrong values

Compare generated .env with expected:

# Show diff of what Ansible would change:
ansible-playbook playbooks/configure.yml --limit <host> --check --diff

Remote write not working (Tier 2)

# Check Prometheus config on instance:
ansible <host> -m command -a "cat /opt/changemaker-lite/configs/prometheus/prometheus.yml" --become

# Check Prometheus logs for remote write errors:
ansible <host> -m command \
  -a "docker compose logs prometheus-changemaker --tail 30 chdir=/opt/changemaker-lite" --become

Common issues:

  • bunker_ops_enabled not set to true
  • Wrong bunker_ops_remote_write_url
  • Invalid auth token
  • Central VictoriaMetrics not reachable (firewall, DNS)

9. Variable Reference

Configuration variables (cml_*)

Set these in host_vars/<hostname>/main.yml or group_vars/.

Variable                     Default              Description
cml_domain                   cmlite.org           Instance domain (drives CORS, SMTP, URLs)
cml_node_env                 production           Node.js environment
cml_api_port                 4000                 Express API port
cml_admin_port               3000                 React admin port
cml_media_api_port           4100                 Fastify media API port
cml_postgres_port            5433                 PostgreSQL host port
cml_enable_media             "false"              Enable video library
cml_enable_payments          "false"              Enable Stripe payments
cml_enable_chat              "false"              Enable Rocket.Chat
cml_listmonk_sync_enabled    "false"              Enable newsletter sync
cml_gancio_sync_enabled      "false"              Enable event sync
cml_email_test_mode          "true"               Use MailHog (true) or SMTP (false)
cml_monitoring_enabled       false                Enable Prometheus/Grafana stack
cml_smtp_host                mailhog-changemaker  SMTP server hostname
cml_smtp_port                1025                 SMTP server port
cml_smtp_user                ""                   SMTP username
cml_mapbox_api_key           ""                   Mapbox geocoding key
cml_google_maps_api_key      ""                   Google Maps geocoding key
cml_pangolin_api_url         ""                   Pangolin tunnel API
cml_pangolin_org_id          ""                   Pangolin organization
cml_backup_retention_days    30                   Days to keep local backups
cml_backup_cron_hour         3                    Backup cron hour (UTC)
cml_backup_s3_enabled        false                Upload backups to S3
bunker_ops_enabled           false                Enable fleet observability
bunker_ops_instance_label    {{ cml_domain }}     Label in central metrics
bunker_ops_remote_write_url  ""                   VictoriaMetrics write endpoint

Secret variables (vault_cml_*)

Set these in host_vars/<hostname>/vault.yml (encrypted).

Variable                               Purpose
vault_cml_v2_postgres_password         PostgreSQL password
vault_cml_redis_password               Redis authentication
vault_cml_jwt_access_secret            JWT access token signing (64-char hex)
vault_cml_jwt_refresh_secret           JWT refresh token signing (64-char hex)
vault_cml_encryption_key               Database field encryption (64-char hex)
vault_cml_initial_admin_email          Initial admin email
vault_cml_initial_admin_password       Initial admin password (12+ chars, complexity)
vault_cml_listmonk_db_password         Listmonk PostgreSQL password
vault_cml_listmonk_web_admin_password  Listmonk web UI password
vault_cml_listmonk_api_token           Listmonk API token
vault_cml_nocodb_admin_password        NocoDB admin password
vault_cml_gitea_db_passwd              Gitea database password
vault_cml_gitea_db_root_password       Gitea DB root password
vault_cml_n8n_encryption_key           n8n encryption key
vault_cml_n8n_user_password            n8n admin password
vault_cml_grafana_admin_password       Grafana admin password
vault_cml_gotify_admin_password        Gotify admin password
vault_cml_vaultwarden_admin_token      Vaultwarden admin token (64-char hex)
vault_cml_rocketchat_admin_password    Rocket.Chat admin password
vault_cml_gancio_admin_password        Gancio admin password
vault_cml_smtp_pass                    SMTP password
vault_cml_pangolin_api_key             Pangolin API key
vault_cml_pangolin_newt_id             Pangolin Newt container ID
vault_cml_pangolin_newt_secret         Pangolin Newt secret
vault_cml_pangolin_site_id             Pangolin site ID
vault_cml_pangolin_endpoint            Pangolin endpoint URL
vault_bunker_ops_remote_write_token    Central VictoriaMetrics write auth token