Bunker Ops — How-To Guide
Operational handbook for managing Changemaker Lite instances with Ansible.
Table of Contents
- Prerequisites
- Initial Setup (Control Machine)
- Adding a New Instance
- Deploying an Instance
- Day-to-Day Operations
- Secret Management
- Monitoring & Fleet Observability
- Troubleshooting
- Variable Reference
1. Prerequisites
Control Machine (your laptop / jump server)
- Ansible 2.14+: pip install ansible or apt install ansible
- SSH access: key-based auth to all target servers
- OpenSSL: for secret generation (openssl rand)
Target Servers (each Changemaker instance)
- Ubuntu 22.04 or 24.04 (Debian-based)
- 2+ GB RAM (4 GB recommended; swap is auto-created on low-memory hosts)
- 20+ GB disk (50 GB recommended for media features)
- SSH access for a deploy user with passwordless sudo
- Outbound internet (pulls Docker images, Git repo)
- Ports 80, 443, and SSH accessible
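The memory and disk minimums listed above can be checked on a target before the first deploy. The following is a minimal sketch, assuming a Linux host with /proc/meminfo; the function names are illustrative and not part of the repository, and the thresholds are deliberately a little below the nominal 2 GB / 20 GB because reported totals are usually slightly smaller than the advertised size.

```shell
# Hypothetical preflight helpers; run on the target server.

# Succeeds when $1 (actual value) is at least $2 (required value).
meets_minimum() {
  [ "$1" -ge "$2" ]
}

# Total RAM in MiB, read from /proc/meminfo (Linux only).
total_ram_mib() {
  awk '/^MemTotal:/ { print int($2 / 1024) }' /proc/meminfo
}

# Total size of the root filesystem in GiB (POSIX df output, 1 KiB blocks).
root_disk_gib() {
  df -Pk / | awk 'NR == 2 { print int($2 / 1048576) }'
}

# Prints a warning for each minimum that is not met.
preflight() {
  meets_minimum "$(total_ram_mib)" 1800 || echo "WARN: less than ~2 GB RAM"
  meets_minimum "$(root_disk_gib)" 18 || echo "WARN: less than ~20 GB disk"
}
```

Calling preflight over SSH on a candidate server prints nothing when both minimums are met.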
2. Initial Setup (Control Machine)
2.1 Clone the repository
git clone <repo-url> changemaker.lite
cd changemaker.lite/bunker-ops
2.2 Create a vault password
This single password encrypts all per-instance secrets. Store it securely (password manager, not Git).
# Generate a strong vault password
openssl rand -base64 32 > .vault_pass
chmod 600 .vault_pass
The .vault_pass file is in .gitignore and must never be committed.
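A quick sanity check on the password file can catch a wrong mode or an empty file before Ansible does. This is a sketch only, assuming GNU stat (the flag syntax differs on macOS/BSD); vault_pass_ok is a hypothetical helper name.

```shell
# Hypothetical check, not part of the repository.
# Succeeds only if the file exists, is non-empty, and has mode 600.
vault_pass_ok() {
  local file="$1" mode
  [ -s "$file" ] || { echo "missing or empty: $file" >&2; return 1; }
  mode="$(stat -c '%a' "$file")"   # GNU stat; BSD/macOS stat uses other flags
  [ "$mode" = "600" ] || { echo "mode is $mode, expected 600" >&2; return 1; }
}
```

Combined with git check-ignore, it also confirms that the file is excluded from version control: vault_pass_ok .vault_pass && git check-ignore -q .vault_pass && echo ok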
2.3 Verify Ansible can run
ansible --version
ansible-playbook playbooks/deploy.yml --syntax-check
2.4 Prepare SSH access
Ensure your SSH key can reach target servers:
# Test connectivity
ssh deploy@10.0.1.10 "hostname && docker --version"
If you use a non-default SSH key:
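# Per host, in inventory/hosts.yml or host_vars:
ansible_ssh_private_key_file: ~/.ssh/bunker_ops_ed25519
# Or globally, in ansible.cfg under [defaults]:
# private_key_file = ~/.ssh/bunker_ops_ed25519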
3. Adding a New Instance
3.1 Quick method (recommended)
The add-instance.sh script scaffolds everything:
./scripts/add-instance.sh edmonton-prod betteredmonton.org 10.0.1.10
# With fleet observability (Tier 2):
./scripts/add-instance.sh edmonton-prod betteredmonton.org 10.0.1.10 --tier 2
This creates:
- inventory/host_vars/edmonton-prod/main.yml: instance configuration
- inventory/host_vars/edmonton-prod/vault.yml: 19+ generated secrets (encrypted)
3.2 Add to inventory
Edit inventory/hosts.yml and add the host:
all:
  children:
    changemaker_instances:
      hosts:
        edmonton-prod:
          ansible_host: 10.0.1.10
          ansible_user: deploy
          cml_domain: betteredmonton.org
3.3 Customize configuration
Edit inventory/host_vars/edmonton-prod/main.yml:
cml_domain: betteredmonton.org
cml_node_env: production
# Enable features
cml_enable_media: "true"
cml_listmonk_sync_enabled: "true"
cml_email_test_mode: "false"
cml_monitoring_enabled: true
# Production SMTP
cml_smtp_host: smtp.protonmail.ch
cml_smtp_port: 587
cml_smtp_user: "noreply@betteredmonton.org"
# Pangolin tunnel
cml_pangolin_api_url: "https://api.bnkserve.org/v1"
cml_pangolin_org_id: "org_abc123"
3.4 Edit secrets (if needed)
# Decrypt, edit, re-encrypt
ansible-vault edit inventory/host_vars/edmonton-prod/vault.yml
# Or decrypt, edit by hand, and re-encrypt (the file is plaintext on disk in between; do not commit it):
ansible-vault decrypt inventory/host_vars/edmonton-prod/vault.yml
# ... edit ...
ansible-vault encrypt inventory/host_vars/edmonton-prod/vault.yml
3.5 Verify connectivity
ansible edmonton-prod -m ping
4. Deploying an Instance
4.1 Full initial deploy
Installs Docker, configures the OS, clones the repo, generates .env, starts all containers, runs migrations, and sets up backup cron:
ansible-playbook playbooks/deploy.yml --limit edmonton-prod
What happens (in order):
- common role: apt update, Docker install, UFW firewall, fail2ban, swap
- changemaker role: git clone, create dirs, generate .env, docker compose up, Prisma migrations, seed, health checks, backup cron
- monitoring role (if enabled): Prometheus config, --profile monitoring up
4.2 Deploy all instances
# One at a time (safe):
ansible-playbook playbooks/deploy.yml
# Show what would change (dry run):
ansible-playbook playbooks/deploy.yml --check --diff
4.3 Deploy with specific tags
# Only regenerate .env (no Docker restart):
ansible-playbook playbooks/deploy.yml --limit edmonton-prod --tags env
# Only clone + update code:
ansible-playbook playbooks/deploy.yml --limit edmonton-prod --tags clone
# Only run health checks:
ansible-playbook playbooks/deploy.yml --limit edmonton-prod --tags health
5. Day-to-Day Operations
5.1 Rolling upgrade (code + images)
Pulls latest Git commits, rebuilds images, runs migrations, restarts — in 25% batches:
# All instances:
ansible-playbook playbooks/upgrade.yml
# Single instance:
ansible-playbook playbooks/upgrade.yml --limit edmonton-prod
5.2 Configuration change (no rebuild)
Regenerates .env and restarts the API. Use when changing feature flags, SMTP settings, CORS origins, etc.:
# Change a variable first:
# Edit inventory/host_vars/edmonton-prod/main.yml
# e.g., cml_enable_media: "true"
# Then apply:
ansible-playbook playbooks/configure.yml --limit edmonton-prod
5.3 Trigger backups
# All instances:
ansible-playbook playbooks/backup.yml
# Single instance:
ansible-playbook playbooks/backup.yml --limit edmonton-prod
5.4 Enable/reconfigure monitoring
ansible-playbook playbooks/monitoring.yml --limit edmonton-prod
5.5 Run ad-hoc commands
# chdir is an argument of the command module, so it goes inside -a (not -e).
# Check Docker status on all instances:
ansible changemaker_instances -m command \
-a "chdir=/opt/changemaker-lite docker compose ps" --become
# View API logs on one instance:
ansible edmonton-prod -m command \
-a "chdir=/opt/changemaker-lite docker compose logs api --tail 50" --become
# Restart a specific service:
ansible edmonton-prod -m command \
-a "chdir=/opt/changemaker-lite docker compose restart api" --become
# Check disk space across fleet:
ansible changemaker_instances -m command -a "df -h /"
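Because most ad-hoc commands need to run from the install directory, a small wrapper can save typing. The sketch below is illustrative: cml_adhoc_cmd and cml_adhoc are hypothetical names, and /opt/changemaker-lite is the install path used throughout this guide. It passes the working directory as the command module's chdir argument inside -a.

```shell
# Hypothetical wrapper around the ad-hoc invocations shown above.
CML_DIR="${CML_DIR:-/opt/changemaker-lite}"

# Prints the ansible command line for a target and a command.
cml_adhoc_cmd() {
  local target="$1"
  shift
  printf 'ansible %s -m command -a "chdir=%s %s" --become\n' \
    "$target" "$CML_DIR" "$*"
}

# Runs the command; with DRY_RUN=1 it only prints what would be run.
cml_adhoc() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    cml_adhoc_cmd "$@"
    return 0
  fi
  local target="$1"
  shift
  ansible "$target" -m command -a "chdir=$CML_DIR $*" --become
}
```

For example, DRY_RUN=1 cml_adhoc edmonton-prod docker compose ps prints the full command without contacting the host.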
5.6 Rotate a secret
- Generate a new value:
openssl rand -hex 32
- Update the vault:
ansible-vault edit inventory/host_vars/edmonton-prod/vault.yml
# Change vault_cml_jwt_access_secret (or whichever secret)
- Apply and restart:
ansible-playbook playbooks/configure.yml --limit edmonton-prod
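Several secrets are expected to be 64-character hex strings, which is what openssl rand -hex 32 produces (32 random bytes, two hex digits each). A minimal sketch of generating and validating such a value before pasting it into the vault follows; both function names are hypothetical.

```shell
# Generates a 64-character hex secret (32 random bytes).
new_hex_secret() {
  openssl rand -hex 32
}

# Succeeds when the argument is exactly 64 lowercase hex characters.
is_hex64() {
  [ "${#1}" -eq 64 ] || return 1
  case "$1" in
    *[!0-9a-f]*) return 1 ;;
  esac
  return 0
}
```

Validating first guards against a truncated copy and paste: secret="$(new_hex_secret)" && is_hex64 "$secret" && echo "$secret"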
6. Secret Management
Naming convention
| Prefix | Purpose | Example |
|---|---|---|
| cml_* | Non-secret configuration | cml_domain, cml_smtp_host |
| vault_cml_* | Encrypted secrets | vault_cml_v2_postgres_password |
| vault_bunker_* | Bunker Ops shared secrets | vault_bunker_ops_remote_write_token |
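The prefix convention determines which file a variable belongs in. As an illustration only (var_home is a hypothetical helper, not something the repository ships), the mapping can be expressed as a simple pattern match; the bunker_ops_* case reflects the non-secret fleet settings listed in the Variable Reference.

```shell
# Illustrative: maps a variable name to the host_vars file it belongs in.
var_home() {
  case "$1" in
    vault_cml_*|vault_bunker_*) echo "vault.yml" ;;  # encrypted secrets
    cml_*|bunker_ops_*)         echo "main.yml" ;;   # plain configuration
    *)                          echo "unknown" ;;
  esac
}
```

A check like this could be used to flag a secret that was accidentally placed in main.yml.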
What gets encrypted
All 19+ secrets per instance:
- Database passwords (PostgreSQL, Redis, Listmonk DB, Gitea DB)
- JWT secrets (access + refresh) and encryption key
- Admin passwords (initial admin, NocoDB, n8n, Grafana, Gotify, Vaultwarden, Rocket.Chat, Gancio)
- API tokens (Listmonk API, Pangolin, Bunker Ops remote write)
- SMTP password
Vault operations
# View encrypted file:
ansible-vault view inventory/host_vars/edmonton-prod/vault.yml
# Edit in-place (decrypts → opens $EDITOR → re-encrypts):
ansible-vault edit inventory/host_vars/edmonton-prod/vault.yml
# Re-key all vaults (change master password):
find inventory/host_vars -name vault.yml -exec ansible-vault rekey {} +
# Encrypt a new plaintext file:
ansible-vault encrypt inventory/host_vars/new-instance/vault.yml
Vault password management
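- The .vault_pass file is referenced in ansible.cfg
- For CI/CD, point Ansible at a password file via the environment: ANSIBLE_VAULT_PASSWORD_FILE=/path/to/vault_pass ansible-playbook ... (Ansible reads a file path from this variable, not the password itself)
- For teams, use --vault-password-file pointing to a script that fetches the password from a shared secrets manager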
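When --vault-password-file points at an executable, Ansible runs it and uses its standard output as the password. A minimal sketch of such a script's core follows; BUNKER_VAULT_PASSWORD is an assumed variable name chosen for this example (injected by CI or a secrets manager), not something Ansible itself defines.

```shell
# Core of a hypothetical vault password script.
# Prints the password on stdout, or fails loudly if it is not available.
print_vault_password() {
  if [ -z "${BUNKER_VAULT_PASSWORD:-}" ]; then
    echo "BUNKER_VAULT_PASSWORD is not set" >&2
    return 1
  fi
  printf '%s\n' "$BUNKER_VAULT_PASSWORD"
}
```

Saved as an executable file that ends with a call to print_vault_password, it can be passed to ansible-playbook via --vault-password-file. A real team setup would replace the environment lookup with a call to whatever secrets manager is in use.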
7. Monitoring & Fleet Observability
Tier model
| Tier | What it means | How to set |
|---|---|---|
| 0: Standalone | No Ansible management (manual config.sh install) | N/A |
| 1: Managed | Ansible deploys/updates, local monitoring only | bunker_ops_enabled: false |
| 2: Fleet | Ansible + metrics pushed to central VictoriaMetrics | bunker_ops_enabled: true |
Enabling Tier 2 on an instance
- Set in host_vars/<hostname>/main.yml:
bunker_ops_enabled: true
bunker_ops_remote_write_url: "https://ops.bnkserve.org/api/v1/write"
cml_monitoring_enabled: true
- Set the write token in host_vars/<hostname>/vault.yml:
vault_bunker_ops_remote_write_token: "your-token-here"
- Apply:
ansible-playbook playbooks/monitoring.yml --limit edmonton-prod
What metrics are sent (Tier 2)
Only filtered, non-PII metrics leave the instance:
- cm_*: Application metrics (emails sent, canvass visits, queue sizes, login attempts)
- node_*: System metrics (CPU, memory, disk, network)
- http_request*: API latency and request counts
- up: Service availability
Never sent: Database content, user data, campaign text, participant records, cAdvisor container details.
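To make the allowlist concrete, the function below mirrors the four name patterns listed above. It is an illustration of which metric names would pass, not the actual filter; the real filtering presumably lives in the Prometheus configuration generated on the instance, and is_forwarded_metric is a hypothetical name.

```shell
# Illustrative allowlist check for Tier 2 metric names.
is_forwarded_metric() {
  case "$1" in
    cm_*|node_*|http_request*|up) return 0 ;;
    *)                            return 1 ;;
  esac
}
```

For instance, node_cpu_seconds_total matches the node_* pattern, while a cAdvisor series such as container_cpu_usage_seconds_total matches none of them and would stay on the instance.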
Backup metrics
When BUNKER_OPS_ENABLED=true, the backup script automatically pushes:
- cm_backup_last_success_timestamp: Unix timestamp of last successful backup
- cm_backup_size_bytes: Size of the backup archive
These enable "backup staleness" alerts on the central dashboard.
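A staleness alert boils down to comparing the current time with cm_backup_last_success_timestamp. The sketch below shows that arithmetic in shell; the function name and the threshold passed to it are assumptions for illustration, and the central dashboard may define its alert differently.

```shell
# Succeeds (backup is stale) when the last success is older than max_hours.
# Arguments: last-success Unix timestamp, max age in hours,
# and optionally "now" as a Unix timestamp (defaults to the current time).
backup_is_stale() {
  local last="$1" max_hours="$2" now="${3:-$(date +%s)}"
  [ $((now - last)) -gt $((max_hours * 3600)) ]
}
```

With a 24-hour threshold, a backup that last succeeded 25 hours ago counts as stale, while one from an hour ago does not.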
8. Troubleshooting
Ansible can't connect
UNREACHABLE! => {"msg": "Failed to connect to the host via ssh"}
- Verify SSH: ssh deploy@<host> hostname
- Check that ansible_user in hosts.yml matches the SSH user
- Ensure the user has passwordless sudo (run as root on the target):
echo 'deploy ALL=(ALL) NOPASSWD: ALL' > /etc/sudoers.d/deploy
Vault password error
ERROR! Decryption failed on ...vault.yml
- Verify that the .vault_pass file exists and is correct
- Or pass explicitly:
ansible-playbook ... --vault-password-file /path/to/.vault_pass
Deploy fails at "Wait for PostgreSQL"
PostgreSQL hasn't started yet. Check:
ansible <host> -m command \
-a "chdir=/opt/changemaker-lite docker compose logs v2-postgres --tail 30" --become
Common causes:
- Disk full (df -h)
- Wrong V2_POSTGRES_PASSWORD (check that vault.yml matches what's in the running DB)
- First deploy: PostgreSQL needs time to initialize
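On a first deploy the database may simply need more time, so a bounded retry is often enough to tell "slow" from "broken". The helper below is a generic sketch of that pattern; wait_for is a hypothetical name, and the playbook's own wait task may be implemented differently.

```shell
# Runs a command until it succeeds or the attempts run out.
# Usage: wait_for <attempts> <delay-seconds> <command...>
wait_for() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    # Sleep between attempts, but not after the last one.
    [ "$i" -lt "$attempts" ] && sleep "$delay"
    i=$((i + 1))
  done
  return 1
}
```

Run from /opt/changemaker-lite on the instance, something like wait_for 30 2 docker compose exec -T v2-postgres pg_isready would poll for up to about a minute, assuming the v2-postgres image ships pg_isready.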
Health check fails after deploy
API not responding on /api/health:
# Check if container is running:
ansible <host> -m command -a "chdir=/opt/changemaker-lite docker compose ps api" --become
# Check API logs:
ansible <host> -m command -a "chdir=/opt/changemaker-lite docker compose logs api --tail 50" --become
Common causes:
- Missing environment variable (check .env generation)
- Database migration failure (check Prisma output)
- Port conflict (another process on 4000)
.env has wrong values
Compare generated .env with expected:
# Show diff of what Ansible would change:
ansible-playbook playbooks/configure.yml --limit <host> --check --diff
Remote write not working (Tier 2)
# Check Prometheus config on instance:
ansible <host> -m command -a "cat /opt/changemaker-lite/configs/prometheus/prometheus.yml" --become
# Check Prometheus logs for remote write errors:
ansible <host> -m command \
-a "chdir=/opt/changemaker-lite docker compose logs prometheus-changemaker --tail 30" --become
Common issues:
- bunker_ops_enabled not set to true
- Wrong bunker_ops_remote_write_url
- Invalid auth token
- Central VictoriaMetrics not reachable (firewall, DNS)
9. Variable Reference
Configuration variables (cml_*)
Set these in host_vars/<hostname>/main.yml or group_vars/.
| Variable | Default | Description |
|---|---|---|
| cml_domain | cmlite.org | Instance domain (drives CORS, SMTP, URLs) |
| cml_node_env | production | Node.js environment |
| cml_api_port | 4000 | Express API port |
| cml_admin_port | 3000 | React admin port |
| cml_media_api_port | 4100 | Fastify media API port |
| cml_postgres_port | 5433 | PostgreSQL host port |
| cml_enable_media | "false" | Enable video library |
| cml_enable_payments | "false" | Enable Stripe payments |
| cml_enable_chat | "false" | Enable Rocket.Chat |
| cml_listmonk_sync_enabled | "false" | Enable newsletter sync |
| cml_gancio_sync_enabled | "false" | Enable event sync |
| cml_email_test_mode | "true" | Use MailHog (true) or SMTP (false) |
| cml_monitoring_enabled | false | Enable Prometheus/Grafana stack |
| cml_smtp_host | mailhog-changemaker | SMTP server hostname |
| cml_smtp_port | 1025 | SMTP server port |
| cml_smtp_user | "" | SMTP username |
| cml_mapbox_api_key | "" | Mapbox geocoding key |
| cml_google_maps_api_key | "" | Google Maps geocoding key |
| cml_pangolin_api_url | "" | Pangolin tunnel API |
| cml_pangolin_org_id | "" | Pangolin organization |
| cml_backup_retention_days | 30 | Days to keep local backups |
| cml_backup_cron_hour | 3 | Backup cron hour (UTC) |
| cml_backup_s3_enabled | false | Upload backups to S3 |
| bunker_ops_enabled | false | Enable fleet observability |
| bunker_ops_instance_label | {{ cml_domain }} | Label in central metrics |
| bunker_ops_remote_write_url | "" | VictoriaMetrics write endpoint |
Secret variables (vault_cml_*)
Set these in host_vars/<hostname>/vault.yml (encrypted).
| Variable | Purpose |
|---|---|
| vault_cml_v2_postgres_password | PostgreSQL password |
| vault_cml_redis_password | Redis authentication |
| vault_cml_jwt_access_secret | JWT access token signing (64-char hex) |
| vault_cml_jwt_refresh_secret | JWT refresh token signing (64-char hex) |
| vault_cml_encryption_key | Database field encryption (64-char hex) |
| vault_cml_initial_admin_email | Initial admin email |
| vault_cml_initial_admin_password | Initial admin password (12+ chars, complexity) |
| vault_cml_listmonk_db_password | Listmonk PostgreSQL password |
| vault_cml_listmonk_web_admin_password | Listmonk web UI password |
| vault_cml_listmonk_api_token | Listmonk API token |
| vault_cml_nocodb_admin_password | NocoDB admin password |
| vault_cml_gitea_db_passwd | Gitea database password |
| vault_cml_gitea_db_root_password | Gitea DB root password |
| vault_cml_n8n_encryption_key | n8n encryption key |
| vault_cml_n8n_user_password | n8n admin password |
| vault_cml_grafana_admin_password | Grafana admin password |
| vault_cml_gotify_admin_password | Gotify admin password |
| vault_cml_vaultwarden_admin_token | Vaultwarden admin token (64-char hex) |
| vault_cml_rocketchat_admin_password | Rocket.Chat admin password |
| vault_cml_gancio_admin_password | Gancio admin password |
| vault_cml_smtp_pass | SMTP password |
| vault_cml_pangolin_api_key | Pangolin API key |
| vault_cml_pangolin_newt_id | Pangolin Newt container ID |
| vault_cml_pangolin_newt_secret | Pangolin Newt secret |
| vault_cml_pangolin_site_id | Pangolin site ID |
| vault_cml_pangolin_endpoint | Pangolin endpoint URL |
| vault_bunker_ops_remote_write_token | Central VM write auth token |