14 Commits

Author SHA1 Message Date
a531f9b9ce fix(ccp): make agent functional + fix Gitea release timestamp bug
Three related fixes uncovered during a marcelle CCP registration test:

1. ccp-agent image was missing bash + curl + jq + python3, so every
   spawn('bash', ...) in upgrade.routes.ts and backup.routes.ts failed
   silently with ENOENT. CCP kept reading stale status.json files from
   disk, masking that no agent had successfully checked for updates in
   weeks. apk-add the missing tools.

2. ccp-agent's /app/instance mount was :ro, blocking the agent from
   writing data/upgrade/status.json (and result/progress/backups).
   Agent already has docker.sock — removing :ro is not a security
   escalation. Patched both docker-compose.yml and docker-compose.prod.yml.

3. Gitea 1.23.x only initializes Release.CreatedUnix inside its
   createTag() helper, which is skipped if the tag already exists on
   origin. The old DEV_WORKFLOW pattern (push tag, then run
   build-release.sh --upload) was triggering this — releases got
   created_unix=0 and lost /releases/latest sort order to v2.9.14.
   build-release.sh now removes the remote tag first and POSTs with
   target_commitish so Gitea creates the tag and release atomically.

After these fixes, CCP's "Check for Updates" path returns truthful
data end-to-end (verified on marcelle: v2.9.15 -> v2.10.1, 1 behind).

Bunker Admin
2026-05-20 11:59:35 -06:00
2ae7d8b968 Bug fixes for video serving and updats to documentation for mobile use screenshots 2026-04-30 14:17:50 -06:00
5968df5b42 release: scripts/*.sh whitelist parity check in build-release.sh
Replace the implicit runtime-scripts for-loop with two explicit arrays
(RUNTIME_SCRIPTS + DEV_ONLY_SCRIPTS) and a pre-build assertion that
aborts if any scripts/*.sh isn't classified.

Motivation: during the v2.9.8 sprint we added three new scripts
(pangolin-teardown.sh, ccp-deregister.sh, validate-env.sh) and had to
manually remember to include each in the whitelist. Missing one is
silent — the script just doesn't ship, and the release tarball is
subtly broken in a way you only notice when the docs reference the
missing script.

The parity check surfaces the decision: every new scripts/*.sh must
be deliberately classified as ship (RUNTIME_SCRIPTS) or don't-ship
(DEV_ONLY_SCRIPTS). Adding a script and forgetting aborts the build
with a clear message naming the unclassified file.

Side-effect fixes from this audit:
  - register-with-ccp.sh is now shipped (was only in source; needed
    for retrofitting CCP onto existing installs)
  - update-env.sh is now shipped (safe .env updater used by upgrade
    flows)

Bunker Admin
2026-04-16 15:17:54 -06:00
ce8c5aaf1f install: add scripts/ccp-deregister.sh + ship in tarball
Pairs with pangolin-teardown.sh. For instances that were phone-home
registered with a CCP, this script removes the CCP-side Instance row
during teardown so the slug is freed for re-registration.

Without it, tearing down a CCP-registered instance leaves a stale
Instance row in CCP's DB. The next phone-home-registration with the
same slug hits the unique-constraint violation we saw in marcelle
testing.

Matching: by agentUrl (default from .env CCP_AGENT_URL), --slug, or
--instance-id. Dry-run by default; --yes to execute. CCP admin token
via --token or CCP_ADMIN_TOKEN env var.

Added to build-release.sh whitelist so release tarballs include it
alongside pangolin-teardown.sh and validate-env.sh.

Bunker Admin
2026-04-16 13:07:12 -06:00
c2f12aa2bf release: refuse upload over existing tag unless --replace
scripts/build-release.sh --upload now checks for an existing release
at the given tag before POSTing a new one. If found and --replace is
not set, errors out with a clear message.

This prevents the silent-overwrite problem: a user on v2.9.7 running
./scripts/upgrade.sh sees "no update available" when the v2.9.7
release's tarball contents have silently changed. Version tags should
be immutable once published.

--replace is still available for deliberate test-bench iteration
(DELETEs the existing release, then POSTs). Documented as destructive
in the --help output and DEV_WORKFLOW.md.

Bunker Admin
2026-04-16 13:06:06 -06:00
f9d566bd84 install: preflight + teardown tooling + CCP tunnel cleanup on delete
Fixes surfaced by three rounds of fresh-install testing on marcelle:

- config.sh: add host-port preflight check (ss -tln) to catch
  cockpit-on-9090 style collisions before compose up; add
  --skip-port-check escape hatch; add --install-watcher /
  --no-install-watcher / --install-backup-timer /
  --no-install-backup-timer flags; -y --enable-all now installs both
  systemd units by default (previously silently skipped); print
  resolved admin email in Configuration Complete block.

- scripts/validate-env.sh: new section 5b "Host Port Availability"
  using ss-based detection, with process-name surfacing when run as
  root.

- scripts/pangolin-teardown.sh: new wrapper. Reads credentials from
  .env or takes --api-url/--api-key/--org-id flags. Dry-run by
  default; --yes to execute. Deletes resources before sites (avoids
  orphans). --keep-site-ids for safety.

- scripts/build-release.sh: include validate-env.sh and
  pangolin-teardown.sh in release tarball whitelist.

- CCP instances.service.ts: deleteInstance() now calls
  teardownTunnel() before composeDown when pangolinSiteId is set.
  Previously an admin clicking "Delete Instance" orphaned the
  Pangolin site + all its resources. Best-effort with try/catch
  matching the existing Docker-cleanup tolerance pattern.

- CLAUDE.md: sync drift — 44 → 50 migrations, 186 → 192 models,
  40 → 44 modules.

Bunker Admin
2026-04-16 12:50:48 -06:00
23df6a8b52 Fresh-install + upgrade-path hardening bundle
Six independent fixes surfaced during the v2.9.1 → v2.9.2 admin-UI
upgrade validation today. Together they make a clean install on a new
box work end-to-end without in-session patching.

- Fix 1: scripts/validate-compose-parity.sh + build-release.sh hook —
  fail release builds when api/admin/media-api/nginx healthcheck
  blocks drift between docker-compose.yml and docker-compose.prod.yml.
  Previous boot-race fix had to be applied to both files manually.

- Fix 2: scripts/systemd/install.sh chowns logs/ to the install user
  (the API container creates subdirs there as root, locking the
  host-side watcher out), pre-creates logs/upgrade-watcher.log, and
  changemaker-upgrade.service adds StartLimitIntervalSec=0 so a
  single transient failure can't wedge the .path unit permanently.

- Fix 3: /api/upgrade/status now returns a `watcher` sub-object that
  flags the host systemd watcher as stalled when trigger.json has
  been pending >30s. Admin SettingsPage SystemUpgradeTab renders a
  warning Alert with the systemctl recovery command when unhealthy.

- Fix 4: scripts/upgrade.sh write_result() — prefer head -1 VERSION
  over `git rev-parse HEAD` so release-mode upgrades report the new
  tag in result.json instead of "unknown".

- Fix 5: admin container healthcheck start_period 20s → 60s in both
  compose files, same class as the earlier api fix. Matches Gancio
  convention.

- Fix 7: /api/pangolin/sync now detects resources bound to a stale
  siteId (common after --pangolin-site new rotations), deletes and
  recreates them against the current site, and reports them under
  a new `reassigned` response field.

Bunker Admin
2026-04-15 11:57:50 -06:00
26ec925d9b CCP restore/tunnel/upgrade + upgrade.sh release-mode fixes + volunteer dashboard polish
- Add instance restore model, routes, and agent backup/restore endpoints
- Add Pangolin tunnel service (subdomain prefix, teardown action, CCP client)
- Add slug mutex for concurrent operation safety in agent
- Expand upgrade service with remote driver orchestration
- Fix upgrade.sh to properly handle release-mode installs (no git operations)
- Add CCP registration flags to config.sh (--ccp-url, --ccp-invite-code, --ccp-agent-url)
- Auto-detect JVB advertise IP in non-interactive mode
- Polish volunteer dashboard ActionStepsList with highlighted step component
- Add ticketed event description field + volunteer dashboard query refinements

Bunker Admin
2026-04-12 11:09:46 -06:00
36b709b911 Automate Gitea init, NocoDB auto sign-in, and fix prod compose
- Add scripts/gitea-init.sh: runs migrations + creates admin user on
  first boot, replacing the manual installation wizard
- Set GITEA__security__INSTALL_LOCK=true in both compose files
- Add NocoDB auth bridge (nginx) + /api/services/nocodb-auth proxy
  endpoint so the admin iframe auto-authenticates
- Update NocoDBPage.tsx to fetch token and use auth bridge flow
- Fix docker-compose.prod.yml missing Gitea env vars for API container
  (GITEA_URL, GITEA_API_TOKEN, GITEA_ADMIN_PASSWORD, etc.)
- Pass NC_ADMIN_EMAIL/PASSWORD to API for NocoDB auth proxy
- Increase Gitea auto-setup retries from 3 to 6 with admin auth check
- Update config.sh non-interactive mode to set GITEA_ADMIN_USER
- Include gitea-init.sh in release tarball (build-release.sh)

Bunker Admin
2026-04-09 12:49:33 -06:00
cbfa4f9e28 Add uninstall.sh and test-deployment.sh to release tarball
Bunker Admin
2026-04-07 14:14:45 -06:00
f378db89b5 Separate local vs remote Gitea API tokens to prevent credential collision
GITEA_API_TOKEN is for the local platform Gitea (docs comments, user
provisioning, SSO). New GITEA_REGISTRY_API_TOKEN is for the remote
registry at gitea.bnkops.com (release uploads via build-release.sh).

Previously both contexts shared one variable, causing auth failures
when the token for one instance was used against the other.

Bunker Admin
2026-03-31 11:53:20 -06:00
7287328148 Harden install pipeline: health checks, log rotation, backup timer
- install.sh: Add Docker daemon check, 10GB disk space pre-flight,
  error handling on pull/up, post-startup health polling with crash
  detection, cleanup trap on failure
- docker-compose: Fix nginx/listmonk depends_on to service_healthy,
  add x-logging anchor (10m/3 files) to all ~39 services
- config.sh: Preserve existing secrets on re-run (reconfigure mode),
  add automated daily backup timer (systemd, 02:00, 30-day retention)
- mirror-images.sh: Fix gotify source tag (2.9.0 not v2.9.0)
- build-release.sh: Ensure mkdocs/docs and mkdocs/overrides dirs exist
- .env.example: Add COMPOSE_PROFILES variable

Bunker Admin
2026-03-25 19:33:11 -06:00
44931260c4 Fix build-release.sh Gitea URL for host-side uploads
GITEA_URL points to the internal Docker hostname (gitea-changemaker:3000),
which is unreachable from the host. Derive external URL from GITEA_REGISTRY
instead, which already contains the external hostname.

Bunker Admin
2026-03-23 14:07:36 -06:00
8e6f0996de Add pre-built image installer and release tarball system
New install method: curl one-liner downloads a lightweight release
tarball (~9 MB) and runs the config wizard. No git clone needed,
no TypeScript compilation — pulls pre-built images from Gitea registry.

- docker-compose.prod.yml: production compose without build blocks or
  source code volume mounts; IMAGE_TAG defaults to latest
- scripts/install.sh: curl-friendly installer (downloads tarball,
  extracts, runs config.sh)
- scripts/build-release.sh: creates release tarball from dev repo
  with only runtime files (configs, scripts, docs, empty data dirs)
- config.sh: release-mode detection (VERSION file + no .git dir),
  auto-sets IMAGE_TAG=latest and NODE_ENV=production
- upgrade.sh: release-mode upgrade path (downloads new tarball from
  Gitea Releases API instead of git pull, always uses registry mode)
- upgrade-check.sh: release-mode version check via Gitea API
- .gitignore: exclude releases/ and api/dist/
- Docs: updated getting-started with pre-built install instructions

Bunker Admin
2026-03-22 20:34:49 -06:00