The agent container runs as root but the bind-mounted instance directory
is owned by the host user (UID 1000 = `node` in the container). Modern
git refuses to operate on such repos without an explicit safe.directory
entry, breaking upgrade-check.sh's `git fetch/log` calls on source-installed
tenants. Verified empirically on soroush after the previous fix landed.
Bunker Admin
Three related fixes uncovered during a marcelle CCP registration test:
1. ccp-agent image was missing bash + curl + jq + python3, so every
spawn('bash', ...) in upgrade.routes.ts and backup.routes.ts failed
silently with ENOENT. CCP kept reading stale status.json files from
disk, masking that no agent had successfully checked for updates in
weeks. apk-add the missing tools.
2. ccp-agent's /app/instance mount was :ro, blocking the agent from
writing data/upgrade/status.json (and result/progress/backups).
Agent already has docker.sock — removing :ro is not a security
escalation. Patched both docker-compose.yml and docker-compose.prod.yml.
3. Gitea 1.23.x only initializes Release.CreatedUnix inside its
createTag() helper, which is skipped if the tag already exists on
origin. The old DEV_WORKFLOW pattern (push tag, then run
build-release.sh --upload) was triggering this — releases got
created_unix=0 and lost /releases/latest sort order to v2.9.14.
build-release.sh now removes the remote tag first and POSTs with
target_commitish so Gitea creates the tag and release atomically.
After these fixes, CCP's "Check for Updates" path returns truthful
data end-to-end (verified on marcelle: v2.9.15 -> v2.10.1, 1 behind).
Bunker Admin
Problem: the agent polled /poll every 30s while waiting for admin
approval. At 10 req/15min, the 11th poll hit 429 after ~5 min and
every subsequent one also failed — recovery required an agent
restart. A human-paced approval SLA is longer than 5 minutes.
CCP side (agents.routes.ts):
Split the one-size-fits-all agentRegistrationLimiter into two.
/register stays tight (10/15min — invite-code brute force is the
real attack surface). /poll gets a new agentPollLimiter at 180/15min
(one poll per ~5s upper bound), scoped to registrationId+slug so
blast radius is bounded.
Agent side (server.ts):
Replaced fixed 30s setInterval with a self-scheduling setTimeout
loop that backs off exponentially on HTTP 429 (30s → 60s → 120s →
300s cap) and resets to 30s on any 2xx. Stop-flag protects against
re-entry after approval. Fixes the "agent wedged at 429, restart to
recover" workaround.
Bunker Admin