# Docker and Container Issues
This guide covers Docker-specific problems in Changemaker Lite V2.
## Overview
### Docker Troubleshooting Approach
1. **Check status** - Are containers running?
2. **Read logs** - What do container logs show?
3. **Inspect configuration** - Is docker-compose.yml correct?
4. **Test connectivity** - Can containers communicate?
5. **Resource check** - Enough CPU/memory/disk?
### Essential Docker Commands
```bash
# View running containers
docker compose ps
# View all containers (including stopped)
docker compose ps -a
# View logs
docker compose logs [service-name]
# Follow logs in real-time
docker compose logs -f [service-name]
# Execute command in container
docker compose exec [service-name] [command]
# Restart service
docker compose restart [service-name]
# Stop all services
docker compose down
# Start services
docker compose up -d
# Rebuild and start
docker compose up -d --build [service-name]
```
---
## Container Won't Start
### Port Already in Use
**Severity:** 🔴 Critical
#### Symptoms
```
Error response from daemon: driver failed programming external connectivity
on endpoint changemaker-lite-admin-1: Bind for 0.0.0.0:3000 failed:
port is already allocated
```
Or:
```
ERROR: for api Cannot start service api: Ports are not available:
exposing port TCP 0.0.0.0:4000 -> 0.0.0.0:0: listen tcp 0.0.0.0:4000:
bind: address already in use
```
#### Common Causes
1. **Another container using port** - Different Docker project
2. **Host process using port** - npm dev server running
3. **Previous container not stopped** - Old container still running
4. **Port conflict in docker-compose.yml** - Two services same port
#### Solutions
**Solution 1: Find what's using the port**
```bash
# Linux/Mac
sudo lsof -i :4000
# Or with netstat
netstat -tuln | grep :4000
# Windows
netstat -ano | findstr :4000
```
Output shows:
```
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
node 12345 user 23u IPv4 123456 0t0 TCP *:4000 (LISTEN)
```
**Solution 2: Stop conflicting process**
```bash
# Kill process by PID
kill 12345
# Or kill all node processes (careful!)
killall node
# Or stop other Docker containers
docker ps # List all running containers
docker stop container-name-or-id
```
**Solution 3: Change port in docker-compose.yml**
```yaml
# In docker-compose.yml
api:
  ports:
    - "4002:4000" # Changed from 4000:4000
```
Then:
```bash
# First update .env so clients use the new host port (a file edit, not a shell command):
#   VITE_API_URL=http://localhost:4002
# Then recreate the service with the new port mapping
docker compose up -d api
```
**Solution 4: Stop all and restart**
```bash
# Stop all Changemaker Lite containers
docker compose down
# Verify nothing running
docker compose ps
# Start fresh
docker compose up -d
```
#### Prevention
- **Use unique ports** - Avoid common ports (3000, 4000, 8000, 8080)
- **Stop properly** - Always use `docker compose down`
- **Check before start** - Run `docker compose ps` first
- **Document ports** - Keep port reference updated
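The "check before start" habit can be scripted. The following is a minimal sketch, assuming `ss` (or `netstat`) is available on the host; `port_is_listening` is a hypothetical helper name, and it reads the socket listing from stdin so the parsing can be tried without a live system.

```bash
# Hypothetical helper: succeeds if PORT appears as a local listening port
# in `ss -ltn` (or `netstat -tln`) output read from stdin.
port_is_listening() {
  awk -v port="$1" '
    # The local address is the 4th column in both ss and netstat output.
    # Compare the text after the final colon (covers 0.0.0.0:4000 and [::]:4000).
    { n = split($4, parts, ":"); if (n > 1 && parts[n] == port) found = 1 }
    END { exit (found ? 0 : 1) }
  '
}

# Usage on a Linux host before starting the stack:
#   for p in 3000 4000; do
#     ss -ltn | port_is_listening "$p" && echo "Port $p is already in use"
#   done
```

Matching on the exact port field avoids the false positives that `grep :4000` gives for ports such as 40000.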
---
### Volume Mount Errors
**Severity:** 🔴 Critical
#### Symptoms
```
Error response from daemon: invalid mount config for type "bind":
bind source path does not exist: /home/user/changemaker.lite/uploads
```
Or:
```
Error: EACCES: permission denied, open '/media/local/inbox/video.mp4'
```
#### Common Causes
1. **Path doesn't exist** - Directory not created
2. **Permission denied** - Container can't access directory
3. **Wrong path** - Typo in docker-compose.yml
4. **SELinux blocking** - Linux security policy
#### Solutions
**Solution 1: Create missing directories**
```bash
# Create all required directories
mkdir -p uploads
mkdir -p media/local/inbox
mkdir -p media/local/library
mkdir -p data
mkdir -p configs/prometheus
mkdir -p configs/grafana
# Verify they exist
ls -la
```
**Solution 2: Fix permissions**
```bash
# Make directories world-writable (development only; prefer the chown below)
chmod -R 777 uploads
chmod -R 777 media/local/inbox
# Or set ownership to container user
# Check container user ID
docker compose exec api id
# uid=1000(node) gid=1000(node)
# Set ownership
sudo chown -R 1000:1000 uploads
sudo chown -R 1000:1000 media
```
**Solution 3: Check volume configuration**
In `docker-compose.yml`:
```yaml
api:
  volumes:
    # Bind mounts of project directories:
    - ./uploads:/app/uploads:rw # Read-write
    - ./media:/media:ro # Read-only
    # Easy to confuse with the above:
    # - uploads:/app/uploads # Named volume managed by Docker, not ./uploads
    # - /uploads:/app/uploads # Absolute host path /uploads, not the project directory
```
**Solution 4: Work around SELinux (disable only as a last resort)**
```bash
# Check if SELinux is the issue
getenforce
# If "Enforcing":
# Option 1: Add the :z flag to the volume in docker-compose.yml:
#   - ./uploads:/app/uploads:z
# Option 2: Temporarily switch SELinux to permissive mode (not recommended)
sudo setenforce 0
```
**Solution 5: Verify mount inside container**
```bash
# Check if mount exists
docker compose exec api ls -la /app/uploads
# Check permissions
docker compose exec api ls -ld /app/uploads
# Try creating file
docker compose exec api touch /app/uploads/test.txt
```
#### Prevention
- **Create directories first** - Before `docker compose up`
- **Set permissions early** - In setup script
- **Use relative paths** - Start with `./` in docker-compose.yml
- **Document requirements** - List all required directories
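The first two prevention points can be folded into a setup script. The following is a minimal sketch using the directory names from Solution 1; `ensure_dirs` is a hypothetical helper, and its optional base-directory argument is an assumption added so the function can be tried outside the project root.

```bash
# Hypothetical helper: create every bind-mount source directory up front.
# Directory names come from Solution 1; adjust them to match docker-compose.yml.
ensure_dirs() {
  base="${1:-.}"
  for dir in uploads media/local/inbox media/local/library data \
             configs/prometheus configs/grafana; do
    # mkdir -p is idempotent, so the script is safe to re-run
    mkdir -p "$base/$dir" || return 1
  done
}

# Usage from the project root, before `docker compose up -d`:
#   ensure_dirs
```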
---
### Missing Environment Variables
**Severity:** 🔴 Critical
#### Symptoms
Container logs show:
```
Error: DATABASE_URL is required
```
Or:
```
ZodError: [
  {
    "code": "invalid_type",
    "expected": "string",
    "received": "undefined",
    "path": ["SMTP_HOST"],
    "message": "Required"
  }
]
```
Or container exits immediately:
```
changemaker-lite-api-1 exited with code 1
```
#### Common Causes
1. **.env not found** - Missing .env file
2. **Variable not set** - Missing required variable
3. **Wrong .env location** - .env not in project root
4. **Syntax error** - Malformed .env file
#### Solutions
**Solution 1: Check .env exists**
```bash
# Verify .env file
ls -la .env
# If missing, copy from example
cp .env.example .env
```
**Solution 2: Find missing variables**
```bash
# View container logs to see which variable
docker compose logs api | grep -i "required\|undefined"
# Example output:
# Error: SMTP_HOST is required
```
**Solution 3: Add missing variables**
```bash
# Edit .env
nano .env
# Add missing variable
SMTP_HOST=smtp.gmail.com
# Save, then recreate the container so it picks up the new value
# (restart alone does not reload changes from .env)
docker compose up -d api
```
**Solution 4: Validate .env format**
```bash
# Check for common issues:
# - No spaces around =
# - Quotes for values with spaces
# - No trailing commas
# - No comments on same line as value
# Good:
DATABASE_URL="postgresql://user:pass@host:5432/db"
CORS_ORIGINS=http://localhost:3000,http://localhost:4000
# Bad:
DATABASE_URL = "postgresql://..." # Space around =
CORS_ORIGINS=http://localhost:3000, http://localhost:4000 # Space after comma
SMTP_HOST=smtp.gmail.com # Gmail # Comment on same line
```
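The spacing rule above can be checked mechanically. The following is a minimal sketch; `lint_env` is a hypothetical helper, and it flags only whitespace around `=`, not the other issues in the list.

```bash
# Hypothetical helper: print "LINE_NUMBER: text" for assignments that have
# whitespace around "=", and exit non-zero if any are found.
lint_env() {
  awk '
    /^[ \t]*#/ || /^[ \t]*$/ { next }   # skip comments and blank lines
    /^[A-Za-z_][A-Za-z0-9_]*[ \t]+=/ || /^[A-Za-z_][A-Za-z0-9_]*=[ \t]/ {
      printf "%d: %s\n", NR, $0
      bad = 1
    }
    END { exit (bad ? 1 : 0) }
  ' "${1:-.env}"
}

# Usage:
#   lint_env .env || echo "Fix the lines listed above"
```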
**Solution 5: Check which variables are loaded**
```bash
# View environment inside container
docker compose exec api env | grep -E "DATABASE_URL|SMTP_HOST|JWT_"
# Should show actual values (not undefined)
```
#### Prevention
- **Use .env.example** - Keep template updated
- **Validation on startup** - Zod validates env in `config/env.ts`
- **Documentation** - Document all required variables
- **Setup script** - Validate .env before starting
---
### Health Check Failures
**Severity:** 🟠 High
#### Symptoms
```bash
docker compose ps
```
Shows:
```
NAME STATUS
api Up 30 seconds (unhealthy)
v2-postgres Up 1 minute (healthy)
```
Or logs show:
```
Health check failed
```
#### Common Causes
1. **Service not ready** - Still starting up
2. **Health check endpoint failing** - /health returns error
3. **Timeout too short** - Service needs more time
4. **Dependencies not ready** - Database not connected
#### Solutions
**Solution 1: Check health check configuration**
In `docker-compose.yml`:
```yaml
api:
  healthcheck:
    test: ["CMD", "wget", "--quiet", "--tries=1", "--spider", "http://localhost:4000/api/health"]
    interval: 30s
    timeout: 10s
    retries: 3
    start_period: 40s
```
**Solution 2: Test health endpoint manually**
```bash
# From inside container
docker compose exec api wget -O- http://localhost:4000/api/health
# Should return:
# {"status":"healthy","timestamp":"2026-02-13T..."}
# From host
curl http://localhost:4000/api/health
```
**Solution 3: View health check logs**
```bash
# Detailed health check output
docker inspect changemaker-lite-api-1 --format='{{json .State.Health}}' | jq
# Shows:
# {
#   "Status": "unhealthy",
#   "FailingStreak": 3,
#   "Log": [
#     {
#       "Start": "2026-02-13T...",
#       "End": "2026-02-13T...",
#       "ExitCode": 1,
#       "Output": "Error: Connection refused"
#     }
#   ]
# }
```
**Solution 4: Increase timeout/interval**
```yaml
api:
  healthcheck:
    interval: 60s # Check less frequently
    timeout: 30s # Allow more time
    start_period: 90s # Grace period: failures during startup don't count toward retries
```
**Solution 5: Check service logs**
```bash
# Real issue is usually in service logs
docker compose logs api | tail -50
# Common issues:
# - Database connection failed
# - Missing environment variable
# - Port already in use
```
#### Prevention
- **Reasonable timeouts** - Allow enough time for startup
- **Accurate health checks** - Check actual readiness
- **Monitor health** - Alert on unhealthy containers
- **Dependencies** - Use `depends_on` with `condition: service_healthy`
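Scripts that run outside Compose (deploy hooks, smoke tests) can wait on the same health status. The following is a minimal sketch, assuming the container defines a health check; `container_health` and `wait_for_healthy` are hypothetical helper names.

```bash
# Hypothetical helper: print the health status Docker reports for a container
# (starting, healthy, or unhealthy when a healthcheck is defined).
container_health() {
  docker inspect --format '{{.State.Health.Status}}' "$1" 2>/dev/null
}

# Hypothetical helper: poll until the container is healthy or tries run out.
wait_for_healthy() {
  name="$1"; tries="${2:-30}"; delay="${3:-2}"
  while [ "$tries" -gt 0 ]; do
    [ "$(container_health "$name")" = "healthy" ] && return 0
    tries=$((tries - 1))
    sleep "$delay"
  done
  echo "$name did not become healthy in time" >&2
  return 1
}

# Usage:
#   wait_for_healthy changemaker-lite-api-1 30 2 && echo "API is ready"
```

Recent Compose v2 releases also provide `docker compose up -d --wait`, which blocks until services are running or healthy and may make a helper like this unnecessary.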
---
## Container Crashes
### Out of Memory
**Severity:** 🔴 Critical
#### Symptoms
Container logs show:
```
<--- Last few GCs --->
[1:0x5588e4f8e000] 65432 ms: Mark-sweep 2048.0 (2048.4) -> 2047.9 (2048.4) MB, 1845.2 / 0.0 ms (average mu = 0.123, current mu = 0.001) allocation failure scavenge might not succeed
<--- JS stacktrace --->
FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory
```
Or:
```
Killed
```
Or `docker compose ps` shows:
```
api Exit 137
```
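Exit codes above 128 encode the signal that ended the process: the code is 128 plus the signal number, so 137 is 128 + 9 (SIGKILL), which is the signal the kernel's OOM killer sends. The following is a small sketch of that arithmetic; `explain_exit_code` is a hypothetical helper.

```bash
# Hypothetical helper: decode a container exit code.
explain_exit_code() {
  code="$1"
  if [ "$code" -gt 128 ]; then
    sig=$((code - 128))   # 128 + N means "ended by signal N"
    case "$sig" in
      9)  echo "killed by signal 9 (SIGKILL), often the OOM killer" ;;
      15) echo "terminated by signal 15 (SIGTERM)" ;;
      *)  echo "terminated by signal $sig" ;;
    esac
  else
    echo "exited with status $code"
  fi
}

# Usage (OOMKilled is set by Docker when the memory limit was hit):
#   explain_exit_code 137
#   docker inspect --format '{{.State.OOMKilled}}' changemaker-lite-api-1
```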
#### Common Causes
1. **Memory leak** - Application leaking memory
2. **Large dataset** - Processing too much data
3. **Too many connections** - Database connection pool too large
4. **Container limit** - Memory limit too low
#### Solutions
**Solution 1: Check memory usage**
```bash
# View container memory usage
docker stats
# Shows:
# CONTAINER CPU % MEM USAGE / LIMIT MEM %
# api 15.5% 1.2GiB / 2GiB 60%
```
**Solution 2: Increase Node.js heap size**
In `docker-compose.yml`:
```yaml
api:
  environment:
    - NODE_OPTIONS=--max-old-space-size=4096 # 4 GB heap; keep it below the container memory limit
```
Or in `api/package.json`:
```json
{
  "scripts": {
    "start": "node --max-old-space-size=4096 dist/server.js"
  }
}
```
**Solution 3: Increase container memory limit**
```yaml
api:
  deploy:
    resources:
      limits:
        memory: 4G # Increase from 2G
      reservations:
        memory: 2G
```
**Solution 4: Find memory leak**
```bash
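# Start a Node process with the inspector enabled, then take heap snapshots
# from Chrome DevTools. This launches a second server process inside the
# container; it exits with EADDRINUSE if the main process already holds the port.
docker compose exec api node --inspect dist/server.js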
# Or use clinic.js
npm install -g clinic
clinic doctor -- node dist/server.js
```
**Solution 5: Reduce memory usage**
```typescript
// Reduce the database connection pool.
// prisma/schema.prisma reads the URL from the environment:
//   datasource db {
//     provider = "postgresql"
//     url      = env("DATABASE_URL")
//   }
// so set the limit on the URL itself, in .env:
//   DATABASE_URL="postgresql://...?connection_limit=5"

// Process data in batches
const users = await prisma.user.findMany({
  take: 100, // Limit batch size
  skip: offset
});
```
#### Prevention
- **Monitor memory** - Alert on high usage
- **Generous limits** - Set limits higher than expected usage
- **Memory profiling** - Regular memory audits
- **Optimize queries** - Reduce data fetched
---
### Application Errors
**Severity:** 🔴 Critical
#### Symptoms
Container exits immediately:
```
api-1 exited with code 1
```
Logs show:
```
Error: Cannot find module 'express'
```
Or:
```
SyntaxError: Unexpected token 'export'
```
#### Common Causes
1. **Missing dependencies** - npm install not run
2. **Build not run** - TypeScript not compiled
3. **Syntax error** - Code has errors
4. **Wrong Node version** - Incompatible Node.js version
#### Solutions
**Solution 1: Rebuild container**
```bash
# Rebuild with no cache
docker compose build --no-cache api
# Start
docker compose up -d api
# View logs
docker compose logs -f api
```
**Solution 2: Check dependencies**
```bash
# Verify package.json and package-lock.json exist
docker compose exec api ls -la package*.json
# Verify node_modules exists
docker compose exec api ls -la node_modules | head
# If missing, install
docker compose exec api npm install
```
**Solution 3: Verify build**
```bash
# Check if TypeScript compiled
docker compose exec api ls -la dist/
# If missing, build
docker compose exec api npm run build
# Or rebuild container
docker compose up -d --build api
```
**Solution 4: Check Node version**
```bash
# Check version in container
docker compose exec api node --version
# Should match the base image in the Dockerfile
grep "FROM node:" api/Dockerfile
# Example:
# FROM node:20-alpine
```
**Solution 5: Test locally**
```bash
# Test build locally
cd api
npm install
npm run build
npm start
# If works locally but not in Docker, check:
# - Dockerfile COPY commands
# - .dockerignore file
# - Volume mounts
```
#### Prevention
- **Multi-stage builds** - Separate build and runtime
- **Lock files** - Commit package-lock.json
- **CI/CD** - Automated build testing
- **Version pinning** - Pin Node.js version
---
### Database Connection Failures
**Severity:** 🔴 Critical
#### Symptoms
API logs show:
```
Error: Can't reach database server at `v2-postgres:5432`
Error: connect ECONNREFUSED 172.18.0.2:5432
```
Container restarts repeatedly.
#### Common Causes
1. **Database not ready** - API started before database
2. **Wrong host** - Incorrect database hostname
3. **Network issue** - Containers on different networks
4. **Database crashed** - PostgreSQL container down
#### Solutions
**Solution 1: Check database status**
```bash
# Is database running?
docker compose ps v2-postgres
# Should show "Up" status
# If not:
docker compose up -d v2-postgres
# Check logs
docker compose logs v2-postgres | tail -50
```
**Solution 2: Verify DATABASE_URL**
```bash
# Check .env
cat .env | grep DATABASE_URL
# From API container, should use container name:
DATABASE_URL="postgresql://changemaker:password@v2-postgres:5432/changemaker_v2"
# From host, use localhost:
DATABASE_URL="postgresql://changemaker:password@localhost:5433/changemaker_v2"
```
**Solution 3: Test database connection**
```bash
# From API container (requires the psql client to be installed in the API image)
docker compose exec api sh -c 'psql "$DATABASE_URL" -c "SELECT NOW();"'
# Should return current timestamp
# If fails, database connection is broken
```
**Solution 4: Check Docker network**
```bash
# List networks
docker network ls
# Inspect changemaker-lite network
docker network inspect changemaker-lite
# All containers should be on same network
```
**Solution 5: Use depends_on with health check**
In `docker-compose.yml`:
```yaml
api:
  depends_on:
    v2-postgres:
      condition: service_healthy
# ...
v2-postgres:
  healthcheck:
    test: ["CMD-SHELL", "pg_isready -U changemaker"]
    interval: 10s
    timeout: 5s
    retries: 5
```
#### Prevention
- **Health checks** - Wait for database to be ready
- **Retry logic** - Retry connection on startup
- **Connection pooling** - Handle connection failures gracefully
- **Monitoring** - Alert on connection failures
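The retry-logic point applies to shell tooling as well as to the application. The following is a minimal sketch of retry with exponential backoff; `retry` is a hypothetical helper, and the doubling delay is an assumption rather than project policy.

```bash
# Hypothetical helper: run a command up to MAX times, doubling the delay
# between attempts. Usage: retry MAX INITIAL_DELAY_SECONDS command [args...]
retry() {
  max="$1"; delay="$2"; shift 2
  attempt=1
  while :; do
    "$@" && return 0
    [ "$attempt" -ge "$max" ] && return 1
    echo "Attempt $attempt failed; retrying in ${delay}s..." >&2
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}

# Usage: wait for PostgreSQL before running a one-off task
#   retry 5 1 docker compose exec -T v2-postgres pg_isready -U changemaker
```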
---
## Networking Issues
### Containers Can't Communicate
**Severity:** 🔴 Critical
#### Symptoms
```
Error: getaddrinfo ENOTFOUND v2-postgres
```
Or:
```
Error: connect EHOSTUNREACH 172.18.0.2:5432
```
Containers can't ping each other.
#### Common Causes
1. **Different networks** - Containers on separate Docker networks
2. **Wrong hostname** - Using IP instead of container name
3. **Firewall** - Host firewall blocking
4. **DNS issue** - Docker DNS not working
#### Solutions
**Solution 1: Verify same network**
```bash
# Check container networks
docker inspect changemaker-lite-api-1 | grep NetworkMode
docker inspect changemaker-lite-v2-postgres-1 | grep NetworkMode
# Should both show "changemaker-lite"
```
**Solution 2: Use container names**
```yaml
# Correct - use service names
api:
  environment:
    - DATABASE_URL=postgresql://user:pass@v2-postgres:5432/db
# Wrong - using IPs (container IPs can change when containers are recreated)
# api:
#   environment:
#     - DATABASE_URL=postgresql://user:pass@172.18.0.2:5432/db
```
**Solution 3: Test connectivity**
```bash
# Ping from one container to another
docker compose exec api ping v2-postgres
# DNS lookup
docker compose exec api nslookup v2-postgres
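# Check that the port is reachable (nc is a common alternative where telnet is missing)
docker compose exec api telnet v2-postgres 5432
docker compose exec api nc -zv v2-postgres 5432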
```
**Solution 4: Recreate network**
```bash
# Stop all containers
docker compose down
# Remove network
docker network rm changemaker-lite
# Start fresh (network auto-created)
docker compose up -d
```
**Solution 5: Check firewall**
```bash
# Temporarily disable firewall (Linux)
sudo ufw disable
# Test if containers can communicate
# If yes, firewall is blocking
# Re-enable and add rules
sudo ufw enable
sudo ufw allow from 172.18.0.0/16 to any
```
#### Prevention
- **Use service names** - Never hardcode IPs
- **Single network** - All services on same network
- **Docker DNS** - Rely on Docker's built-in DNS
- **Health checks** - Verify connectivity on startup
---
### Port Not Accessible from Host
**Severity:** 🟠 High
#### Symptoms
From host:
```bash
curl http://localhost:4000/api/health
# curl: (7) Failed to connect to localhost port 4000: Connection refused
```
But from inside container:
```bash
docker compose exec api curl http://localhost:4000/api/health
# {"status":"healthy"}
```
#### Common Causes
1. **Port not published** - Missing `ports:` in docker-compose.yml
2. **Bound to 127.0.0.1** - Only listening on localhost inside container
3. **Firewall blocking** - Host firewall blocking port
4. **Wrong port** - Trying different port than published
#### Solutions
**Solution 1: Check port publishing**
In `docker-compose.yml`:
```yaml
api:
  ports:
    - "4000:4000" # host:container
```
Verify:
```bash
docker compose ps api
# Should show:
# PORTS: 0.0.0.0:4000->4000/tcp
```
**Solution 2: Bind to 0.0.0.0**
In `api/src/server.ts`:
```typescript
// Wrong - only localhost
app.listen(4000, '127.0.0.1');
// Right - all interfaces
app.listen(4000, '0.0.0.0');
// Or omit the host
app.listen(4000); // Defaults to all interfaces (:: or 0.0.0.0)
```
**Solution 3: Check firewall**
```bash
# Check if port allowed (Linux)
sudo ufw status
# Allow port
sudo ufw allow 4000/tcp
# Or disable temporarily for testing
sudo ufw disable
```
**Solution 4: Verify correct port**
```bash
# Check what ports are actually listening
docker compose exec api netstat -tuln
# Should show:
# tcp6 0 0 :::4000 :::* LISTEN
```
**Solution 5: Restart with port forwarding**
```bash
# Stop container
docker compose stop api
# Remove container
docker compose rm -f api
# Start fresh
docker compose up -d api
# Verify port
curl http://localhost:4000/api/health
```
#### Prevention
- **Always publish ports** - In docker-compose.yml
- **Bind to 0.0.0.0** - Not 127.0.0.1
- **Test from host** - Verify accessibility
- **Document ports** - Keep port reference updated
---
### DNS Resolution Failures
**Severity:** 🟠 High
#### Symptoms
```
Error: getaddrinfo ENOTFOUND smtp.gmail.com
```
Or:
```
Error: getaddrinfo EAI_AGAIN api.represent.org
```
Container can't resolve external hostnames.
#### Common Causes
1. **Docker DNS issue** - Docker DNS not working
2. **No internet** - Container has no internet access
3. **Firewall blocking DNS** - Port 53 blocked
4. **Wrong DNS servers** - Using invalid DNS servers
#### Solutions
**Solution 1: Test DNS resolution**
```bash
# From inside container
docker compose exec api nslookup google.com
# Should return IP address
# If not, DNS is broken
```
**Solution 2: Check Docker DNS**
```bash
# View container DNS config
docker compose exec api cat /etc/resolv.conf
# Should show:
# nameserver 127.0.0.11 # Docker's embedded DNS
```
**Solution 3: Use custom DNS servers**
In `docker-compose.yml`:
```yaml
api:
  dns:
    - 8.8.8.8 # Google DNS
    - 8.8.4.4
```
Or in `/etc/docker/daemon.json`:
```json
{
  "dns": ["8.8.8.8", "8.8.4.4"]
}
```
Then restart Docker:
```bash
sudo systemctl restart docker
```
**Solution 4: Check internet connectivity**
```bash
# Ping external host
docker compose exec api ping -c 3 8.8.8.8
# If fails, no internet access
# Check host internet connection
ping -c 3 8.8.8.8
```
**Solution 5: Restart Docker daemon**
```bash
# Sometimes Docker DNS gets stuck
sudo systemctl restart docker
# Then restart containers
docker compose down
docker compose up -d
```
#### Prevention
- **Reliable DNS** - Use public DNS servers as backup
- **Monitor connectivity** - Alert on DNS failures
- **Health checks** - Include external connectivity checks
- **Retry logic** - Handle transient DNS failures
---
## Volume Issues
### Permission Denied
**Severity:** 🔴 Critical
#### Symptoms
```
Error: EACCES: permission denied, open '/app/uploads/image.jpg'
```
Or:
```
Error: EACCES: permission denied, mkdir '/media/local/inbox'
```
File operations fail inside container.
#### Common Causes
1. **Wrong ownership** - Host directory owned by different user
2. **Wrong permissions** - Directory not writable
3. **SELinux** - Linux security policy blocking
4. **Read-only mount** - Volume mounted as read-only
#### Solutions
**Solution 1: Check ownership**
```bash
# On host
ls -la uploads/
# Shows:
# drwxr-xr-x 2 root root 4096 Feb 13 10:00 uploads
# Check container user
docker compose exec api id
# uid=1000(node) gid=1000(node)
# Fix ownership
sudo chown -R 1000:1000 uploads/
```
**Solution 2: Fix permissions**
```bash
# Owner-writable (works once ownership matches the container user)
chmod -R 755 uploads/
# Or world-writable (dev only)
chmod -R 777 uploads/
```
**Solution 3: Check mount mode**
In `docker-compose.yml`:
```yaml
api:
  volumes:
    - ./uploads:/app/uploads:rw # Read-write
    # Not:
    # - ./uploads:/app/uploads:ro # Read-only
```
**Solution 4: SELinux labels**
```bash
# Add the :z flag to the volume in docker-compose.yml:
#   - ./uploads:/app/uploads:z
# Or relabel directory
sudo chcon -Rt svirt_sandbox_file_t uploads/
```
**Solution 5: Run as root (not recommended)**
```yaml
# In docker-compose.yml (last resort)
api:
  user: "0:0" # Run as root
```
#### Prevention
- **Set permissions early** - In setup script
- **Match UIDs** - Container user matches host user
- **SELinux-aware** - Use :z flag on volumes
- **Document requirements** - List permission requirements
---
### Volume Not Mounted
**Severity:** 🟠 High
#### Symptoms
Container can't see files that exist on host.
```bash
# On host
ls uploads/
# image.jpg video.mp4
# In container
docker compose exec api ls /app/uploads/
# (empty)
```
#### Common Causes
1. **Wrong path** - Volume path incorrect
2. **Typo** - Syntax error in docker-compose.yml
3. **Not mounted** - Volume mount missing
4. **Cached old config** - Using old container
#### Solutions
**Solution 1: Verify volume configuration**
In `docker-compose.yml`:
```yaml
api:
  volumes:
    - ./uploads:/app/uploads # host:container
```
**Solution 2: Check mounts in running container**
```bash
# Inspect container mounts
docker inspect changemaker-lite-api-1 | grep -A 10 Mounts
# Should show:
# "Mounts": [
#   {
#     "Type": "bind",
#     "Source": "/home/user/changemaker.lite/uploads",
#     "Destination": "/app/uploads",
#     "Mode": "",
#     "RW": true,
#     "Propagation": "rprivate"
#   }
# ]
```
**Solution 3: Recreate container**
```bash
# Stop and remove the service's container
docker compose rm -sf api
# Start fresh
docker compose up -d api
# Verify mount
docker compose exec api ls /app/uploads/
```
**Solution 4: Use absolute path**
```yaml
# Relative paths resolve against the directory containing docker-compose.yml;
# an absolute path rules out path-resolution problems
api:
  volumes:
    - /home/user/changemaker.lite/uploads:/app/uploads
```
**Solution 5: Check Docker Compose version**
```bash
# Check version
docker compose version
# Should be v2+
# If v1, syntax might differ
```
#### Prevention
- **Test mounts** - Verify after container start
- **Use relative paths** - Start with `./`
- **Documentation** - Document all volume mounts
- **Health checks** - Verify critical files exist
---
### Data Persistence Problems
**Severity:** 🔴 Critical
#### Symptoms
Data disappears after `docker compose down`:
- Database data lost
- Uploaded files missing
- Configuration reset
#### Common Causes
1. **Using containers, not volumes** - Data stored in container filesystem
2. **Anonymous volumes** - Volume not named or bound
3. **Deleting volumes** - `docker compose down -v` removes volumes
4. **Wrong volume type** - tmpfs instead of volume
#### Solutions
**Solution 1: Use named volumes**
In `docker-compose.yml`:
```yaml
v2-postgres:
  volumes:
    - postgres-data:/var/lib/postgresql/data # Named volume
# Top-level key, alongside services:
volumes:
  postgres-data: # Declare named volume
```
**Solution 2: Use bind mounts**
```yaml
v2-postgres:
  volumes:
    - ./data/postgres:/var/lib/postgresql/data # Bind to host directory
```
**Solution 3: Don't use -v flag**
```bash
# Wrong - deletes volumes
docker compose down -v
# Right - keeps volumes
docker compose down
```
**Solution 4: Check volume exists**
```bash
# List volumes
docker volume ls
# Should show:
# changemaker-lite_postgres-data
# Inspect volume
docker volume inspect changemaker-lite_postgres-data
```
**Solution 5: Backup before down**
```bash
# Backup database before stopping (-T disables the pseudo-TTY so the dump is written cleanly)
docker compose exec -T v2-postgres pg_dump -U changemaker changemaker_v2 > backup.sql
# Then safe to:
docker compose down -v
# Restore after up:
docker compose up -d v2-postgres
docker compose exec -T v2-postgres psql -U changemaker changemaker_v2 < backup.sql
```
#### Prevention
- **Named volumes** - For all persistent data
- **Regular backups** - Automated backup script
- **Never use -v** - Unless intentionally resetting
- **Documentation** - Document what data persists where
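One way to make the "never use -v" rule harder to break by accident is a thin wrapper around `docker compose down`. The following is a minimal sketch; `compose_down` and the `CONFIRM_WIPE` variable are hypothetical names, not part of Docker or this project.

```bash
# Hypothetical wrapper: refuse to delete volumes unless CONFIRM_WIPE=yes is set.
compose_down() {
  for arg in "$@"; do
    case "$arg" in
      -v|--volumes)
        if [ "${CONFIRM_WIPE:-no}" != "yes" ]; then
          echo "Refusing to delete volumes; re-run with CONFIRM_WIPE=yes" >&2
          return 1
        fi
        ;;
    esac
  done
  docker compose down "$@"
}

# Usage:
#   compose_down                      # safe: volumes are kept
#   CONFIRM_WIPE=yes compose_down -v  # intentional reset
```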
---
## Performance Issues
### Slow Container Startup
**Severity:** 🟡 Medium
#### Symptoms
Container takes minutes to start:
```bash
docker compose up -d api
# Creating api ... (2 minutes)
# Creating api ... done
```
#### Common Causes
1. **Large image** - Downloading/extracting large image
2. **Many dependencies** - npm install taking long
3. **Health check delay** - Waiting for health checks
4. **Slow disk** - I/O bottleneck
#### Solutions
**Solution 1: Use pre-built image**
```yaml
# Instead of building locally:
# api:
#   build: ./api
# Use a pre-built image from a registry
api:
  image: ghcr.io/yourorg/changemaker-api:latest
```
**Solution 2: Layer caching**
```dockerfile
# In Dockerfile, copy package files first
COPY package*.json ./
RUN npm ci
# Then copy code (changes more frequently)
COPY . .
RUN npm run build
```
**Solution 3: Multi-stage builds**
```dockerfile
# Build stage
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
# Runtime stage (smaller)
FROM node:20-alpine
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY package*.json ./
CMD ["node", "dist/server.js"]
```
**Solution 4: Increase Docker resources**
In Docker Desktop settings:
- CPU: 4+ cores
- Memory: 8GB+
- Disk: Fast SSD
**Solution 5: Parallel builds**
```bash
# Build all services in parallel (already the default in Compose v2; the flag matters on v1)
docker compose build --parallel
```
#### Prevention
- **Optimize Dockerfile** - Layer caching, multi-stage
- **Small base images** - Alpine instead of full images
- **Registry caching** - Pull from registry instead of building
- **Resource allocation** - Adequate CPU/memory for Docker
---
### High CPU Usage
**Severity:** 🟠 High
#### Symptoms
```bash
docker stats
# CONTAINER CPU %
# api 95%
```
Container consuming excessive CPU.
#### Common Causes
1. **Infinite loop** - Bug causing tight loop
2. **Heavy computation** - Processing large dataset
3. **Too many workers** - Worker threads maxed out
4. **Memory thrashing** - Swapping due to low memory
#### Solutions
**Solution 1: Identify process**
```bash
# Top inside container
docker compose exec api top
# Shows process using CPU
```
**Solution 2: Check for loops**
```bash
# View logs for repeated messages
docker compose logs api | tail -100
# Restart if stuck
docker compose restart api
```
**Solution 3: Limit worker threads**
```javascript
// In BullMQ worker
new Worker('queueName', processor, {
  concurrency: 2, // Reduce from 10
  limiter: {
    max: 10,
    duration: 1000 // Max 10 jobs per second
  }
});
```
**Solution 4: Set CPU limits**
```yaml
api:
  deploy:
    resources:
      limits:
        cpus: '2.0' # Max 2 CPUs
```
**Solution 5: Profile application**
```bash
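# Use Node.js profiler (starts a separate process inside the container and
# writes an isolate-*.log file; summarize it with `node --prof-process`)
docker compose exec api node --prof dist/server.js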
# Or clinic.js
npm install -g clinic
clinic doctor -- node dist/server.js
```
#### Prevention
- **Monitor CPU** - Alert on high usage
- **Rate limiting** - Limit request rate
- **Queue management** - Control worker concurrency
- **Performance testing** - Load test regularly
---
### High Memory Usage
**Severity:** 🟠 High
#### Symptoms
```bash
docker stats
# CONTAINER MEM USAGE / LIMIT
# api 3.8GiB / 4GiB
```
Memory usage keeps increasing.
#### Common Causes
1. **Memory leak** - Not releasing memory
2. **Large cache** - Caching too much data
3. **Database connections** - Too many open connections
4. **Large response bodies** - Sending huge payloads
#### Solutions
**Solution 1: Identify memory usage**
```bash
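# /proc/meminfo inside a container reports host-wide totals, not the container's share
docker compose exec api sh -c 'cat /proc/meminfo'
# Container's own usage in bytes (cgroup v2 path; differs on cgroup v1 hosts)
docker compose exec api cat /sys/fs/cgroup/memory.current
# Node.js heap stats (of this short-lived process, not the running server)
docker compose exec api node -e "console.log(process.memoryUsage())"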
```
**Solution 2: Restart to free memory**
```bash
# Temporary fix
docker compose restart api
# Memory should drop
docker stats api
```
**Solution 3: Reduce cache size**
```typescript
// In Redis cache
redis.set(key, value, 'EX', 3600); // Expire after 1 hour
// Limit cache size
const cache = new LRU({
  max: 1000, // Max 1000 entries
  maxAge: 3600000 // 1 hour (lru-cache v6 and earlier; v7+ names this option ttl)
});
```
**Solution 4: Set memory limit**
```yaml
api:
  deploy:
    resources:
      limits:
        memory: 2G # Hard limit
      reservations:
        memory: 1G # Reserved amount
```
**Solution 5: Find memory leak**
```bash
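# Start a Node process with the inspector enabled (a second process inside the
# container; the inspector listens on 127.0.0.1:9229 in the container by default)
docker compose exec api node --expose-gc --inspect dist/server.js
# Use Chrome DevTools to take and compare heap snapshots
# chrome://inspect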
```
#### Prevention
- **Monitor memory** - Alert on high usage
- **Memory limits** - Prevent runaway processes
- **Regular restarts** - Restart daily if leaking
- **Memory profiling** - Profile in staging
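A basic "alert on high usage" check can be built from `docker stats` output. The following is a minimal sketch, assuming the `{{.Name}} {{.MemPerc}}` format shown in the usage line; `mem_over_threshold` is a hypothetical helper, and the default threshold of 80 is an arbitrary assumption.

```bash
# Hypothetical helper: read "NAME PERCENT%" lines from stdin and print the
# containers at or above the threshold (default 80).
mem_over_threshold() {
  awk -v limit="${1:-80}" '
    { pct = $2; sub(/%$/, "", pct) }        # strip the trailing percent sign
    pct + 0 >= limit + 0 { print $1 " " $2 }
  '
}

# Usage:
#   docker stats --no-stream --format '{{.Name}} {{.MemPerc}}' | mem_over_threshold 80
```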
---
## Useful Commands
### Viewing Logs
```bash
# Last 100 lines
docker compose logs api --tail=100
# Follow logs (real-time)
docker compose logs -f api
# All services
docker compose logs
# Since timestamp
docker compose logs --since="2026-02-13T10:00:00"
# Filter by keyword
docker compose logs api | grep -i error
# Save to file
docker compose logs api > api-logs.txt
```
### Executing Commands
```bash
# Run command in running container
docker compose exec api npm run migrate
# Interactive shell
docker compose exec api sh
# Run as different user
docker compose exec -u root api sh
# Run in new container (one-off)
docker compose run --rm api npm test
```
### Inspecting Containers
```bash
# View container details
docker inspect changemaker-lite-api-1
# View specific field
docker inspect changemaker-lite-api-1 --format='{{.State.Status}}'
# View environment variables
docker inspect changemaker-lite-api-1 --format='{{range .Config.Env}}{{println .}}{{end}}'
# View mounts
docker inspect changemaker-lite-api-1 --format='{{json .Mounts}}' | jq
```
### Container Management
```bash
# Start all services
docker compose up -d
# Start specific service
docker compose up -d api
# Stop all services
docker compose stop
# Stop specific service
docker compose stop api
# Restart service
docker compose restart api
# Remove stopped containers
docker compose rm
# Stop and remove
docker compose down
```
### Rebuilding
```bash
# Rebuild single service
docker compose build api
# Rebuild without cache
docker compose build --no-cache api
# Build all services
docker compose build
# Build and start
docker compose up -d --build
# Force recreate containers
docker compose up -d --force-recreate
```
---
## Log Analysis
### Reading Container Logs
Logs follow this pattern:
```
[timestamp] [level] [message]
2026-02-13T10:30:00.000Z INFO Server started on port 4000
```
### Common Log Patterns
**Successful startup:**
```
INFO Connecting to database...
INFO Database connected
INFO Registered route: GET /api/health
INFO Registered route: POST /api/auth/login
INFO Server started on port 4000
```
**Database connection error:**
```
INFO Connecting to database...
ERROR Can't reach database server at `v2-postgres:5432`
ERROR Retrying in 5 seconds...
```
**Missing environment variable:**
```
ERROR Environment validation failed:
ERROR SMTP_HOST is required
ERROR JWT_ACCESS_SECRET is required
```
**Health check failure:**
```
WARN Health check failed: Database not connected
```
### Filtering Logs
```bash
# Only errors
docker compose logs api | grep ERROR
# Only warnings and errors
docker compose logs api | grep -E "ERROR|WARN"
# Exclude health checks
docker compose logs api | grep -v "GET /api/health"
# Find specific request
docker compose logs api | grep "POST /api/users"
# Find by request ID
docker compose logs api | grep "req-abc123"
```
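For a quick overview before filtering, the levels can be tallied in one pass. The following is a minimal sketch, assuming the `timestamp LEVEL message` layout shown under Reading Container Logs; `count_levels` is a hypothetical helper.

```bash
# Hypothetical helper: count log lines per level. Only the first four fields
# are checked, so a "service-1 |" prefix in front of the timestamp is tolerated.
count_levels() {
  awk '
    { for (i = 1; i <= NF && i <= 4; i++)
        if ($i == "INFO" || $i == "WARN" || $i == "ERROR") { count[$i]++; break } }
    END { printf "INFO=%d WARN=%d ERROR=%d\n", count["INFO"], count["WARN"], count["ERROR"] }
  '
}

# Usage (--no-log-prefix drops the service prefix Compose adds):
#   docker compose logs --no-log-prefix api | count_levels
```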
---
## Cleanup Commands
### Remove Stopped Containers
```bash
# Stop and remove all project containers and networks
docker compose down
# Remove stopped containers of a specific service
docker compose rm api
# Stop a running service and remove it without a confirmation prompt
docker compose rm -sf api
```
### Remove Images
```bash
# Remove all images for project
docker compose down --rmi all
# Remove only project-built images (not postgres, redis, etc.)
docker compose down --rmi local
# Remove specific image
docker rmi changemaker-lite-api
# Remove dangling images
docker image prune
```
### Remove Volumes
```bash
# ⚠️ WARNING: Deletes all data!
docker compose down -v
# Remove specific volume
docker volume rm changemaker-lite_postgres-data
# Remove unused volumes
docker volume prune
```
### Remove Networks
```bash
# Remove project network (containers must be stopped first)
docker network rm changemaker-lite
# Remove unused networks
docker network prune
```
### Full Cleanup
```bash
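# ⚠️ DANGER: Removes everything!
docker compose down -v --rmi all
# The next command is host-wide: it also prunes unused resources of other Docker projects
docker system prune -a --volumes
# This deletes:
# - All project containers, plus stopped containers host-wide
# - All project volumes (data lost!); may also remove unused volumes of other projects
# - All unused images
# - All unused networks
# - All build cache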
```
### Safe Cleanup
```bash
# Safe cleanup (keeps volumes)
docker compose down
docker image prune -a # Host-wide: removes every image not used by a container
docker network prune
# This keeps:
# - Volumes (data safe)
# - .env file
# - Application code
```
---
## Related Documentation
### Docker Documentation
- [Docker Issues](docker-issues.md) - This guide
- [Installation Guide](../user/installation.md) - Initial setup
- [Architecture Overview](../technical/architecture.md) - System design
### Other Troubleshooting
- [Common Errors](common-errors.md) - General errors
- [Database Issues](database-issues.md) - PostgreSQL problems
- [Monitoring Issues](monitoring-issues.md) - Observability problems
### Docker Resources
- [Docker Compose Reference](https://docs.docker.com/compose/)
- [Dockerfile Best Practices](https://docs.docker.com/develop/develop-images/dockerfile_best-practices/)
- [Docker Networking](https://docs.docker.com/network/)
---
**Last Updated:** February 2026
**Version:** V2.0
**Status:** Complete