# V2 Architecture Overview

Changemaker Lite V2 is built on a modern microservices architecture with a dual API design, a React admin interface, and a comprehensive observability stack.

## System Architecture

```mermaid
graph TB
    subgraph "User Access"
        Browser[Web Browser]
        VolunteerApp[Volunteer Mobile]
    end

    subgraph "Nginx Reverse Proxy"
        Nginx[Nginx<br/>Subdomain Router]
    end

    subgraph "Frontend Layer"
        AdminGUI[Admin GUI<br/>React + Vite + Ant Design<br/>Port 3000]
        PublicPages[Public Pages<br/>Dark Theme]
        VolunteerPortal[Volunteer Portal<br/>GPS Canvassing]
    end

    subgraph "Backend Layer - Dual API"
        ExpressAPI[Express API<br/>Main Features<br/>Port 4000<br/>Prisma ORM]
        FastifyAPI[Fastify API<br/>Media Library<br/>Port 4100<br/>Drizzle ORM]
    end

    subgraph "Data Layer"
        Postgres[(PostgreSQL 16<br/>27+ Models)]
        Redis[(Redis<br/>Cache + Queues)]
    end

    subgraph "Job Processing"
        EmailQueue[BullMQ<br/>Email Queue]
        GeocodeQueue[BullMQ<br/>Geocoding Queue]
    end

    subgraph "External Services"
        SMTP[SMTP Server<br/>Email Delivery]
        Represent[Represent API<br/>Canadian Reps]
        Geocoding[Geocoding Providers<br/>6 Services]
        Listmonk[Listmonk<br/>Newsletter Platform]
    end

    subgraph "Observability"
        Prometheus[Prometheus<br/>Metrics]
        Grafana[Grafana<br/>Dashboards]
        Alertmanager[Alertmanager<br/>Notifications]
    end

    Browser --> Nginx
    VolunteerApp --> Nginx

    Nginx --> AdminGUI
    Nginx --> PublicPages
    Nginx --> VolunteerPortal

    AdminGUI --> ExpressAPI
    AdminGUI --> FastifyAPI
    PublicPages --> ExpressAPI
    VolunteerPortal --> ExpressAPI

    ExpressAPI --> Postgres
    ExpressAPI --> Redis
    ExpressAPI --> EmailQueue
    ExpressAPI --> GeocodeQueue
    ExpressAPI --> Represent
    ExpressAPI --> Geocoding
    ExpressAPI --> Listmonk
    ExpressAPI --> Prometheus

    FastifyAPI --> Postgres
    FastifyAPI --> Redis
    FastifyAPI --> Prometheus

    EmailQueue --> Redis
    EmailQueue --> SMTP
    GeocodeQueue --> Redis
    GeocodeQueue --> Geocoding

    Prometheus --> Grafana
    Prometheus --> Alertmanager
```

## Core Components

### 1. Nginx Reverse Proxy

**Purpose**: Routes HTTP requests to appropriate services based on subdomain

**Subdomains**:
- `app.cmlite.org` → Admin GUI (React)
- `api.cmlite.org` → Express API (main features)
- `media.cmlite.org` → Fastify API (video library)
- `db.cmlite.org` → NocoDB (data browser)
- `docs.cmlite.org` → MkDocs (documentation)
- `listmonk.cmlite.org` → Listmonk (newsletter)
- `grafana.cmlite.org` → Grafana (monitoring)
- And 8 more service subdomains...

**Configuration**: `/nginx/conf.d/`

[Learn more →](networking.md)

### 2. Frontend Layer

#### Admin GUI (Port 3000)
- **Framework**: React 19 with Vite build tool
- **UI Library**: Ant Design 5 (Table, Form, Modal, Drawer components)
- **State Management**: Zustand stores (auth, canvass)
- **Routing**: React Router v6
- **HTTP Client**: Axios with 401 refresh interceptor
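
The 401 refresh interceptor can be pictured as a wrapper that replays a request once after obtaining a new access token. The sketch below is illustrative only: it does not use Axios, and `withTokenRefresh`, `HttpResponse`, and the injected `send` and `refresh` functions are hypothetical names rather than the project's actual client code.

```typescript
// Hypothetical minimal response shape; the real client works with Axios types.
interface HttpResponse {
  status: number;
  body: string;
}

// Sends the request with the given access token (injected to keep the sketch dependency-free).
type Send = (accessToken: string) => Promise<HttpResponse>;

// Exchanges the stored refresh token for a new access token.
type Refresh = () => Promise<string>;

// Retry a request exactly once after a 401, using a freshly issued access token.
async function withTokenRefresh(
  send: Send,
  refresh: Refresh,
  accessToken: string,
): Promise<HttpResponse> {
  const first = await send(accessToken);
  if (first.status !== 401) {
    return first;
  }
  // The access token was rejected: refresh once and replay the original request.
  const newToken = await refresh();
  return send(newToken);
}
```

Retrying only once avoids an infinite loop when the refresh token itself is no longer valid; in that case the second 401 is returned to the caller, which can redirect to login.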

**Structure**:
- 32 admin pages (campaigns, locations, users, settings, etc.)
- 6 public pages (campaign view, response wall, map, shifts)
- 4 volunteer portal pages (canvassing, assignments, activity)
- Shared components (map, canvass, GrapesJS editor)

[Learn more →](frontend.md)

#### Public Pages
- Dark blue/teal theme (consistent with V1 branding)
- No authentication required
- Mobile-responsive layouts
- Public campaign submission
- Response wall with upvoting
- Public map with location markers
- Shift signup forms

#### Volunteer Portal
- Top navigation layout
- Mobile-optimized (hamburger menu)
- GPS-tracked canvassing
- Full-screen map interface
- Visit recording forms
- Activity tracking

### 3. Backend Layer - Dual API Design

#### Express API (Port 4000)
**Main application server** handling core features:

**14 Feature Modules**:
1. **auth** - JWT login, register, refresh, logout
2. **users** - User CRUD with pagination
3. **settings** - Site settings singleton
4. **campaigns** - Campaign CRUD + public routes
5. **representatives** - Represent API integration
6. **responses** - Response wall + moderation
7. **email-queue** - BullMQ queue admin
8. **campaign-emails** - Email tracking + stats
9. **postal-codes** - Postal code cache
10. **locations** - Location CRUD + geocoding + NAR import
11. **cuts** - Cut (polygon) CRUD + spatial queries
12. **shifts** - Shift CRUD + signups
13. **canvass** - Volunteer canvassing (sessions, visits, routes)
14. **pages** - Landing page builder (GrapesJS)

**Plus**: email-templates, listmonk, pangolin, docs, qr, services, observability

**ORM**: Prisma (27+ models)

**Architecture**:
- Layered structure (routes → services → database)
- Zod schema validation
- Role-based access control (RBAC)
- Error handling middleware
- Winston logging
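
As a rough illustration of the RBAC step in that layered flow, the sketch below models a role guard as a plain function that runs before a route hands off to its service. It is framework-free on purpose: the role names, `AuthenticatedUser`, and `AccessDecision` are assumptions made for this example, and the real `requireRole` and `requireNonTemp` middleware may be shaped differently.

```typescript
// Role names are assumptions for illustration; the actual role set is not listed here.
type Role = "ADMIN" | "USER" | "TEMP";

interface AuthenticatedUser {
  id: string;
  role: Role;
}

interface AccessDecision {
  allowed: boolean;
  status: 200 | 401 | 403;
}

// Build a guard that admits only the listed roles, in the spirit of a requireRole middleware.
function requireRole(...allowedRoles: Role[]): (user: AuthenticatedUser | null) => AccessDecision {
  return (user) => {
    if (user === null) {
      return { allowed: false, status: 401 }; // no valid access token presented
    }
    if (allowedRoles.indexOf(user.role) === -1) {
      return { allowed: false, status: 403 }; // authenticated, but role not permitted
    }
    return { allowed: true, status: 200 };
  };
}

// Guard that rejects temporary accounts, in the spirit of requireNonTemp.
function requireNonTemp(user: AuthenticatedUser | null): AccessDecision {
  if (user === null) {
    return { allowed: false, status: 401 };
  }
  if (user.role === "TEMP") {
    return { allowed: false, status: 403 };
  }
  return { allowed: true, status: 200 };
}

// Example: an admin-only route would consult this guard before calling its service.
const adminOnly = requireRole("ADMIN");
```

Keeping the decision separate from the HTTP framework makes the rule easy to unit test; a middleware adapter would translate the decision into a response or a call to the next handler.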

[Learn more →](dual-api.md)

#### Fastify API (Port 4100)
**Specialized microservice** for media library:

**Features**:
- Video CRUD (title, duration, orientation, producer)
- Shared media (public gallery categories)
- Lock/unlock system (public visibility control)
- Reaction system (6 standard emojis)
- Job queue monitoring
- Bulk operations

**ORM**: Drizzle (lightweight schema-first)

**Why Separate?**:
- Performance isolation (video ops don't slow main API)
- Different ORM evaluation (Drizzle vs Prisma)
- Independent scaling
- Clear service boundaries

**Shared Resources**:
- Same PostgreSQL database (different schemas)
- Same Redis instance
- Reuses `JWT_ACCESS_SECRET` for auth

[Learn more →](dual-api.md)

### 4. Data Layer

#### PostgreSQL 16
**Primary database** with two ORM schemas:

**Prisma Schema** (27+ models):
- User, RefreshToken (auth)
- Campaign, Representative, Response, CampaignEmail (influence)
- Location, Cut, Shift, ShiftSignup (map)
- CanvassSession, CanvassVisit, TrackingSession, TrackPoint (canvass)
- LandingPage, PageBlock, EmailTemplate (content)
- SiteSettings, MapSettings (config)

**Drizzle Schema** (media tables):
- videos
- shared_media
- reactions
- jobs

**Indexes**: Optimized for common queries (userId, campaignId, cutId, etc.)

[Learn more →](database.md)

#### Redis
**Multi-purpose cache and queue backend**:

- **Caching**: Postal codes (7-day TTL), representatives
- **Rate Limiting**: Per-endpoint limits (Redis-backed)
- **BullMQ Queues**: Email sending, bulk geocoding
- **Sessions**: Future session storage (if needed)

**Authentication**: Required (`REDIS_PASSWORD` env var)
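
To make the per-endpoint rate limiting concrete, here is a minimal fixed-window counter. It is a sketch under stated assumptions: an in-memory `Map` stands in for Redis so the example stays self-contained, the fixed-window algorithm is one common choice and may not be the one the project uses, and `FixedWindowLimiter` is a hypothetical name.

```typescript
interface WindowState {
  windowStartMs: number;
  count: number;
}

interface RateLimitResult {
  allowed: boolean;
  remaining: number;
}

// Fixed-window limiter: at most `limit` requests per key in each `windowMs` window.
// The Map below stands in for Redis counters with an expiry.
class FixedWindowLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly windows = new Map<string, WindowState>();

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  check(key: string, nowMs: number): RateLimitResult {
    const current = this.windows.get(key);
    // Start a new window if none exists or the previous one has elapsed.
    if (current === undefined || nowMs - current.windowStartMs >= this.windowMs) {
      this.windows.set(key, { windowStartMs: nowMs, count: 1 });
      return { allowed: true, remaining: this.limit - 1 };
    }
    if (current.count >= this.limit) {
      return { allowed: false, remaining: 0 };
    }
    current.count += 1;
    return { allowed: true, remaining: this.limit - current.count };
  }
}

// Example: 10 requests per minute per key, keyed by endpoint and client address.
const authLimiter = new FixedWindowLimiter(10, 60 * 1000);
```

Backing the counters with Redis rather than process memory is what lets several API instances enforce one shared limit.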

### 5. Job Processing

#### BullMQ Queues
**Async job processing** for long-running operations:

**Email Queue**:
- Campaign email sending (SMTP)
- Email verification (double opt-in)
- Confirmation emails (shift signups)
- Retry logic (exponential backoff)
- Rate limiting (avoid spam flagging)
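
The exponential backoff used for retries doubles the wait after each failed attempt. The sketch below shows the arithmetic only; the 1-second base delay and 10-second cap are assumed values for illustration, not the queue's configured settings, and `backoffDelayMs` is a hypothetical helper.

```typescript
// Delay before retry number `attempt` (1-based): baseDelayMs * 2^(attempt - 1), capped at maxDelayMs.
function backoffDelayMs(attempt: number, baseDelayMs: number, maxDelayMs: number): number {
  if (!Number.isInteger(attempt) || attempt < 1) {
    throw new RangeError("attempt must be a positive integer");
  }
  return Math.min(baseDelayMs * Math.pow(2, attempt - 1), maxDelayMs);
}

// Retry schedule for five attempts with an assumed 1 s base delay and 10 s cap.
const retrySchedule = [1, 2, 3, 4, 5].map((attempt) => backoffDelayMs(attempt, 1000, 10000));
// retrySchedule is [1000, 2000, 4000, 8000, 10000]
```

In BullMQ this behaviour is normally requested per job through the `attempts` and `backoff` options rather than computed by hand; the cap shown here is an addition for the example.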

**Geocoding Queue**:
- Bulk address geocoding
- Multi-provider fallback (6 services)
- Rate limit compliance (500 jobs/min)
- Result caching

**Queue Management**:
- Admin routes for pause/resume
- Job status monitoring
- Failed job inspection
- Queue metrics (Prometheus)

### 6. External Services

#### SMTP Server
Email delivery for:
- Campaign advocacy emails
- Email verification
- Password reset
- Shift confirmation
- Admin notifications

**Dev Mode**: MailHog captures emails (`EMAIL_TEST_MODE=true`)

#### Represent API
Canadian elected representative lookup:
- Postal code → MPs, MPPs, councillors
- Caching (7-day TTL per postal code)
- Fallback to cached data on API errors
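
The caching and fallback behaviour above follows a cache-aside pattern. The sketch below is one possible reading of it, with several assumptions: an in-memory `Map` replaces Redis, expired entries are kept so a stale value is still available when the API fails (how the real service retains fallback data is not specified here), and `lookupWithCache` and its types are hypothetical names.

```typescript
const SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000;

interface CacheEntry<T> {
  value: T;
  storedAtMs: number;
}

interface LookupResult<T> {
  value: T;
  source: "cache" | "api" | "stale-cache";
}

// Cache-aside lookup: serve fresh cache hits, otherwise call the API and store the result.
// If the API call fails, fall back to an expired entry when one exists.
async function lookupWithCache<T>(
  key: string,
  cache: Map<string, CacheEntry<T>>,
  fetchFresh: (key: string) => Promise<T>,
  nowMs: number,
): Promise<LookupResult<T>> {
  const entry = cache.get(key);
  if (entry !== undefined && nowMs - entry.storedAtMs < SEVEN_DAYS_MS) {
    return { value: entry.value, source: "cache" };
  }
  try {
    const value = await fetchFresh(key);
    cache.set(key, { value, storedAtMs: nowMs });
    return { value, source: "api" };
  } catch (error) {
    if (entry !== undefined) {
      return { value: entry.value, source: "stale-cache" };
    }
    throw error; // nothing cached: surface the API failure to the caller
  }
}
```

Returning the `source` alongside the value lets callers log or display when stale data was served.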

#### Geocoding Providers
Multi-provider geocoding with fallback:

1. Nominatim (OpenStreetMap, free)
2. Mapbox (requires API key, best accuracy)
3. ArcGIS (free tier available)
4. Photon (OSM-based, no key required)
5. Google (requires API key, high cost)
6. LocationIQ (requires API key, generous free tier)

**Strategy**: Try each provider in order until success
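
A minimal sketch of that strategy, assuming each provider is wrapped behind a common interface. `GeocodeProvider`, `geocodeWithFallback`, and the result types are hypothetical names for illustration; the real provider clients and their error handling may differ.

```typescript
interface Coordinates {
  lat: number;
  lng: number;
}

// Common wrapper shape assumed for every provider; null means "no match found".
interface GeocodeProvider {
  name: string;
  geocode: (address: string) => Promise<Coordinates | null>;
}

interface GeocodeResult {
  provider: string;
  coordinates: Coordinates;
}

// Try providers in list order and return the first successful result.
async function geocodeWithFallback(
  address: string,
  providers: GeocodeProvider[],
): Promise<GeocodeResult | null> {
  for (const provider of providers) {
    try {
      const coordinates = await provider.geocode(address);
      if (coordinates !== null) {
        return { provider: provider.name, coordinates };
      }
    } catch {
      // A failing provider is treated like a miss; continue with the next one.
    }
  }
  return null; // every provider failed or found nothing
}
```

Recording which provider answered is useful for metrics such as a geocoding success rate per provider.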

#### Listmonk Newsletter Platform
Email marketing integration:
- Sync participants/locations/users → subscriber lists
- Newsletter campaigns (separate from advocacy emails)
- API integration (basic auth)
- Health monitoring

### 7. Observability Stack

#### Prometheus
**Metrics collection** with custom instrumentation:

**12 Custom Metrics** (`cm_*` prefix):
- `cm_api_uptime_seconds` - API availability
- `cm_email_queue_size` - Queue depth
- `cm_email_sent_total` - Email delivery count
- `cm_geocode_success_rate` - Geocoding quality
- `cm_active_canvass_sessions` - Live canvassing
- And 7 more domain-specific metrics...

**HTTP Metrics**:
- `http_request_total` - Total requests
- `http_request_duration_seconds` - Latency histogram
- `http_request_errors_total` - Error count

**Scrape Targets**:
- Express API (`:4000/metrics`)
- Fastify API (`:4100/metrics`)
- Redis Exporter
- Node Exporter (host metrics)
- cAdvisor (container metrics)

[Learn more →](monitoring.md)

#### Grafana
**Visualization dashboards**:

1. **Application Overview** - API metrics, queue stats, sessions
2. **Infrastructure** - Container metrics, host resources, Redis
3. **Alerts & SLOs** - Error budgets, SLI tracking

**Auto-provisioned**: Dashboards in `/configs/grafana/`

#### Alertmanager
**Alert routing and notifications**:

**12 Alert Rules**:
- High error rate (>5% for 5 minutes)
- Email queue stuck (no jobs processed in 10 minutes)
- Service down (health check fails)
- Database connection pool exhausted
- Redis unavailable
- And 7 more critical conditions...

**Notification Channels**:
- Gotify (self-hosted push notifications)
- Email (SMTP)
- Webhook (custom integrations)

## Request Lifecycle

### Example: Public Campaign Email Submission

```mermaid
sequenceDiagram
    participant User as User Browser
    participant Nginx
    participant Admin as Admin GUI
    participant Express as Express API
    participant DB as PostgreSQL
    participant Redis
    participant Queue as BullMQ
    participant SMTP as SMTP Server
    participant Rep as Represent API

    User->>Nginx: Visit /campaigns/123
    Nginx->>Admin: Route to React app
    Admin->>Express: GET /api/campaigns/123 (public)
    Express->>DB: Query campaign
    DB-->>Express: Campaign data
    Express-->>Admin: Campaign JSON
    Admin-->>User: Render campaign page

    User->>Admin: Enter postal code + submit
    Admin->>Express: POST /api/postal-codes (lookup)
    Express->>Redis: Check cache
    Redis-->>Express: Cache miss
    Express->>Rep: GET /representatives/postal-code
    Rep-->>Express: Representative list
    Express->>Redis: Cache for 7 days
    Express-->>Admin: Representatives JSON
    Admin-->>User: Show rep selection

    User->>Admin: Select rep + write email + submit
    Admin->>Express: POST /api/responses (create)
    Express->>DB: Insert response
    Express->>Queue: Enqueue verification email
    Express->>DB: Insert campaign email record
    DB-->>Express: Response created
    Express-->>Admin: Success response
    Admin-->>User: Show success message

    Queue->>SMTP: Send verification email
    SMTP-->>Queue: Delivery confirmed

    User->>User: Click verification link (email)
    User->>Nginx: GET /verify-response/:token
    Nginx->>Admin: Route to React app
    Admin->>Express: POST /api/responses/:id/verify
    Express->>DB: Update response (verified=true)
    Express->>Queue: Enqueue campaign email to rep
    DB-->>Express: Response verified
    Express-->>Admin: Success
    Admin-->>User: Email sent confirmation

    Queue->>SMTP: Send campaign email to rep
    SMTP-->>Queue: Delivery confirmed
```

## Technology Decisions

### Why TypeScript?
- Type safety reduces runtime errors
- Better IDE support and autocomplete
- Easier refactoring
- Self-documenting code

### Why Prisma + Drizzle?
- **Prisma**: Great for complex models, migrations, auto-generated types
- **Drizzle**: Lightweight, perfect for simple media tables
- Evaluate both ORMs in production

### Why Dual API?
- **Separation of concerns**: Media ops isolated from core features
- **Performance**: Video processing doesn't block main API
- **Scalability**: Independent horizontal scaling
- **Technology evaluation**: Compare Express vs Fastify

### Why JWT over Sessions?
- Stateless (scales horizontally)
- No session storage overhead
- Works across multiple API servers
- Standard claims (iat, exp, sub)

### Why BullMQ over Bull?
- Better TypeScript support
- Improved performance
- Active maintenance
- Better documentation

### Why PostgreSQL over NoSQL?
- Complex relational data (campaigns, locations, users)
- ACID transactions (critical for email queue)
- Full-text search
- Spatial queries (PostGIS for future geo features)

## Deployment Architecture

### Docker Compose
All services orchestrated in `docker-compose.yml`:

**Profiles**:
- `default`: Core services (postgres, redis, api, admin, nginx)
- `monitoring`: Prometheus, Grafana, Alertmanager, exporters

**Networks**:
- `changemaker-lite` bridge network
- Service discovery via container names

**Volumes**:
- PostgreSQL data persistence
- Redis data persistence
- Uploads directory
- Logs directory

[Learn more →](../deployment/docker-compose.md)

### Nginx Routing
**Subdomain-based routing**:

```nginx
# Admin GUI
server {
    server_name app.cmlite.org;
    location / {
        proxy_pass http://admin:3000;
    }
}

# Express API
server {
    server_name api.cmlite.org;
    location / {
        proxy_pass http://api:4000;
    }
}

# Fastify Media API
server {
    server_name media.cmlite.org;
    location / {
        proxy_pass http://media-api:4100;
    }
}
```

[Learn more →](networking.md)

## Security Architecture

### Authentication Flow
```mermaid
sequenceDiagram
    participant Client
    participant API as Express API
    participant DB as PostgreSQL
    participant Redis

    Client->>API: POST /api/auth/login
    API->>Redis: Rate limit check
    API->>DB: Verify credentials
    DB-->>API: User record
    API->>DB: Create refresh token (expires 7d)
    API-->>Client: Access token (15min) + Refresh token (7d)

    Note over Client: Access token expires

    Client->>API: POST /api/auth/refresh
    API->>DB: Validate refresh token
    DB-->>API: Token valid
    API->>DB: Rotate refresh token (transaction)
    API-->>Client: New access token + New refresh token
```

**Features**:
- bcrypt password hashing, with a password policy of 12+ characters plus complexity requirements
- JWT access tokens (15min expiry)
- Refresh tokens (7 days, stored in DB, rotated on use)
- Rate limiting (10 requests/min on auth endpoints)
- User enumeration prevention (401 not 404)
- RBAC middleware (requireRole, requireNonTemp)
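
Refresh token rotation means every refresh token is single use: presenting it invalidates it and yields a replacement. The sketch below illustrates the idea with an in-memory `Map` standing in for the database table and its transaction. `RefreshTokenStore` is a hypothetical name, and the counter-based token value is a placeholder; a real implementation would use a cryptographically random value.

```typescript
const REFRESH_TTL_MS = 7 * 24 * 60 * 60 * 1000; // 7 days, matching the refresh token lifetime

interface RefreshTokenRecord {
  userId: string;
  expiresAtMs: number;
}

class RefreshTokenStore {
  private readonly tokens = new Map<string, RefreshTokenRecord>();
  private counter = 0;

  // Issue a new refresh token for a user.
  issue(userId: string, nowMs: number): string {
    this.counter += 1;
    // Placeholder identifier for the sketch only; not suitable as a real token.
    const token = `rt-${userId}-${this.counter}`;
    this.tokens.set(token, { userId, expiresAtMs: nowMs + REFRESH_TTL_MS });
    return token;
  }

  // Rotate: invalidate the presented token and return a replacement, or null if it is not usable.
  // In the real service the delete and insert would run inside one database transaction.
  rotate(token: string, nowMs: number): string | null {
    const record = this.tokens.get(token);
    if (record === undefined) {
      return null; // unknown or already rotated
    }
    this.tokens.delete(token); // single use, even when expired
    if (record.expiresAtMs <= nowMs) {
      return null;
    }
    return this.issue(record.userId, nowMs);
  }
}
```

Because a rotated token is deleted, replaying a stolen refresh token after the legitimate client has used it fails.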

[Learn more →](authentication.md)

### Security Layers
1. **Network**: Nginx rate limiting, fail2ban
2. **Application**: Input validation (Zod schemas), RBAC
3. **Data**: Encrypted fields (`ENCRYPTION_KEY`), SQL injection prevention (Prisma)
4. **Transport**: HTTPS only (production), HSTS headers

[Learn more →](security.md)

## Scalability Considerations

### Horizontal Scaling
- **Stateless APIs**: JWT auth allows multiple API instances
- **Redis-backed queues**: Share job queues across workers
- **Database connection pooling**: Prisma manages connections
- **Nginx load balancing**: Distribute requests across API instances

### Vertical Scaling
- Increase container resources (CPU, memory)
- Optimize database queries (indexes, query planning)
- Redis memory limits (LRU eviction policy)

### Bottlenecks
- **PostgreSQL**: Single primary (future: read replicas)
- **Redis**: Single instance (future: Redis Cluster)
- **File uploads**: Local disk (future: S3-compatible storage)

## Monitoring & Observability

### Golden Signals
1. **Latency**: Request duration histograms
2. **Traffic**: Request rate by endpoint
3. **Errors**: Error rate (5xx responses)
4. **Saturation**: Database connections, Redis memory, queue depth

### SLOs (Service Level Objectives)
- **Availability**: 99.9% uptime (8.76 hours downtime/year)
- **Latency**: p95 < 500ms, p99 < 1000ms
- **Error Rate**: < 0.1% (fewer than 1 error per 1000 requests)
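
The availability and error-rate targets translate into simple budgets. The sketch below shows the arithmetic; the function names are hypothetical, and a 365-day year is assumed.

```typescript
const HOURS_PER_YEAR = 365 * 24; // 8760 hours, assuming a non-leap year

// Downtime allowed per year for a given availability target, in hours.
function allowedDowntimeHoursPerYear(availabilityPercent: number): number {
  return HOURS_PER_YEAR * (1 - availabilityPercent / 100);
}

// True when the observed error rate stays strictly below the target percentage.
function withinErrorBudget(
  totalRequests: number,
  failedRequests: number,
  maxErrorRatePercent: number,
): boolean {
  if (totalRequests === 0) {
    return true; // no traffic means nothing was violated
  }
  return (failedRequests / totalRequests) * 100 < maxErrorRatePercent;
}

// 99.9% availability leaves roughly 8.76 hours of downtime per year.
const yearlyBudgetHours = allowedDowntimeHoursPerYear(99.9);
```

For example, 50 failures in 100,000 requests is a 0.05% error rate and stays within the 0.1% target, while 2 failures in 1,000 requests (0.2%) does not.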

### Alerting Strategy
- **Critical**: Page on-call (service down, database unavailable)
- **Warning**: Create ticket (queue growing, elevated errors)
- **Info**: Log only (slow query, cache miss)

[Learn more →](monitoring.md)

## Further Reading

- [Dual API Architecture](dual-api.md) - Express + Fastify design
- [Database Schema](database.md) - Complete ER diagram
- [Authentication Flow](authentication.md) - JWT security model
- [Frontend Architecture](frontend.md) - React + Vite + Ant Design
- [Networking](networking.md) - Nginx routing and subdomains
- [Security Model](security.md) - Comprehensive security audit
- [Monitoring Stack](monitoring.md) - Prometheus + Grafana + Alertmanager
- [Data Flow](data-flow.md) - Request lifecycle examples

---

**Next**: [Set up your development environment →](../development/local-setup.md)