# V2 Architecture Overview

Changemaker Lite V2 is built on a modern microservices architecture with a dual API design, React admin interface, and comprehensive observability.

## System Architecture

```mermaid
graph TB
    subgraph "User Access"
        Browser[Web Browser]
        VolunteerApp[Volunteer Mobile]
    end

    subgraph "Nginx Reverse Proxy"
        Nginx[Nginx<br/>Subdomain Router]
    end

    subgraph "Frontend Layer"
        AdminGUI[Admin GUI<br/>React + Vite + Ant Design<br/>Port 3000]
        PublicPages[Public Pages<br/>Dark Theme]
        VolunteerPortal[Volunteer Portal<br/>GPS Canvassing]
    end

    subgraph "Backend Layer - Dual API"
        ExpressAPI[Express API<br/>Main Features<br/>Port 4000<br/>Prisma ORM]
        FastifyAPI[Fastify API<br/>Media Library<br/>Port 4100<br/>Drizzle ORM]
    end

    subgraph "Data Layer"
        Postgres[(PostgreSQL 16<br/>27+ Models)]
        Redis[(Redis<br/>Cache + Queues)]
    end

    subgraph "Job Processing"
        EmailQueue[BullMQ<br/>Email Queue]
        GeocodeQueue[BullMQ<br/>Geocoding Queue]
    end

    subgraph "External Services"
        SMTP[SMTP Server<br/>Email Delivery]
        Represent[Represent API<br/>Canadian Reps]
        Geocoding[Geocoding Providers<br/>6 Services]
        Listmonk[Listmonk<br/>Newsletter Platform]
    end

    subgraph "Observability"
        Prometheus[Prometheus<br/>Metrics]
        Grafana[Grafana<br/>Dashboards]
        Alertmanager[Alertmanager<br/>Notifications]
    end

    Browser --> Nginx
    VolunteerApp --> Nginx
    Nginx --> AdminGUI
    Nginx --> PublicPages
    Nginx --> VolunteerPortal

    AdminGUI --> ExpressAPI
    AdminGUI --> FastifyAPI
    PublicPages --> ExpressAPI
    VolunteerPortal --> ExpressAPI

    ExpressAPI --> Postgres
    ExpressAPI --> Redis
    ExpressAPI --> EmailQueue
    ExpressAPI --> GeocodeQueue
    ExpressAPI --> Represent
    ExpressAPI --> Geocoding
    ExpressAPI --> Listmonk
    ExpressAPI --> Prometheus

    FastifyAPI --> Postgres
    FastifyAPI --> Redis
    FastifyAPI --> Prometheus

    EmailQueue --> Redis
    EmailQueue --> SMTP
    GeocodeQueue --> Redis
    GeocodeQueue --> Geocoding

    Prometheus --> Grafana
    Prometheus --> Alertmanager
```

## Core Components

### 1. Nginx Reverse Proxy

**Purpose**: Routes HTTP requests to appropriate services based on subdomain

**Subdomains**:

- `app.cmlite.org` → Admin GUI (React)
- `api.cmlite.org` → Express API (main features)
- `media.cmlite.org` → Fastify API (video library)
- `db.cmlite.org` → NocoDB (data browser)
- `docs.cmlite.org` → MkDocs (documentation)
- `listmonk.cmlite.org` → Listmonk (newsletter)
- `grafana.cmlite.org` → Grafana (monitoring)
- And 8 more service subdomains...

**Configuration**: `/nginx/conf.d/`

[Learn more →](networking.md)

### 2. Frontend Layer

#### Admin GUI (Port 3000)

- **Framework**: React 19 with Vite build tool
- **UI Library**: Ant Design 5 (Table, Form, Modal, Drawer components)
- **State Management**: Zustand stores (auth, canvass)
- **Routing**: React Router v6
- **HTTP Client**: Axios with 401 refresh interceptor

**Structure**:

- 32 admin pages (campaigns, locations, users, settings, etc.)
- 6 public pages (campaign view, response wall, map, shifts)
- 4 volunteer portal pages (canvassing, assignments, activity)
- Shared components (map, canvass, GrapesJS editor)

[Learn more →](frontend.md)

#### Public Pages

- Dark blue/teal theme (consistent with V1 branding)
- No authentication required
- Mobile-responsive layouts
- Public campaign submission
- Response wall with upvoting
- Public map with location markers
- Shift signup forms

#### Volunteer Portal

- Top navigation layout
- Mobile-optimized (hamburger menu)
- GPS-tracked canvassing
- Full-screen map interface
- Visit recording forms
- Activity tracking

### 3. Backend Layer - Dual API Design

#### Express API (Port 4000)

**Main application server** handling core features:

**14 Feature Modules**:

1. **auth** - JWT login, register, refresh, logout
2. **users** - User CRUD with pagination
3. **settings** - Site settings singleton
4. **campaigns** - Campaign CRUD + public routes
5. **representatives** - Represent API integration
6. **responses** - Response wall + moderation
7. **email-queue** - BullMQ queue admin
8. **campaign-emails** - Email tracking + stats
9. **postal-codes** - Postal code cache
10. **locations** - Location CRUD + geocoding + NAR import
11. **cuts** - Cut (polygon) CRUD + spatial queries
12. **shifts** - Shift CRUD + signups
13. **canvass** - Volunteer canvassing (sessions, visits, routes)
14. **pages** - Landing page builder (GrapesJS)

**Plus**: email-templates, listmonk, pangolin, docs, qr, services, observability

**ORM**: Prisma (27+ models)

**Architecture**:

- Layered structure (routes → services → database)
- Zod schema validation
- Role-based access control (RBAC)
- Error handling middleware
- Winston logging

[Learn more →](dual-api.md)

#### Fastify API (Port 4100)

**Specialized microservice** for media library:

**Features**:

- Video CRUD (title, duration, orientation, producer)
- Shared media (public gallery categories)
- Lock/unlock system (public visibility control)
- Reaction system (6 standard emojis)
- Job queue monitoring
- Bulk operations

**ORM**: Drizzle (lightweight schema-first)

**Why Separate?**:

- Performance isolation (video ops don't slow main API)
- Different ORM evaluation (Drizzle vs Prisma)
- Independent scaling
- Clear service boundaries

**Shared Resources**:

- Same PostgreSQL database (different schemas)
- Same Redis instance
- Reuses JWT_ACCESS_SECRET for auth

[Learn more →](dual-api.md)

### 4. Data Layer

#### PostgreSQL 16

**Primary database** with two ORM schemas:

**Prisma Schema** (27+ models):

- User, RefreshToken (auth)
- Campaign, Representative, Response, CampaignEmail (influence)
- Location, Cut, Shift, ShiftSignup (map)
- CanvassSession, CanvassVisit, TrackingSession, TrackPoint (canvass)
- LandingPage, PageBlock, EmailTemplate (content)
- SiteSettings, MapSettings (config)

**Drizzle Schema** (media tables):

- videos
- shared_media
- reactions
- jobs

**Indexes**: Optimized for common queries (userId, campaignId, cutId, etc.)

[Learn more →](database.md)

#### Redis

**Multi-purpose cache and queue backend**:

- **Caching**: Postal codes (7-day TTL), representatives
- **Rate Limiting**: Per-endpoint limits (Redis-backed)
- **BullMQ Queues**: Email sending, bulk geocoding
- **Sessions**: Future session storage (if needed)

**Authentication**: Required (`REDIS_PASSWORD` env var)

### 5. Job Processing

#### BullMQ Queues

**Async job processing** for long-running operations:

**Email Queue**:

- Campaign email sending (SMTP)
- Email verification (double opt-in)
- Confirmation emails (shift signups)
- Retry logic (exponential backoff)
- Rate limiting (avoid spam flagging)

**Geocoding Queue**:

- Bulk address geocoding
- Multi-provider fallback (6 services)
- Rate limit compliance (500 jobs/min)
- Result caching

**Queue Management**:

- Admin routes for pause/resume
- Job status monitoring
- Failed job inspection
- Queue metrics (Prometheus)

### 6. External Services

#### SMTP Server

Email delivery for:

- Campaign advocacy emails
- Email verification
- Password reset
- Shift confirmation
- Admin notifications

**Dev Mode**: MailHog captures emails (`EMAIL_TEST_MODE=true`)

#### Represent API

Canadian elected representative lookup:

- Postal code → MPs, MPPs, councillors
- Caching (7-day TTL per postal code)
- Fallback to cached data on API errors

#### Geocoding Providers

Multi-provider geocoding with fallback:

1. Nominatim (OpenStreetMap, free)
2. Mapbox (requires API key, best accuracy)
3. ArcGIS (free tier available)
4. Photon (OSM-based, no key required)
5. Google (requires API key, high cost)
6. LocationIQ (requires API key, generous free tier)

**Strategy**: Try each provider in order until success

#### Listmonk Newsletter Platform

Email marketing integration:

- Sync participants/locations/users → subscriber lists
- Newsletter campaigns (separate from advocacy emails)
- API integration (basic auth)
- Health monitoring

### 7. Observability Stack

#### Prometheus

**Metrics collection** with custom instrumentation:

**12 Custom Metrics** (`cm_*` prefix):

- `cm_api_uptime_seconds` - API availability
- `cm_email_queue_size` - Queue depth
- `cm_email_sent_total` - Email delivery count
- `cm_geocode_success_rate` - Geocoding quality
- `cm_active_canvass_sessions` - Live canvassing
- And 7 more domain-specific metrics...

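
As a rough illustration of what a scrape of the `cm_*` metrics listed above might return, the sketch below hand-rolls the Prometheus text exposition format in plain TypeScript. The `Metric` type, the `renderMetrics` helper, and the sample values are hypothetical and exist only for this example; the actual services most likely use a Prometheus client library instead.

```typescript
// Hypothetical types for illustration only; not taken from the Changemaker Lite codebase.
type MetricKind = "counter" | "gauge";

interface Metric {
  name: string; // metric name, e.g. "cm_email_queue_size" from the list above
  help: string; // free-text description emitted on the HELP line
  kind: MetricKind;
  value: number; // sample value (made up for this example)
}

// Render metrics in the Prometheus text exposition format:
//   # HELP <name> <help>
//   # TYPE <name> <kind>
//   <name> <value>
function renderMetrics(metrics: Metric[]): string {
  const blocks = metrics.map((m) =>
    [
      `# HELP ${m.name} ${m.help}`,
      `# TYPE ${m.name} ${m.kind}`,
      `${m.name} ${m.value}`,
    ].join("\n"),
  );
  // The format is line-oriented, so end the payload with a newline.
  return blocks.join("\n") + "\n";
}

// Assumed sample values; a gauge can go up and down, a counter only increases.
const sample: Metric[] = [
  { name: "cm_email_queue_size", help: "Queue depth", kind: "gauge", value: 3 },
  { name: "cm_email_sent_total", help: "Email delivery count", kind: "counter", value: 128 },
];

console.log(renderMetrics(sample));
```

Queue depth is modelled as a gauge because it rises and falls, while the `_total` suffix on `cm_email_sent_total` follows the usual naming convention for counters.
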
**HTTP Metrics**:

- `http_request_total` - Total requests
- `http_request_duration_seconds` - Latency histogram
- `http_request_errors_total` - Error count

**Scrape Targets**:

- Express API (`:4000/metrics`)
- Fastify API (`:4100/metrics`)
- Redis Exporter
- Node Exporter (host metrics)
- cAdvisor (container metrics)

[Learn more →](monitoring.md)

#### Grafana

**Visualization dashboards**:

1. **Application Overview** - API metrics, queue stats, sessions
2. **Infrastructure** - Container metrics, host resources, Redis
3. **Alerts & SLOs** - Error budgets, SLI tracking

**Auto-provisioned**: Dashboards in `/configs/grafana/`

#### Alertmanager

**Alert routing and notifications**:

**12 Alert Rules**:

- High error rate (>5% for 5 minutes)
- Email queue stuck (no jobs processed in 10 minutes)
- Service down (health check fails)
- Database connection pool exhausted
- Redis unavailable
- And 7 more critical conditions...

**Notification Channels**:

- Gotify (self-hosted push notifications)
- Email (SMTP)
- Webhook (custom integrations)

## Request Lifecycle

### Example: Public Campaign Email Submission

```mermaid
sequenceDiagram
    participant User as User Browser
    participant Nginx
    participant Admin as Admin GUI
    participant Express as Express API
    participant DB as PostgreSQL
    participant Redis
    participant Queue as BullMQ
    participant SMTP as SMTP Server
    participant Rep as Represent API

    User->>Nginx: Visit /campaigns/123
    Nginx->>Admin: Route to React app
    Admin->>Express: GET /api/campaigns/123 (public)
    Express->>DB: Query campaign
    DB-->>Express: Campaign data
    Express-->>Admin: Campaign JSON
    Admin-->>User: Render campaign page

    User->>Admin: Enter postal code + submit
    Admin->>Express: POST /api/postal-codes (lookup)
    Express->>Redis: Check cache
    Redis-->>Express: Cache miss
    Express->>Rep: GET /representatives/postal-code
    Rep-->>Express: Representative list
    Express->>Redis: Cache for 7 days
    Express-->>Admin: Representatives JSON
    Admin-->>User: Show rep selection

    User->>Admin: Select rep + write email + submit
    Admin->>Express: POST /api/responses (create)
    Express->>DB: Insert response
    Express->>Queue: Enqueue verification email
    Express->>DB: Insert campaign email record
    DB-->>Express: Response created
    Express-->>Admin: Success response
    Admin-->>User: Show success message

    Queue->>SMTP: Send verification email
    SMTP-->>Queue: Delivery confirmed

    User->>User: Click verification link (email)
    User->>Nginx: GET /verify-response/:token
    Nginx->>Admin: Route to React app
    Admin->>Express: POST /api/responses/:id/verify
    Express->>DB: Update response (verified=true)
    Express->>Queue: Enqueue campaign email to rep
    DB-->>Express: Response verified
    Express-->>Admin: Success
    Admin-->>User: Email sent confirmation

    Queue->>SMTP: Send campaign email to rep
    SMTP-->>Queue: Delivery confirmed
```

## Technology Decisions

### Why TypeScript?

- Type safety reduces runtime errors
- Better IDE support and autocomplete
- Easier refactoring
- Self-documenting code

### Why Prisma + Drizzle?

- **Prisma**: Great for complex models, migrations, auto-generated types
- **Drizzle**: Lightweight, perfect for simple media tables
- Evaluate both ORMs in production

### Why Dual API?

- **Separation of concerns**: Media ops isolated from core features
- **Performance**: Video processing doesn't block main API
- **Scalability**: Independent horizontal scaling
- **Technology evaluation**: Compare Express vs Fastify

### Why JWT over Sessions?

- Stateless (scales horizontally)
- No session storage overhead
- Works across multiple API servers
- Standard claims (iat, exp, sub)

### Why BullMQ over Bull?

- Better TypeScript support
- Improved performance
- Active maintenance
- Better documentation

### Why PostgreSQL over NoSQL?

- Complex relational data (campaigns, locations, users)
- ACID transactions (critical for email queue)
- Full-text search
- Spatial queries (PostGIS for future geo features)

## Deployment Architecture

### Docker Compose

All services are orchestrated in `docker-compose.yml`:

**Profiles**:

- `default`: Core services (postgres, redis, api, admin, nginx)
- `monitoring`: Prometheus, Grafana, Alertmanager, exporters

**Networks**:

- `changemaker-lite` bridge network
- Service discovery via container names

**Volumes**:

- PostgreSQL data persistence
- Redis data persistence
- Uploads directory
- Logs directory

[Learn more →](../deployment/docker-compose.md)

### Nginx Routing

**Subdomain-based routing**:

```nginx
# Admin GUI
server {
    server_name app.cmlite.org;
    location / {
        proxy_pass http://admin:3000;
    }
}

# Express API
server {
    server_name api.cmlite.org;
    location / {
        proxy_pass http://api:4000;
    }
}

# Fastify Media API
server {
    server_name media.cmlite.org;
    location / {
        proxy_pass http://media-api:4100;
    }
}
```

[Learn more →](networking.md)

## Security Architecture

### Authentication Flow

```mermaid
sequenceDiagram
    participant Client
    participant API as Express API
    participant DB as PostgreSQL
    participant Redis

    Client->>API: POST /api/auth/login
    API->>DB: Verify credentials
    DB-->>API: User record
    API->>DB: Create refresh token (expires 7d)
    API->>Redis: Rate limit check
    API-->>Client: Access token (15min) + Refresh token (7d)

    Note over Client: Access token expires

    Client->>API: POST /api/auth/refresh
    API->>DB: Validate refresh token
    DB-->>API: Token valid
    API->>DB: Rotate refresh token (transaction)
    API-->>Client: New access token + New refresh token
```

**Features**:

- bcrypt password hashing, with a password policy of 12+ characters plus complexity requirements
- JWT access tokens (15min expiry)
- Refresh tokens (7 days, stored in DB, rotated on use)
- Rate limiting (10 requests/min on auth endpoints)
- User enumeration prevention (401 not 404)
- RBAC middleware (requireRole, requireNonTemp)

[Learn more →](authentication.md)

### Security Layers

1. **Network**: Nginx rate limiting, fail2ban
2. **Application**: Input validation (Zod schemas), RBAC
3. **Data**: Encrypted fields (ENCRYPTION_KEY), SQL injection prevention (Prisma)
4. **Transport**: HTTPS only (production), HSTS headers

[Learn more →](security.md)

## Scalability Considerations

### Horizontal Scaling

- **Stateless APIs**: JWT auth allows multiple API instances
- **Redis-backed queues**: Share job queues across workers
- **Database connection pooling**: Prisma manages connections
- **Nginx load balancing**: Distribute requests across API instances

### Vertical Scaling

- Increase container resources (CPU, memory)
- Optimize database queries (indexes, query planning)
- Redis memory limits (LRU eviction policy)

### Bottlenecks

- **PostgreSQL**: Single primary (future: read replicas)
- **Redis**: Single instance (future: Redis Cluster)
- **File uploads**: Local disk (future: S3-compatible storage)

## Monitoring & Observability

### Golden Signals

1. **Latency**: Request duration histograms
2. **Traffic**: Request rate by endpoint
3. **Errors**: Error rate (5xx responses)
4. **Saturation**: Database connections, Redis memory, queue depth

### SLOs (Service Level Objectives)

- **Availability**: 99.9% uptime (at most 8.76 hours of downtime per year)
- **Latency**: p95 < 500ms, p99 < 1000ms
- **Error Rate**: < 0.1% (1 error per 1000 requests)

### Alerting Strategy

- **Critical**: Page on-call (service down, database unavailable)
- **Warning**: Create ticket (queue growing, elevated errors)
- **Info**: Log only (slow query, cache miss)

[Learn more →](monitoring.md)

## Further Reading

- [Dual API Architecture](dual-api.md) - Express + Fastify design
- [Database Schema](database.md) - Complete ER diagram
- [Authentication Flow](authentication.md) - JWT security model
- [Frontend Architecture](frontend.md) - React + Vite + Ant Design
- [Networking](networking.md) - Nginx routing and subdomains
- [Security Model](security.md) - Comprehensive security audit
- [Monitoring Stack](monitoring.md) - Prometheus + Grafana + Alertmanager
- [Data Flow](data-flow.md) - Request lifecycle examples

---

**Next**: [Set up your development environment →](../development/local-setup.md)