# Index Strategy & Performance

## Overview

Changemaker Lite V2 uses strategic indexing across 33 models to optimize query performance. This document catalogs all indexes, explains their purpose, and provides query optimization guidance.

**Total Indexes:** 60+ (Prisma: 50+, Drizzle: 10+)

**Index Types:**

- **Unique indexes** — Enforce uniqueness constraints (email, slug, token, etc.)
- **Foreign key indexes** — Optimize JOIN operations (userId, campaignId, locationId, etc.)
- **Composite indexes** — Multi-column indexes for complex queries
- **Spatial indexes** — Composite latitude/longitude indexes for geographic queries

---

## Index Catalog

### Auth & Users

#### User

- **Unique:** `email` — Login lookups (`WHERE email = ?`)

#### RefreshToken

- **Unique:** `token` — Refresh endpoint lookups (`WHERE token = ?`)
- **Foreign Key:** `userId` — User deletion cascades

---

### Influence

#### Campaign

- **Unique:** `slug` — Public campaign page lookups (`WHERE slug = ?`)

#### Representative

- **Non-unique:** `postalCode` — Postal code lookups (`WHERE postal_code = ?`)

#### CampaignEmail

- **Foreign Key:** `campaignId` — Campaign email stats (`JOIN campaign_emails ON campaign_id = ?`)
- **Non-unique:** `campaignSlug` — Slug-based queries

#### RepresentativeResponse

- **Foreign Key:** `campaignId` — Campaign response wall (`JOIN representative_responses ON campaign_id = ?`)
- **Non-unique:** `campaignSlug` — Slug-based queries

#### ResponseUpvote

- **Unique:** `[responseId, userId]` — Prevent duplicate upvotes from logged-in users
- **Unique:** `[responseId, upvotedIp]` — Prevent duplicate upvotes from same IP

#### CustomRecipient

- **Foreign Key:** `campaignId` — Campaign custom recipients (`JOIN custom_recipients ON campaign_id = ?`)

#### PostalCodeCache

- **Unique:** `postalCode` — Postal code cache lookups (`WHERE postal_code = ?`)

#### Call

- **Foreign Key:** `campaignId` — Campaign call tracking (`JOIN calls ON campaign_id = ?`)

---

### Map — Locations

#### Location

- **Unique:** `locGuid` — NAR location GUID lookups
- **Composite:** `[latitude, longitude]` — **Spatial queries** (nearby locations, bounding box searches)
- **Non-unique:** `postalCode` — Postal code filtering

**Query Optimization:**

```sql
-- Uses composite index for bounding box queries:
-- the latitude range bounds the index scan, the longitude range is checked inside the index
SELECT * FROM locations
WHERE latitude BETWEEN ? AND ?
  AND longitude BETWEEN ? AND ?;
```

#### Address

- **Unique:** `addrGuid` — NAR address GUID lookups
- **Foreign Key:** `locationId` — Location addresses (`JOIN addresses ON location_id = ?`)
- **Composite:** `[locationId, unitNumber]` — **Unit lookups within building**

**Query Optimization:**

```sql
-- Uses composite index for unit-specific queries
SELECT * FROM addresses
WHERE location_id = ?
  AND unit_number = ?;
```

#### LocationHistory

- **Foreign Key:** `locationId` — Location history (`JOIN location_history ON location_id = ?`)
- **Foreign Key:** `userId` — User edit history (`JOIN location_history ON user_id = ?`)
- **Non-unique:** `createdAt` — **Temporal queries** (recent edits, audit trails)

---

### Map — Shifts & Cuts

#### Shift

- **Foreign Key:** `cutId` — Cut shifts (`JOIN shifts ON cut_id = ?`)

#### ShiftSignup

- **Unique:** `[shiftId, userEmail]` — **Prevent duplicate shift signups**
- **Foreign Key:** `shiftId` — Shift signups (`JOIN shift_signups ON shift_id = ?`)

---

### Canvassing

#### CanvassSession

- **Foreign Key:** `userId` — User canvass sessions (`JOIN canvass_sessions ON user_id = ?`)
- **Foreign Key:** `cutId` — Cut canvass sessions (`JOIN canvass_sessions ON cut_id = ?`)
- **Foreign Key:** `shiftId` — Shift canvass sessions (`JOIN canvass_sessions ON shift_id = ?`)

#### CanvassVisit

- **Foreign Key:** `addressId` — Address visit history (`JOIN canvass_visits ON address_id = ?`)
- **Foreign Key:** `userId` — User visit history (`JOIN canvass_visits ON user_id = ?`)
- **Foreign Key:** `shiftId` — Shift visits (`JOIN canvass_visits ON shift_id = ?`)
- **Foreign Key:** `sessionId` — Session visits (`JOIN canvass_visits ON session_id = ?`)
- **Non-unique:** `visitedAt` — **Temporal queries** (recent visits, activity feeds)

#### TrackingSession

- **Unique:** `canvassSessionId` — **One-to-one relationship** with CanvassSession
- **Foreign Key:** `userId` — User GPS sessions (`JOIN tracking_sessions ON user_id = ?`)
- **Non-unique:** `isActive` — Active session filtering (`WHERE is_active = true`)
- **Composite:** `[isActive, lastRecordedAt]` — **Session cleanup queries** (abandoned sessions)

**Query Optimization:**

```sql
-- Uses composite index for abandoned session cleanup
SELECT * FROM tracking_sessions
WHERE is_active = true
  AND last_recorded_at < NOW() - INTERVAL '12 hours';
```

#### TrackPoint

- **Composite:** `[trackingSessionId, recordedAt]` — **Temporal GPS queries** (session breadcrumb trail)
- **Non-unique:** `recordedAt` — Cross-session temporal queries

---

### Email Templates

#### EmailTemplate

- **Unique:** `key` — Template key lookups (`WHERE key = 'campaign-email'`)
- **Non-unique:** `category` — Category filtering (`WHERE category = 'INFLUENCE'`)
- **Non-unique:** `isActive` — Active template filtering (`WHERE is_active = true`)

#### EmailTemplateVariable

- **Unique:** `[templateId, key]` — **Unique variable keys per template**
- **Foreign Key:** `templateId` — Template variables (`JOIN email_template_variables ON template_id = ?`)

#### EmailTemplateVersion

- **Unique:** `[templateId, versionNumber]` — **Sequential version numbers per template**
- **Composite:** `[templateId, createdAt(sort: Desc)]` — **Recent version history**

**Query Optimization:**

```sql
-- Uses composite index for recent version queries
SELECT * FROM email_template_versions
WHERE template_id = ?
ORDER BY created_at DESC
LIMIT 10;
```

#### EmailTemplateTestLog

- **Composite:** `[templateId, sentAt(sort: Desc)]` — **Recent test logs**

---

### Landing Pages

#### LandingPage

- **Unique:** `slug` — Public page lookups (`WHERE slug = 'about'`)

---

### Media (Drizzle ORM)

#### videos

- **Unique:** `path` — File path lookups (`WHERE path = '/media/local/videos/file.mp4'`)
- **Non-unique:** `orientation` — Orientation filtering (`WHERE orientation = 'landscape'`)
- **Non-unique:** `producer` — Producer filtering (`WHERE producer = 'Studio A'`)
- **Non-unique:** `isValid` — Valid video filtering (`WHERE is_valid = true`)
- **Non-unique:** `directoryType` — Directory type filtering (`WHERE directory_type = 'studios'`)
- **Composite:** `[durationSeconds, fileSize, width, height]` — **Fingerprint matching** (duplicate detection)
- **Composite:** `[directoryType, isValid, orientation]` — **Common filtering pattern**

**Query Optimization:**

```sql
-- Uses composite index for common video library queries
SELECT * FROM videos
WHERE directory_type = 'studios'
  AND is_valid = true
  AND orientation = 'landscape';
```

#### jobs

- **Composite:** `[status, priority, createdAt]` — **Job queue processing**
- **Composite:** `[resourceCategory, status]` — **Resource-based filtering**
- **Non-unique:** `pipelineId` — Pipeline job filtering

**Query Optimization:**

```sql
-- Uses composite index for job queue queries
SELECT * FROM jobs
WHERE status = 'pending'
ORDER BY priority ASC, created_at ASC
LIMIT 10;
```

---

## Query Optimization Patterns

### 1. Use Indexes for WHERE Clauses

```typescript
// ✅ Uses email unique index
await prisma.user.findUnique({
  where: { email: 'user@example.com' },
});

// ❌ Full table scan (no index on name)
await prisma.user.findMany({
  where: { name: 'John' },
});
```

### 2. Use Composite Indexes for Multi-Column Filters

```typescript
// ✅ Uses [latitude, longitude] composite index; both bounds are checked in the index
await prisma.location.findMany({
  where: {
    latitude: { gte: 53.5, lte: 53.6 },
    longitude: { gte: -113.5, lte: -113.4 },
  },
});

// ❌ Less efficient (uses only the leading latitude column of the index)
await prisma.location.findMany({
  where: {
    latitude: { gte: 53.5, lte: 53.6 },
    // no longitude bound: the whole latitude band is returned and must be filtered later
  },
});
```

### 3. Use Foreign Key Indexes for JOINs

```typescript
// ✅ Uses campaignId foreign key index
await prisma.campaign.findUnique({
  where: { id: campaignId },
  include: { emails: true }, // relation load (JOIN or follow-up query) uses the index
});

// ❌ Manual second round trip; becomes an N+1 pattern when repeated in a loop over campaigns
const campaign = await prisma.campaign.findUnique({ where: { id: campaignId } });
const emails = await prisma.campaignEmail.findMany({
  where: { campaignId: campaign.id },
});
```

### 4. Use Unique Indexes for Deduplication

```typescript
// ✅ Uses [responseId, userId] unique index
await prisma.responseUpvote.create({
  data: { responseId, userId, upvotedIp },
});
// Throws P2002 (unique constraint violation) if the user already upvoted (database-level check)

// ❌ Application-level check (race condition between the read and the write)
const existing = await prisma.responseUpvote.findFirst({
  where: { responseId, userId },
});
if (existing) throw new Error('Already upvoted');
await prisma.responseUpvote.create({ data: { responseId, userId } });
```

### 5. Use Temporal Indexes for Date Filtering

```typescript
// ✅ Uses createdAt index
await prisma.locationHistory.findMany({
  where: {
    createdAt: { gte: new Date('2025-01-01') },
  },
  orderBy: { createdAt: 'desc' },
  take: 100,
});

// ❌ Full table scan (no index on oldValue, and a substring match could not use a B-tree index anyway)
await prisma.locationHistory.findMany({
  where: {
    oldValue: { contains: 'Calgary' }, // No index
  },
});
```

---

## Index Selectivity

**Selectivity** = ratio of distinct values to total rows in the indexed column (100% means every value is unique). Higher selectivity means each lookup matches fewer rows, so the index performs better.
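As a rough illustration of the definition above, the sketch below computes selectivity as distinct values divided by total rows for an in-memory sample. The `selectivity` helper and the sample arrays are hypothetical and are not part of the Changemaker Lite codebase; real figures would come from the database itself.

```typescript
// Hypothetical helper: selectivity = distinct values / total rows, in the range 0..1.
function selectivity<T>(values: T[]): number {
  if (values.length === 0) return 0; // avoid dividing by zero on an empty column
  return new Set(values).size / values.length;
}

// Made-up sample columns, for illustration only.
const emails = ['a@example.com', 'b@example.com', 'c@example.com', 'd@example.com'];
const isActive = [true, false, true, true];

const emailSelectivity = selectivity(emails);      // 4 distinct / 4 rows = 1
const isActiveSelectivity = selectivity(isActive); // 2 distinct / 4 rows = 0.5

console.log({ emailSelectivity, isActiveSelectivity });
```

Against a live PostgreSQL table, the same ratio can be estimated with `SELECT COUNT(DISTINCT col)::float / COUNT(*) FROM some_table;`, where `col` and `some_table` are placeholders.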
### High Selectivity (Good)

- **email** (User) — 100% unique (1 user per email)
- **token** (RefreshToken) — 100% unique (1 token per record)
- **slug** (Campaign, LandingPage) — 100% unique (1 record per slug)
- **[responseId, userId]** (ResponseUpvote) — High uniqueness (1 upvote per user per response)

### Medium Selectivity (Okay)

- **postalCode** (Location) — ~50% unique (multiple locations per postal code)
- **campaignId** (CampaignEmail) — Many rows per value (100s of emails per campaign), but each campaign still matches only a small share of the table
- **directoryType** (videos) — 9 distinct values; each matches ~11% of rows if evenly distributed

### Low Selectivity (Poor alone, useful inside a composite index)

- **isActive** (TrackingSession) — 2 distinct values (active vs inactive); each matches ~50% of rows if evenly distributed
- **status** (Campaign) — 4 distinct values (DRAFT, ACTIVE, PAUSED, ARCHIVED); each matches ~25% of rows if evenly distributed
- **role** (User) — 5 distinct values; each matches ~20% of rows if evenly distributed

**Optimization:**

- Index low-selectivity columns only as the **leading equality column of a composite index**, not on their own
- Example: `[isActive, lastRecordedAt]` uses `isActive` to narrow the search, then `lastRecordedAt` for the range scan and ordering

---

## Index Maintenance

### Prisma Indexes (Automatic)

Prisma migrations automatically create indexes defined in `schema.prisma`:

```prisma
model Location {
  // id and other fields omitted
  latitude  Decimal
  longitude Decimal

  @@index([latitude, longitude]) // Composite index
}
```

### Drizzle Indexes (Manual in Schema)

Drizzle indexes are defined in `schema.ts`:

```typescript
import { boolean, index, pgTable, text } from 'drizzle-orm/pg-core';

export const videos = pgTable('videos', {
  // other columns omitted
  directoryType: text('directory_type'),
  isValid: boolean('is_valid'),
  orientation: text('orientation'),
}, (table) => ({
  directoryValidOrientationIdx: index('idx_videos_directory_valid_orientation')
    .on(table.directoryType, table.isValid, table.orientation),
}));
```

### Index Size Monitoring

```sql
-- Check index sizes
-- (pg_stat_user_indexes exposes relname / indexrelname, aliased here for readability)
SELECT
  relname AS tablename,
  indexrelname AS indexname,
  pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE schemaname = 'public'
ORDER BY pg_relation_size(indexrelid) DESC;
```

### Unused Index Detection

```sql
-- Find indexes with zero scans (unused), excluding primary key and unique constraint indexes
SELECT
  schemaname,
  relname AS tablename,
  indexrelname AS indexname,
  idx_scan,
  pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE schemaname = 'public'
  AND idx_scan = 0
  AND indexrelid NOT IN (
    SELECT conindid FROM pg_constraint WHERE contype IN ('p', 'u')
  )
ORDER BY pg_relation_size(indexrelid) DESC;
```

---

## Performance Considerations

### Index Trade-offs

- **Pros:** Faster SELECT queries and relation lookups, enforces uniqueness
- **Cons:** Slower INSERT/UPDATE/DELETE (index must be updated), increased storage

**Rule of Thumb:**

- Index all foreign keys (JOIN performance; PostgreSQL does not index foreign key columns automatically)
- Declare unique constraints for data integrity (PostgreSQL backs each one with a unique index)
- Index columns used in WHERE clauses frequently
- Avoid indexing low-selectivity columns alone
- Avoid indexing large text fields (use full-text search instead)

### Query Planning

Use `EXPLAIN ANALYZE` to verify index usage:

```sql
EXPLAIN ANALYZE
SELECT * FROM locations
WHERE latitude BETWEEN 53.5 AND 53.6
  AND longitude BETWEEN -113.5 AND -113.4;

-- Output should show an Index Scan or Bitmap Index Scan using locations_latitude_longitude_idx
-- (on small tables the planner may still choose a sequential scan)
```

### Index Bloat

Over time, indexes can become bloated (unused space). The query below tracks index size and usage as a rough proxy; it does not measure bloat directly (the `pgstattuple` extension can):

```sql
SELECT
  schemaname,
  relname AS tablename,
  indexrelname AS indexname,
  pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
  idx_scan,
  idx_tup_read,
  idx_tup_fetch
FROM pg_stat_user_indexes
WHERE schemaname = 'public'
ORDER BY pg_relation_size(indexrelid) DESC;
```

**Fix bloat:** `REINDEX INDEX index_name;` (blocks writes to the table, and reads that use the index, while it runs; on PostgreSQL 12+, `REINDEX INDEX CONCURRENTLY index_name;` avoids blocking writes at the cost of a slower rebuild)

---

## Common Performance Issues

### Issue: Slow campaign email stats query

**Query:**

```sql
SELECT COUNT(*) FROM campaign_emails WHERE campaign_id = ?;
```

**Solution:** Already optimized (uses `campaignId` foreign key index)

### Issue: Slow location bounding box queries

**Query:**

```sql
SELECT * FROM locations
WHERE latitude > ? AND latitude < ?
  AND longitude > ? AND longitude < ?;
```

**Solution:** Already optimized (uses `[latitude, longitude]` composite index)

### Issue: Slow active session cleanup

**Query:**

```sql
SELECT * FROM tracking_sessions
WHERE is_active = true
  AND last_recorded_at < ?;
```

**Solution:** Already optimized (uses `[isActive, lastRecordedAt]` composite index)

### Issue: Slow template version history

**Query:**

```sql
SELECT * FROM email_template_versions
WHERE template_id = ?
ORDER BY created_at DESC
LIMIT 10;
```

**Solution:** Already optimized (uses `[templateId, createdAt(sort: Desc)]` composite index)

---

## Related Documentation

- [Database Overview](./index.md) — Complete ER diagram
- [Schema Reference](./schema.md) — All model fields
- [Migration Workflow](./migrations.md) — Creating indexes in migrations
- [Common Queries](./models/auth.md#common-queries) — Query examples with index usage
- [PostgreSQL Index Documentation](https://www.postgresql.org/docs/16/indexes.html)