# Index Strategy & Performance
## Overview
Changemaker Lite V2 uses strategic indexing across 33 models to optimize query performance. This document catalogs all indexes, explains their purpose, and provides query optimization guidance.
**Total Indexes:** 60+ (Prisma: 50+, Drizzle: 10+)
**Index Types:**
- **Unique indexes** — Enforce uniqueness constraints (email, slug, token, etc.)
- **Foreign key indexes** — Optimize JOIN operations (userId, campaignId, locationId, etc.)
- **Composite indexes** — Multi-column indexes for complex queries
- **Coordinate indexes** — Composite B-tree indexes on latitude/longitude for bounding-box queries (not true spatial indexes such as GiST)
---
## Index Catalog
### Auth & Users
#### User
- **Unique:** `email` — Login lookups (`WHERE email = ?`)
#### RefreshToken
- **Unique:** `token` — Refresh endpoint lookups (`WHERE token = ?`)
- **Foreign Key:** `userId` — User deletion cascades
---
### Influence
#### Campaign
- **Unique:** `slug` — Public campaign page lookups (`WHERE slug = ?`)
#### Representative
- **Non-unique:** `postalCode` — Postal code lookups (`WHERE postalCode = ?`)
#### CampaignEmail
- **Foreign Key:** `campaignId` — Campaign email stats (`JOIN campaign_emails ON campaign_id = ?`)
- **Non-unique:** `campaignSlug` — Slug-based queries
#### RepresentativeResponse
- **Foreign Key:** `campaignId` — Campaign response wall (`JOIN representative_responses ON campaign_id = ?`)
- **Non-unique:** `campaignSlug` — Slug-based queries
#### ResponseUpvote
- **Unique:** `[responseId, userId]` — Prevent duplicate upvotes from logged-in users
- **Unique:** `[responseId, upvotedIp]` — Prevent duplicate upvotes from same IP
#### CustomRecipient
- **Foreign Key:** `campaignId` — Campaign custom recipients (`JOIN custom_recipients ON campaign_id = ?`)
#### PostalCodeCache
- **Unique:** `postalCode` — Postal code cache lookups (`WHERE postal_code = ?`)
#### Call
- **Foreign Key:** `campaignId` — Campaign call tracking (`JOIN calls ON campaign_id = ?`)
---
### Map — Locations
#### Location
- **Unique:** `locGuid` — NAR location GUID lookups
- **Composite:** `[latitude, longitude]` — **Spatial queries** (nearby locations, bounding box searches)
- **Non-unique:** `postalCode` — Postal code filtering
**Query Optimization:**
```sql
-- Uses composite index for bounding box queries
SELECT * FROM locations
WHERE latitude BETWEEN ? AND ?
AND longitude BETWEEN ? AND ?;
```
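The bounds in the query above have to come from somewhere. A minimal sketch, assuming the caller starts from a centre point and a radius in kilometres; `boundingBox`, `toWhere`, and their parameter names are hypothetical and not part of the codebase. It relies on the approximation that one degree of latitude is about 111.32 km and that longitude degrees shrink with the cosine of the latitude.

```typescript
// Hypothetical helpers for illustration only.
interface BoundingBox {
  minLat: number;
  maxLat: number;
  minLng: number;
  maxLng: number;
}

const KM_PER_DEGREE_LAT = 111.32; // approximate

function boundingBox(lat: number, lng: number, radiusKm: number): BoundingBox {
  const latDelta = radiusKm / KM_PER_DEGREE_LAT;
  // One degree of longitude covers less distance away from the equator
  const lngDelta = radiusKm / (KM_PER_DEGREE_LAT * Math.cos((lat * Math.PI) / 180));
  return {
    minLat: lat - latDelta,
    maxLat: lat + latDelta,
    minLng: lng - lngDelta,
    maxLng: lng + lngDelta,
  };
}

// Produces the same filter shape as the latitude/longitude range queries in this document
function toWhere(box: BoundingBox) {
  return {
    latitude: { gte: box.minLat, lte: box.maxLat },
    longitude: { gte: box.minLng, lte: box.maxLng },
  };
}

const nearbyWhere = toWhere(boundingBox(53.55, -113.45, 5));
```

The sketch ignores the poles and the antimeridian, and a bounding box is a superset of the true circle, so callers that need an exact radius would still filter the results by distance.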
#### Address
- **Unique:** `addrGuid` — NAR address GUID lookups
- **Foreign Key:** `locationId` — Location addresses (`JOIN addresses ON location_id = ?`)
- **Composite:** `[locationId, unitNumber]` — **Unit lookups within building**
**Query Optimization:**
```sql
-- Uses composite index for unit-specific queries
SELECT * FROM addresses
WHERE location_id = ? AND unit_number = ?;
```
#### LocationHistory
- **Foreign Key:** `locationId` — Location history (`JOIN location_history ON location_id = ?`)
- **Foreign Key:** `userId` — User edit history (`JOIN location_history ON user_id = ?`)
- **Non-unique:** `createdAt` — **Temporal queries** (recent edits, audit trails)
---
### Map — Shifts & Cuts
#### Shift
- **Foreign Key:** `cutId` — Cut shifts (`JOIN shifts ON cut_id = ?`)
#### ShiftSignup
- **Unique:** `[shiftId, userEmail]` — **Prevent duplicate shift signups**
- **Foreign Key:** `shiftId` — Shift signups (`JOIN shift_signups ON shift_id = ?`)
---
### Canvassing
#### CanvassSession
- **Foreign Key:** `userId` — User canvass sessions (`JOIN canvass_sessions ON user_id = ?`)
- **Foreign Key:** `cutId` — Cut canvass sessions (`JOIN canvass_sessions ON cut_id = ?`)
- **Foreign Key:** `shiftId` — Shift canvass sessions (`JOIN canvass_sessions ON shift_id = ?`)
#### CanvassVisit
- **Foreign Key:** `addressId` — Address visit history (`JOIN canvass_visits ON address_id = ?`)
- **Foreign Key:** `userId` — User visit history (`JOIN canvass_visits ON user_id = ?`)
- **Foreign Key:** `shiftId` — Shift visits (`JOIN canvass_visits ON shift_id = ?`)
- **Foreign Key:** `sessionId` — Session visits (`JOIN canvass_visits ON session_id = ?`)
- **Non-unique:** `visitedAt` — **Temporal queries** (recent visits, activity feeds)
#### TrackingSession
- **Unique:** `canvassSessionId` — **One-to-one relationship** with CanvassSession
- **Foreign Key:** `userId` — User GPS sessions (`JOIN tracking_sessions ON user_id = ?`)
- **Non-unique:** `isActive` — Active session filtering (`WHERE is_active = true`)
- **Composite:** `[isActive, lastRecordedAt]` — **Session cleanup queries** (abandoned sessions)
**Query Optimization:**
```sql
-- Uses composite index for abandoned session cleanup
SELECT * FROM tracking_sessions
WHERE is_active = true
AND last_recorded_at < NOW() - INTERVAL '12 hours';
```
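From application code, the same cleanup filter can be built by computing the cutoff timestamp up front. A minimal sketch, assuming the 12-hour threshold from the SQL above; `staleSessionFilter` is a hypothetical helper, and the returned object only mirrors the shape of a Prisma `where` clause on `isActive` and `lastRecordedAt`.

```typescript
// Hypothetical helper for illustration only.
const MS_PER_HOUR = 60 * 60 * 1000;

function staleSessionFilter(now: Date, maxIdleHours: number) {
  // Equivalent of NOW() - INTERVAL '<maxIdleHours> hours'
  const cutoff = new Date(now.getTime() - maxIdleHours * MS_PER_HOUR);
  return {
    isActive: true, // equality on the leading index column
    lastRecordedAt: { lt: cutoff }, // range on the second index column
  };
}

const staleFilter = staleSessionFilter(new Date('2025-01-02T00:00:00Z'), 12);
// staleFilter.lastRecordedAt.lt.toISOString() === '2025-01-01T12:00:00.000Z'
```

Passing `now` in as a parameter rather than calling `new Date()` inside the helper keeps the cutoff deterministic in tests.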
#### TrackPoint
- **Composite:** `[trackingSessionId, recordedAt]` — **Temporal GPS queries** (session breadcrumb trail)
- **Non-unique:** `recordedAt` — Cross-session temporal queries
---
### Email Templates
#### EmailTemplate
- **Unique:** `key` — Template key lookups (`WHERE key = 'campaign-email'`)
- **Non-unique:** `category` — Category filtering (`WHERE category = 'INFLUENCE'`)
- **Non-unique:** `isActive` — Active template filtering (`WHERE is_active = true`)
#### EmailTemplateVariable
- **Unique:** `[templateId, key]` — **Unique variable keys per template**
- **Foreign Key:** `templateId` — Template variables (`JOIN email_template_variables ON template_id = ?`)
#### EmailTemplateVersion
- **Unique:** `[templateId, versionNumber]` — **Sequential version numbers per template**
- **Composite:** `[templateId, createdAt(sort: Desc)]` — **Recent version history**
**Query Optimization:**
```sql
-- Uses composite index for recent version queries
SELECT * FROM email_template_versions
WHERE template_id = ?
ORDER BY created_at DESC
LIMIT 10;
```
#### EmailTemplateTestLog
- **Composite:** `[templateId, sentAt(sort: Desc)]` — **Recent test logs**
---
### Landing Pages
#### LandingPage
- **Unique:** `slug` — Public page lookups (`WHERE slug = 'about'`)
---
### Media (Drizzle ORM)
#### videos
- **Unique:** `path` — File path lookups (`WHERE path = '/media/local/videos/file.mp4'`)
- **Non-unique:** `orientation` — Orientation filtering (`WHERE orientation = 'landscape'`)
- **Non-unique:** `producer` — Producer filtering (`WHERE producer = 'Studio A'`)
- **Non-unique:** `isValid` — Valid video filtering (`WHERE is_valid = true`)
- **Non-unique:** `directoryType` — Directory type filtering (`WHERE directory_type = 'studios'`)
- **Composite:** `[durationSeconds, fileSize, width, height]` — **Fingerprint matching** (duplicate detection)
- **Composite:** `[directoryType, isValid, orientation]` — **Common filtering pattern**
**Query Optimization:**
```sql
-- Uses composite index for common video library queries
SELECT * FROM videos
WHERE directory_type = 'studios'
AND is_valid = true
AND orientation = 'landscape';
```
#### jobs
- **Composite:** `[status, priority, createdAt]` — **Job queue processing**
- **Composite:** `[resourceCategory, status]` — **Resource-based filtering**
- **Non-unique:** `pipelineId` — Pipeline job filtering
**Query Optimization:**
```sql
-- Uses composite index for job queue queries
SELECT * FROM jobs
WHERE status = 'pending'
ORDER BY priority ASC, created_at ASC
LIMIT 10;
```
---
## Query Optimization Patterns
### 1. Use Indexes for WHERE Clauses
```typescript
// ✅ Uses email unique index
await prisma.user.findUnique({ where: { email: 'user@example.com' } });
// ❌ Full table scan (no index on name)
await prisma.user.findMany({ where: { name: 'John' } });
```
### 2. Use Composite Indexes for Multi-Column Filters
```typescript
// ✅ Uses [latitude, longitude] composite index: the latitude range bounds
//    the scan and the longitude range is checked against the index entries
await prisma.location.findMany({
  where: {
    latitude: { gte: 53.5, lte: 53.6 },
    longitude: { gte: -113.5, lte: -113.4 },
  },
});
// ❌ Less efficient: longitude is not the leading column, so the
//    composite index cannot narrow the scan
await prisma.location.findMany({
  where: {
    longitude: { gte: -113.5, lte: -113.4 },
  },
});
```
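Column order decides how much of a composite B-tree index a query can use. The sketch below is a deliberately simplified model of the rule described in the PostgreSQL documentation: equality filters on the leading columns, plus a range filter on the first column without an equality filter, limit the portion of the index that is scanned. `usablePrefix` is a hypothetical helper, not a query planner; columns beyond the returned prefix can still be checked inside the index but do not shrink the scan.

```typescript
// Simplified model for illustration only; the real planner weighs many more factors.
function usablePrefix(
  indexColumns: readonly string[],
  equality: ReadonlySet<string>,
  range: ReadonlySet<string>,
): string[] {
  const used: string[] = [];
  for (const col of indexColumns) {
    if (equality.has(col)) {
      used.push(col); // equality keeps the scan narrow, so move on to the next column
      continue;
    }
    if (range.has(col)) {
      used.push(col); // the first range column still bounds the scan
    }
    break; // nothing after a range column or a gap can bound it further
  }
  return used;
}

const noColumns = new Set<string>();

// Session cleanup: equality on isActive, range on lastRecordedAt
const cleanup = usablePrefix(
  ['isActive', 'lastRecordedAt'],
  new Set(['isActive']),
  new Set(['lastRecordedAt']),
); // ['isActive', 'lastRecordedAt']

// Bounding box: two ranges, so only the leading latitude range bounds the scan
const bbox = usablePrefix(
  ['latitude', 'longitude'],
  noColumns,
  new Set(['latitude', 'longitude']),
); // ['latitude']

// Video filter that skips directoryType: the leading column is missing
const skipped = usablePrefix(
  ['directoryType', 'isValid', 'orientation'],
  new Set(['isValid', 'orientation']),
  noColumns,
); // []
```

Under this model, a filter that omits the leading column gets no narrowing from the index at all, which is why composite indexes are ordered around the most common query shape.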
### 3. Use Foreign Key Indexes for JOINs
```typescript
// ✅ Related emails are fetched by campaignId, which is indexed
await prisma.campaign.findUnique({
  where: { id: campaignId },
  include: { emails: true }, // lookup on campaign_emails uses the campaignId index
});
// ❌ N+1 pattern: one extra query per campaign inside a loop
const campaigns = await prisma.campaign.findMany();
for (const campaign of campaigns) {
  await prisma.campaignEmail.findMany({ where: { campaignId: campaign.id } });
}
```
### 4. Use Unique Indexes for Deduplication
```typescript
// ✅ Uses [responseId, userId] unique index (database-level check)
// Throws Prisma error P2002 (unique constraint failed) if the user already upvoted
await prisma.responseUpvote.create({
  data: { responseId, userId, upvotedIp },
});
// ❌ Application-level check (race condition between the read and the write)
const existing = await prisma.responseUpvote.findFirst({
  where: { responseId, userId },
});
if (existing) throw new Error('Already upvoted');
await prisma.responseUpvote.create({ data: { responseId, userId } });
```
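When the unique index rejects a duplicate, the caller usually wants to treat that as "already done" rather than as a failure. A minimal sketch, assuming the error carries Prisma's `P2002` code (unique constraint failed); `isUniqueViolation` and `createOnce` are hypothetical helpers, and the check is structural so the example runs without importing `@prisma/client`.

```typescript
// Hypothetical helpers for illustration only.
function isUniqueViolation(err: unknown): boolean {
  return (
    typeof err === 'object' &&
    err !== null &&
    (err as { code?: unknown }).code === 'P2002'
  );
}

// Runs the insert; returns null when the unique index reports a duplicate
async function createOnce<T>(create: () => Promise<T>): Promise<T | null> {
  try {
    return await create();
  } catch (err) {
    if (isUniqueViolation(err)) return null; // duplicate: nothing to do
    throw err; // anything else is a real failure
  }
}

// Intended usage (assumes a `prisma` client in scope):
// const upvote = await createOnce(() =>
//   prisma.responseUpvote.create({ data: { responseId, userId, upvotedIp } }),
// );
// if (upvote === null) { /* user had already upvoted */ }
```

Because the database performs the check atomically, two concurrent requests cannot both succeed, which is the guarantee the application-level `findFirst` check lacks.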
### 5. Use Temporal Indexes for Date Filtering
```typescript
// ✅ Uses createdAt index
await prisma.locationHistory.findMany({
  where: {
    createdAt: { gte: new Date('2025-01-01') },
  },
  orderBy: { createdAt: 'desc' },
  take: 100,
});
// ❌ Full table scan: oldValue has no index, and `contains` becomes
//    LIKE '%Calgary%', which a B-tree index could not serve anyway
await prisma.locationHistory.findMany({
  where: {
    oldValue: { contains: 'Calgary' },
  },
});
```
---
## Index Selectivity
**Selectivity** = distinct values divided by total rows in the indexed column. The closer it is to 100%, the fewer rows each lookup returns and the more the index helps.
### High Selectivity (Good)
- **email** (User) — 100% unique (1 user per email)
- **token** (RefreshToken) — 100% unique (1 token per record)
- **slug** (Campaign, LandingPage) — 100% unique (1 record per slug)
- **[responseId, userId]** (ResponseUpvote) — 100% unique (1 upvote per user per response)
### Medium Selectivity (Okay)
- **postalCode** (Location) — ~50% unique (multiple locations per postal code)
- **campaignId** (CampaignEmail) — Each value matches 100s of emails, still a small share of the table
- **directoryType** (videos) — 9 directory types, each matching ~11% of rows on average
### Low Selectivity (Poor for filtering alone, useful as a leading equality column in a composite index)
- **isActive** (TrackingSession) — 2 values (active vs inactive), each matching ~50% of rows if evenly split
- **status** (Campaign) — 4 statuses (DRAFT, ACTIVE, PAUSED, ARCHIVED), each matching ~25% of rows on average
- **role** (User) — 5 roles, each matching ~20% of rows on average
**Optimization:**
- Index low-selectivity columns only as part of a composite index, paired with a more selective column
- Put equality-filtered columns before range-filtered columns
- Example: `[isActive, lastRecordedAt]` matches `isActive` by equality, then scans the `lastRecordedAt` range within that group
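Both ways of describing a column can be computed from a sample of its values. A minimal sketch, taking selectivity as distinct values divided by row count and, as a complementary measure, the average share of rows that a single value matches; `selectivity` and `avgMatchFraction` are hypothetical helpers, and the sample arrays are made up for illustration.

```typescript
// Hypothetical helpers for illustration only.

// Distinct values divided by total rows: 1 means every row is unique
function selectivity<T>(values: readonly T[]): number {
  if (values.length === 0) return 0;
  return new Set(values).size / values.length;
}

// Average share of rows matched by one value, assuming an even distribution
function avgMatchFraction<T>(values: readonly T[]): number {
  const distinct = new Set(values).size;
  return distinct === 0 ? 0 : 1 / distinct;
}

// Made-up samples
const sampleEmails = ['a@example.com', 'b@example.com', 'c@example.com'];
const sampleFlags = [true, false, true, false, true, false, true, false];

const emailSelectivity = selectivity(sampleEmails); // 1
const flagSelectivity = selectivity(sampleFlags); // 0.25 (2 distinct values over 8 rows)
const flagMatchShare = avgMatchFraction(sampleFlags); // 0.5 (each value matches half the rows)
```

A boolean column's selectivity keeps falling as the table grows, while its match share stays near one half, which is why such a column rarely justifies an index of its own.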
---
## Index Maintenance
### Prisma Indexes (Automatic)
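Prisma Migrate generates the `CREATE INDEX` statements for every `@@index`, `@@unique`, and `@unique` declared in `schema.prisma`. PostgreSQL does not index foreign-key columns automatically, so foreign-key indexes must be declared with `@@index` as well. A composite index looks like this:
```prisma
model Location {
  // ...other fields omitted
  latitude  Decimal
  longitude Decimal

  @@index([latitude, longitude]) // Composite index
}
```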
### Drizzle Indexes (Manual in Schema)
Drizzle indexes are defined alongside the table in `schema.ts`:
```typescript
import { boolean, index, pgTable, text } from 'drizzle-orm/pg-core';

export const videos = pgTable('videos', {
  // ...other columns omitted
  directoryType: text('directory_type'),
  isValid: boolean('is_valid'),
  orientation: text('orientation'),
}, (table) => ({
  directoryValidOrientationIdx: index('idx_videos_directory_valid_orientation')
    .on(table.directoryType, table.isValid, table.orientation),
}));
```
### Index Size Monitoring
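```sql
-- Check index sizes
-- (pg_stat_user_indexes exposes relname/indexrelname, aliased here for readability)
SELECT
  relname AS tablename,
  indexrelname AS indexname,
  pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE schemaname = 'public'
ORDER BY pg_relation_size(indexrelid) DESC;
```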
### Unused Index Detection
```sql
-- Find indexes with zero scans since statistics were last reset (likely unused)
-- Indexes backing primary key and unique constraints are excluded
SELECT
  schemaname,
  relname AS tablename,
  indexrelname AS indexname,
  idx_scan,
  pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE schemaname = 'public'
  AND idx_scan = 0
  AND indexrelid NOT IN (
    SELECT conindid FROM pg_constraint WHERE contype IN ('p', 'u')
  )
ORDER BY pg_relation_size(indexrelid) DESC;
```
---
## Performance Considerations
### Index Trade-offs
- **Pros:** Faster SELECT queries and JOINs, enforces uniqueness
- **Cons:** Slower INSERT/UPDATE/DELETE (index must be updated), increased storage
**Rule of Thumb:**
- Index all foreign keys (JOIN performance)
- Index all unique constraints (data integrity)
- Index columns used in WHERE clauses frequently
- Avoid indexing low-selectivity columns alone
- Avoid indexing large text fields (use full-text search instead)
### Query Planning
Use `EXPLAIN ANALYZE` to verify index usage:
```sql
EXPLAIN ANALYZE
SELECT * FROM locations
WHERE latitude BETWEEN 53.5 AND 53.6
AND longitude BETWEEN -113.5 AND -113.4;
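-- On a large enough table the plan should show an Index Scan or Bitmap Index Scan
-- using locations_latitude_longitude_idx; on small tables the planner may pick a Seq Scan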
```
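Plan checks like this can be automated in a test or a CI script by classifying the plan text. A minimal sketch, assuming the plan has already been fetched as a single string (for example via a raw `EXPLAIN` query); `scanKind` is a hypothetical helper, and the sample plan lines are illustrative rather than captured output.

```typescript
// Hypothetical helper for illustration only.
type ScanKind = 'index' | 'bitmap' | 'seq' | 'unknown';

function scanKind(plan: string): ScanKind {
  // Check the bitmap variant first: "Bitmap Index Scan" also contains "Index Scan"
  if (/Bitmap Index Scan/.test(plan)) return 'bitmap';
  if (/Index (Only )?Scan/.test(plan)) return 'index';
  if (/Seq Scan/.test(plan)) return 'seq';
  return 'unknown';
}

// Illustrative plan lines, not real EXPLAIN output
const indexPlan = 'Index Scan using locations_latitude_longitude_idx on locations';
const seqPlan = 'Seq Scan on locations';

const indexPlanKind = scanKind(indexPlan); // 'index'
const seqPlanKind = scanKind(seqPlan); // 'seq'
```

The classifier looks only at the first matching node type, so plans that mix several scan nodes (joins, for instance) would need a per-node check instead.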
### Index Bloat
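Over time, updates and deletes leave dead space in indexes (bloat). The query below does not measure bloat directly; it lists each index's size next to its usage counters, so an index that keeps growing while its table does not is a candidate for a rebuild:
```sql
SELECT
  schemaname,
  relname AS tablename,
  indexrelname AS indexname,
  pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
  idx_scan,
  idx_tup_read,
  idx_tup_fetch
FROM pg_stat_user_indexes
WHERE schemaname = 'public'
ORDER BY pg_relation_size(indexrelid) DESC;
```
**Fix bloat:** `REINDEX INDEX index_name;` (blocks writes to the table while it runs). On PostgreSQL 12 and later, `REINDEX INDEX CONCURRENTLY index_name;` rebuilds the index without blocking writes.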
---
## Common Performance Issues
### Issue: Slow campaign email stats query
**Query:**
```sql
SELECT COUNT(*) FROM campaign_emails WHERE campaign_id = ?;
```
**Solution:** Already optimized (uses `campaignId` foreign key index)
### Issue: Slow location bounding box queries
**Query:**
```sql
SELECT * FROM locations WHERE latitude > ? AND latitude < ? AND longitude > ? AND longitude < ?;
```
**Solution:** Already optimized (uses `[latitude, longitude]` composite index)
### Issue: Slow active session cleanup
**Query:**
```sql
SELECT * FROM tracking_sessions WHERE is_active = true AND last_recorded_at < ?;
```
**Solution:** Already optimized (uses `[isActive, lastRecordedAt]` composite index)
### Issue: Slow template version history
**Query:**
```sql
SELECT * FROM email_template_versions WHERE template_id = ? ORDER BY created_at DESC LIMIT 10;
```
**Solution:** Already optimized (uses `[templateId, createdAt(sort: Desc)]` composite index)
---
## Related Documentation
- [Database Overview](./index.md) — Complete ER diagram
- [Schema Reference](./schema.md) — All model fields
- [Migration Workflow](./migrations.md) — Creating indexes in migrations
- [Common Queries](./models/auth.md#common-queries) — Query examples with index usage
- [PostgreSQL Index Documentation](https://www.postgresql.org/docs/16/indexes.html)