# NAR Import System ## Overview The National Address Register (NAR) import system enables bulk import of Canadian electoral data from Elections Canada. The system supports the 2025 NAR format with server-side streaming import, coordinate projection conversion, and comprehensive filtering options. **Key Features:** - Server-side streaming import (handles large datasets) - NAR 2025 format support (BG_X/BG_Y Lambert projection) - Address + Location file joining on LOC_GUID - Proj4 coordinate conversion (EPSG:3347 → WGS84) - Province selector (13 provinces/territories) - Filtering: city, postal code, cut boundary, residential-only - Multi-part file handling (large provinces) - Progress tracking and error reporting - Import statistics and validation **Use Cases:** - Initial campaign database setup - Electoral district targeting - NAR data updates (new redistribution) - Multi-region campaign expansion - Address database verification **Architecture Highlights:** - Streaming CSV parser (avoids memory limits) - File-based LOC_GUID join - Real-time coordinate projection - Point-in-polygon cut filtering - Transaction batching (500 records/commit) - Duplicate prevention via UPSERT ## Architecture ```mermaid flowchart TB subgraph Admin Interface Admin[Admin User] LocationsPage[LocationsPage - NAR Tab] end subgraph API Layer DatasetsAPI["/api/locations/nar/datasets"] ImportAPI["/api/locations/nar/import"] end subgraph NAR Import Service Scanner[File Scanner] Reader[CSV Stream Reader] Joiner[Address+Location Joiner] Converter[Coordinate Converter] Filter[Filter Pipeline] Importer[Bulk Importer] end subgraph File System DataDir[/data/NAR Files] AddressFiles[Address_XX_part_*.csv] LocationFiles[Location_XX.csv] end subgraph Database LocationsDB[(Locations)] AddressesDB[(Addresses)] end subgraph External Services Proj4[Proj4 Library] EPSG3347[EPSG:3347 Definition] end Admin --> LocationsPage LocationsPage --> DatasetsAPI LocationsPage --> ImportAPI DatasetsAPI --> Scanner Scanner 
--> DataDir ImportAPI --> Reader Reader --> AddressFiles Reader --> LocationFiles Reader --> Joiner Joiner --> Converter Converter --> Proj4 Proj4 --> EPSG3347 Converter --> Filter Filter --> Importer Importer --> LocationsDB Importer --> AddressesDB ``` **Data Flow:** 1. **Dataset Discovery:** - Scan /data directory for NAR CSV files - Group by province code (10-62) - Identify multi-part Address files - Return available datasets 2. **Import Initiation:** - Admin selects province + filters - API creates import job - Begins streaming CSV files 3. **File Processing:** - Read Address files (all parts sequentially) - Read Location file (parallel) - Join on LOC_GUID (in-memory map) 4. **Coordinate Conversion:** - Extract BG_X/BG_Y from Location file - Convert EPSG:3347 → WGS84 using Proj4 - Fallback to BG_LATITUDE/BG_LONGITUDE if conversion fails 5. **Filtering:** - City filter (exact match on MUNICIPALITY) - Postal code filter (prefix match) - Cut filter (point-in-polygon) - Residential filter (BU_USE = 1) 6. **Database Import:** - UPSERT Locations by locGuid (prevent duplicates) - INSERT Addresses with foreign key - Batch commits (500 records) - Track progress and errors ## NAR File Format ### File Structure **Directory Layout:** ``` /data/ ├── Address_10.csv # Newfoundland ├── Address_11.csv # PEI ├── Address_12.csv # Nova Scotia ├── Address_13.csv # New Brunswick ├── Address_24_part_1.csv # Quebec (multi-part) ├── Address_24_part_2.csv ├── Address_24_part_3.csv ├── Address_24_part_4.csv ├── Address_24_part_5.csv ├── Address_24_part_6.csv ├── Address_35_part_1.csv # Ontario (multi-part) ├── Address_35_part_2.csv ├── ... ├── Location_10.csv ├── Location_11.csv ├── Location_12.csv ├── Location_13.csv ├── Location_24.csv ├── Location_35.csv └── ... 
``` ### Address File Schema **File: Address_XX_part_Y.csv** ```csv ADDR_GUID,LOC_GUID,CIVIC_NO,OFFICIAL_STREET_NAME,POSTAL_CODE,MUNICIPALITY,PROVINCE_CODE {uuid},{uuid},123,MAIN ST,M5H2N2,TORONTO,35 {uuid},{uuid},125,MAIN ST,M5H2N2,TORONTO,35 {uuid},{uuid},127,MAIN ST,M5H2N2,TORONTO,35 ``` **Key Fields:** | Field | Type | Description | Example | |-------|------|-------------|---------| | ADDR_GUID | UUID | Unique address identifier | `{12345678-...}` | | LOC_GUID | UUID | Location identifier (FK) | `{87654321-...}` | | CIVIC_NO | String | Street number | `123`, `123A`, `123-125` | | OFFICIAL_STREET_NAME | String | Street name (uppercase) | `MAIN ST`, `YONGE ST` | | POSTAL_CODE | String | Canadian postal code (no space) | `M5H2N2`, `K1A0B1` | | MUNICIPALITY | String | City/town name | `TORONTO`, `OTTAWA` | | PROVINCE_CODE | Integer | Province code (10-62) | `35` (Ontario) | **Record Count:** - Small provinces: 10k-50k addresses - Medium provinces: 50k-200k addresses - Large provinces: 200k-1M+ addresses (multi-part files) ### Location File Schema **File: Location_XX.csv** ```csv LOC_GUID,BG_LATITUDE,BG_LONGITUDE,BG_X,BG_Y,FED_NUM,BU_USE,MUNICIPALITY {uuid},43.6532,-79.3832,1234567.89,234567.89,35001,1,TORONTO {uuid},43.6540,-79.3825,1234600.00,234600.00,35001,1,TORONTO ``` **Key Fields:** | Field | Type | Description | Example | |-------|------|-------------|---------| | LOC_GUID | UUID | Unique location identifier | `{87654321-...}` | | BG_LATITUDE | Float | Latitude (WGS84) | `43.6532` | | BG_LONGITUDE | Float | Longitude (WGS84) | `-79.3832` | | BG_X | Float | X coord (EPSG:3347 Lambert) | `1234567.89` | | BG_Y | Float | Y coord (EPSG:3347 Lambert) | `234567.89` | | FED_NUM | String | Federal electoral district | `35001`, `24050` | | BU_USE | Integer | Building use code | `1` = Residential | | MUNICIPALITY | String | City/town name | `TORONTO` | **Coordinate Systems:** - **BG_LATITUDE/BG_LONGITUDE:** WGS84 decimal degrees (EPSG:4326) - **BG_X/BG_Y:** Statistics 
Canada Lambert Conformal Conic (EPSG:3347) - **2025 NAR Change:** Primary coordinates shifted from lat/lng to BG_X/BG_Y **Building Use Codes:** | Code | Description | |------|-------------| | 1 | Residential | | 2 | Commercial | | 3 | Industrial | | 4 | Institutional | | 5 | Parks/Recreation | | 9 | Other | ## Database Models ### Location Model Extensions ```prisma model Location { id Int @id @default(autoincrement()) address String latitude Float? longitude Float? postalCode String? province String? // NAR-specific fields locGuid String? @unique // NAR LOC_GUID (UUID) federalDistrict String? // NAR FED_NUM buildingUse Int? // NAR BU_USE code municipality String? // NAR MUNICIPALITY // Geocoding metadata (populated during import) geocodeConfidence Int? @default(100) // NAR = high confidence geocodeProvider String? @default("NAR") geocodedAt DateTime? addresses Address[] createdAt DateTime @default(now()) updatedAt DateTime @updatedAt @@index([locGuid]) @@index([federalDistrict]) @@index([buildingUse]) @@index([postalCode]) } ``` ### Address Model Extensions ```prisma model Address { id Int @id @default(autoincrement()) locationId Int location Location @relation(fields: [locationId], references: [id], onDelete: Cascade) // NAR-specific fields addrGuid String? @unique // NAR ADDR_GUID (UUID) unitNumber String? // NAR CIVIC_NO (if multi-unit) // Voter data (future) firstName String? lastName String? supportLevel Int? 
createdAt DateTime @default(now()) updatedAt DateTime @updatedAt @@index([locationId]) @@index([addrGuid]) } ``` **UPSERT Strategy:** ```typescript // Prevent duplicates on re-import const location = await prisma.location.upsert({ where: { locGuid: narRecord.LOC_GUID }, update: { address: narRecord.addressString, latitude: coords.latitude, longitude: coords.longitude, postalCode: narRecord.POSTAL_CODE, province: provinceMap[narRecord.PROVINCE_CODE], federalDistrict: narRecord.FED_NUM, buildingUse: narRecord.BU_USE, municipality: narRecord.MUNICIPALITY, geocodeProvider: 'NAR', geocodedAt: new Date() }, create: { locGuid: narRecord.LOC_GUID, address: narRecord.addressString, latitude: coords.latitude, longitude: coords.longitude, postalCode: narRecord.POSTAL_CODE, province: provinceMap[narRecord.PROVINCE_CODE], federalDistrict: narRecord.FED_NUM, buildingUse: narRecord.BU_USE, municipality: narRecord.MUNICIPALITY, geocodeConfidence: 100, geocodeProvider: 'NAR', geocodedAt: new Date() } }); ``` ## API Endpoints ### GET /api/locations/nar/datasets Scan NAR data directory and return available province datasets. 
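The `estimatedRecords` figure in this endpoint's response comes from an `estimateRecordCount` helper whose body does not appear in this document. One cheap way to implement it is to extrapolate from total file size instead of scanning every row — a minimal sketch under that assumption (`AVG_ROW_BYTES` is a guessed constant, and the signature here is illustrative, not the service's actual method):

```typescript
import fs from 'fs/promises';
import path from 'path';

// Assumption: a typical NAR address row serializes to ~120 bytes of CSV.
const AVG_ROW_BYTES = 120;

// Pure helper: total bytes → estimated row count
const bytesToRows = (totalBytes: number): number =>
  Math.round(totalBytes / AVG_ROW_BYTES);

// Sum the sizes of all part files and extrapolate, avoiding a full scan
// of multi-gigabyte CSVs.
async function estimateRecordCount(dataDir: string, addressFiles: string[]): Promise<number> {
  let totalBytes = 0;
  for (const file of addressFiles) {
    totalBytes += (await fs.stat(path.join(dataDir, file))).size;
  }
  return bytesToRows(totalBytes);
}

console.log(bytesToRows(1_800_000)); // 15000
```

The estimate only needs to be the right order of magnitude: it feeds the UI and ETA display, not import correctness.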
**Authentication:** Required (SUPER_ADMIN, MAP_ADMIN)

**Response:**

```json
{
  "datasets": [
    {
      "provinceCode": "10",
      "provinceName": "Newfoundland and Labrador",
      "addressFiles": ["Address_10.csv"],
      "locationFile": "Location_10.csv",
      "addressFileCount": 1,
      "estimatedRecords": 15000,
      "lastModified": "2025-01-15T00:00:00Z"
    },
    {
      "provinceCode": "24",
      "provinceName": "Quebec",
      "addressFiles": [
        "Address_24_part_1.csv",
        "Address_24_part_2.csv",
        "Address_24_part_3.csv",
        "Address_24_part_4.csv",
        "Address_24_part_5.csv",
        "Address_24_part_6.csv"
      ],
      "locationFile": "Location_24.csv",
      "addressFileCount": 6,
      "estimatedRecords": 850000,
      "lastModified": "2025-01-20T00:00:00Z"
    },
    {
      "provinceCode": "35",
      "provinceName": "Ontario",
      "addressFiles": [
        "Address_35_part_1.csv",
        "Address_35_part_2.csv",
        "Address_35_part_3.csv"
      ],
      "locationFile": "Location_35.csv",
      "addressFileCount": 3,
      "estimatedRecords": 1200000,
      "lastModified": "2025-01-22T00:00:00Z"
    }
  ],
  "dataDir": "/data",
  "totalDatasets": 13
}
```

**Implementation:**

```typescript
// nar-import.service.ts
async scanDatasets(): Promise<NARDataset[]> {
  const files = await fs.readdir(NAR_DATA_DIR);

  // Group files by province code
  const provinceGroups: Record<string, { address: string[]; location: string }> = {};
  files.forEach(file => {
    const addressMatch = file.match(/^Address_(\d+)(?:_part_\d+)?\.csv$/);
    const locationMatch = file.match(/^Location_(\d+)\.csv$/);

    if (addressMatch) {
      const code = addressMatch[1];
      if (!provinceGroups[code]) provinceGroups[code] = { address: [], location: '' };
      provinceGroups[code].address.push(file);
    } else if (locationMatch) {
      const code = locationMatch[1];
      if (!provinceGroups[code]) provinceGroups[code] = { address: [], location: '' };
      provinceGroups[code].location = file;
    }
  });

  // Build dataset objects
  const datasets: NARDataset[] = [];
  for (const [code, group] of Object.entries(provinceGroups)) {
    if (group.address.length === 0 || !group.location) continue;

    const stats = await fs.stat(path.join(NAR_DATA_DIR, group.location));
    datasets.push({
      provinceCode: code,
provinceName: PROVINCE_NAMES[code], addressFiles: group.address.sort(), locationFile: group.location, addressFileCount: group.address.length, estimatedRecords: await this.estimateRecordCount(group.address), lastModified: stats.mtime.toISOString() }); } return datasets.sort((a, b) => a.provinceCode.localeCompare(b.provinceCode)); } ``` ### POST /api/locations/nar/import Start NAR import job with filters. **Authentication:** Required (SUPER_ADMIN, MAP_ADMIN) **Request Body:** ```json { "provinceCode": "35", "city": "TORONTO", "postalCodePrefix": "M5", "cutId": 42, "residentialOnly": true } ``` **Parameters:** | Parameter | Type | Required | Description | |-----------|------|----------|-------------| | provinceCode | string | Yes | Province code (10-62) | | city | string | No | Filter by MUNICIPALITY (exact match, uppercase) | | postalCodePrefix | string | No | Filter by postal code prefix (e.g., "M5", "K1A") | | cutId | number | No | Filter by cut boundary (point-in-polygon) | | residentialOnly | boolean | No | Only import BU_USE = 1 (default: false) | **Response:** ```json { "jobId": "nar-import-35-20250213-103000", "status": "processing", "provinceCode": "35", "provinceName": "Ontario", "filters": { "city": "TORONTO", "postalCodePrefix": "M5", "cutId": 42, "residentialOnly": true }, "startedAt": "2025-02-13T10:30:00Z", "estimatedCompletion": "2025-02-13T10:45:00Z" } ``` ### GET /api/locations/nar/import/:jobId Check import job progress. 
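The server computes `estimatedCompletion`, but a client can derive the same figure from `startedAt` and the progress counters — useful for a custom progress UI. A minimal sketch, assuming a roughly constant processing rate (the function name is illustrative):

```typescript
// Extrapolate a completion time from elapsed time and records processed.
function estimateCompletion(
  startedAt: string,
  processed: number,
  total: number,
  now: Date = new Date()
): Date | null {
  if (processed <= 0 || total <= 0 || processed > total) return null; // no rate info yet
  const elapsedMs = now.getTime() - new Date(startedAt).getTime();
  if (elapsedMs <= 0) return null;
  const msPerRecord = elapsedMs / processed;
  return new Date(now.getTime() + Math.round((total - processed) * msPerRecord));
}

// 600k of 1.2M rows done 7.5 minutes after start → ~7.5 minutes remain
const eta = estimateCompletion(
  '2025-02-13T10:30:00Z',
  600_000,
  1_200_000,
  new Date('2025-02-13T10:37:30Z')
);
console.log(eta?.toISOString()); // 2025-02-13T10:45:00.000Z
```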
**Authentication:** Required (SUPER_ADMIN, MAP_ADMIN) **Response (In Progress):** ```json { "jobId": "nar-import-35-20250213-103000", "status": "processing", "progress": { "total": 1200000, "processed": 600000, "imported": 580000, "skipped": 15000, "errors": 5000, "percent": 50.0 }, "currentFile": "Address_35_part_2.csv", "startedAt": "2025-02-13T10:30:00Z", "estimatedCompletion": "2025-02-13T10:45:00Z" } ``` **Response (Complete):** ```json { "jobId": "nar-import-35-20250213-103000", "status": "completed", "result": { "total": 1200000, "processed": 1200000, "imported": 1150000, "skipped": 45000, "errors": 5000, "percent": 100.0 }, "statistics": { "locationsCreated": 800000, "locationsUpdated": 350000, "addressesCreated": 1150000, "avgConfidence": 100, "processingTime": "14m 32s" }, "startedAt": "2025-02-13T10:30:00Z", "completedAt": "2025-02-13T10:44:32Z" } ``` **Status Values:** - `queued`: Job created, waiting to start - `processing`: Import in progress - `completed`: Import finished successfully - `failed`: Import failed with errors - `cancelled`: Import cancelled by user ## Configuration ### Environment Variables | Variable | Type | Default | Description | |----------|------|---------|-------------| | NAR_DATA_DIR | string | /data | Directory containing NAR CSV files | | NAR_BATCH_SIZE | number | 500 | Records per database transaction | | NAR_IMPORT_TIMEOUT | number | 3600000 | Import timeout in ms (1 hour) | ### Province Codes Complete mapping of NAR province codes: ```typescript // nar-import.service.ts const PROVINCE_NAMES: Record = { '10': 'Newfoundland and Labrador', '11': 'Prince Edward Island', '12': 'Nova Scotia', '13': 'New Brunswick', '24': 'Quebec', '35': 'Ontario', '46': 'Manitoba', '47': 'Saskatchewan', '48': 'Alberta', '59': 'British Columbia', '60': 'Yukon', '61': 'Northwest Territories', '62': 'Nunavut' }; const PROVINCE_ABBREVIATIONS: Record = { '10': 'NL', '11': 'PE', '12': 'NS', '13': 'NB', '24': 'QC', '35': 'ON', '46': 'MB', '47': 'SK', 
'48': 'AB', '59': 'BC', '60': 'YT', '61': 'NT', '62': 'NU'
};
```

### Coordinate Projection

**EPSG:3347 Definition (Statistics Canada Lambert Conformal Conic):**

```typescript
import proj4 from 'proj4';

// Define EPSG:3347 projection
proj4.defs('EPSG:3347',
  '+proj=lcc +lat_1=49 +lat_2=77 +lat_0=63.390675 +lon_0=-91.86666666666666 ' +
  '+x_0=6200000 +y_0=3000000 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs');

// Convert function
const convertCoordinates = (bgX: number, bgY: number): [number, number] => {
  // Input: [X, Y] in EPSG:3347 (meters)
  // Output: [longitude, latitude] in WGS84 (degrees)
  return proj4('EPSG:3347', 'WGS84', [bgX, bgY]);
};
```

**Projection Parameters:**
- **Type:** Lambert Conformal Conic
- **Standard Parallels:** 49°N, 77°N
- **Central Meridian:** -91.866667°
- **Origin:** 63.390675°N, -91.866667°W
- **False Easting:** 6,200,000 m
- **False Northing:** 3,000,000 m
- **Ellipsoid:** GRS80
- **Units:** Meters

**Example Conversion:**

Because of the 6,200,000 m false easting, Canadian BG_X values are in the millions of metres. The simplest self-checking example round-trips a known WGS84 point:

```typescript
// Toronto City Hall (WGS84): lng -79.3832, lat 43.6532
const [bgX, bgY] = proj4('WGS84', 'EPSG:3347', [-79.3832, 43.6532]);
const [lng, lat] = proj4('EPSG:3347', 'WGS84', [bgX, bgY]);
// Round trip recovers the input: lng ≈ -79.3832, lat ≈ 43.6532
```

## Import Workflow

### Prepare NAR Files

**Step 1: Download NAR Data**

1. Visit Elections Canada NAR portal: https://www.elections.ca/NAR
2. Select "2025 National Address Register"
3. Download province-specific CSV files
4.
Extract ZIP archives **Step 2: Upload Files to Server** ```bash # Create data directory if not exists mkdir -p /path/to/data # Upload files via SCP scp Address_35_*.csv user@server:/path/to/data/ scp Location_35.csv user@server:/path/to/data/ # Or mount volume in Docker # docker-compose.yml: volumes: - ./data:/data:ro ``` **Step 3: Verify File Integrity** ```bash # Check file count ls -l /path/to/data/Address_35_*.csv | wc -l # Check Location file exists ls -l /path/to/data/Location_35.csv # Sample first few rows head -5 /path/to/data/Address_35_part_1.csv head -5 /path/to/data/Location_35.csv ``` ### Run Import via Admin UI **Step 1: Navigate to NAR Import Tab** 1. Log in as SUPER_ADMIN or MAP_ADMIN 2. Click **Map** → **Locations** in sidebar 3. Click **NAR Import** tab 4. Available datasets load automatically **Step 2: Select Province** ```plaintext ┌─────────────────────────────────────────┐ │ Available NAR Datasets │ ├─────────────────────────────────────────┤ │ Province │ Files │ Records │ ├──────────────────┼───────┼──────────────┤ │ Ontario (35) │ 3 │ 1,200,000 │ │ Quebec (24) │ 6 │ 850,000 │ │ Alberta (48) │ 2 │ 450,000 │ └──────────────────┴───────┴──────────────┘ [Select Province: Ontario ▼] ``` **Step 3: Configure Filters (Optional)** ```plaintext Filters (Optional): City: [TORONTO ] Filter by exact municipality name (uppercase) Postal Code Prefix: [M5 ] Filter by postal code prefix (2-3 chars) Cut Boundary: [Downtown Core ▼ ] Only import locations within cut polygon ☑ Residential Only Only import buildings with BU_USE = 1 ``` **Step 4: Review Import Summary** ```plaintext Import Summary: Province: Ontario (35) Files: Address_35_part_1.csv Address_35_part_2.csv Address_35_part_3.csv Location_35.csv Filters: City: TORONTO Postal Code: M5 Cut: Downtown Core Residential Only: Yes Estimated Records: ~50,000 (after filters) Estimated Time: ~3 minutes [Cancel] [Start Import] ``` **Step 5: Monitor Progress** ```plaintext Import in Progress... 
Current File: Address_35_part_2.csv Progress: 600,000 / 1,200,000 (50%) [████████████░░░░░░░░░░░░] 50% Statistics: Processed: 600,000 Imported: 580,000 Skipped: 15,000 Errors: 5,000 [Cancel Import] ``` **Step 6: Review Results** ```plaintext Import Complete! Final Statistics: Total Processed: 1,200,000 Successfully Imported: 1,150,000 Skipped (Filters): 45,000 Errors: 5,000 Details: Locations Created: 800,000 Locations Updated: 350,000 Addresses Created: 1,150,000 Processing Time: 14m 32s Avg Records/Second: 1,375 [View Import Log] [Import Another Province] [Close] ``` ### Import via API **Step 1: Get Available Datasets** ```bash curl -X GET http://localhost:4000/api/locations/nar/datasets \ -H "Authorization: Bearer $TOKEN" ``` **Step 2: Start Import** ```bash curl -X POST http://localhost:4000/api/locations/nar/import \ -H "Authorization: Bearer $TOKEN" \ -H "Content-Type: application/json" \ -d '{ "provinceCode": "35", "city": "TORONTO", "postalCodePrefix": "M5", "residentialOnly": true }' ``` **Step 3: Poll Job Status** ```bash JOB_ID="nar-import-35-20250213-103000" while true; do STATUS=$(curl -s -X GET \ http://localhost:4000/api/locations/nar/import/$JOB_ID \ -H "Authorization: Bearer $TOKEN" \ | jq -r '.status') if [ "$STATUS" = "completed" ] || [ "$STATUS" = "failed" ]; then break fi sleep 5 done # Get final result curl -X GET http://localhost:4000/api/locations/nar/import/$JOB_ID \ -H "Authorization: Bearer $TOKEN" | jq ``` ## Coordinate Conversion ### Proj4 Integration **Installation:** ```bash npm install proj4 # TypeScript types included in package ``` **Service Implementation:** ```typescript // nar-import.service.ts import proj4 from 'proj4'; // Define EPSG:3347 (Statistics Canada Lambert) proj4.defs('EPSG:3347', '+proj=lcc +lat_1=49 +lat_2=77 +lat_0=63.390675 ' + '+lon_0=-91.86666666666666 +x_0=6200000 +y_0=3000000 ' + '+ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs' ); interface Coordinates { latitude: number; longitude: number; } class 
NARImportService {
  /**
   * Convert NAR BG_X/BG_Y (EPSG:3347) to WGS84 lat/lng
   */
  convertCoordinates(bgX: number, bgY: number): Coordinates | null {
    try {
      // Validate inputs
      if (!bgX || !bgY || bgX < 0 || bgY < 0) {
        logger.warn('Invalid BG_X/BG_Y coordinates:', { bgX, bgY });
        return null;
      }

      // Convert: EPSG:3347 → WGS84
      const [longitude, latitude] = proj4('EPSG:3347', 'WGS84', [bgX, bgY]);

      // Validate output (Canada bounds)
      if (
        latitude < 41.0 || latitude > 84.0 ||    // Canada latitude range
        longitude < -141.0 || longitude > -52.0  // Canada longitude range
      ) {
        logger.warn('Converted coordinates outside Canada:', { latitude, longitude });
        return null;
      }

      return { latitude, longitude };
    } catch (error) {
      logger.error('Coordinate conversion failed:', error);
      return null;
    }
  }

  /**
   * Get coordinates from NAR record (try BG_X/BG_Y, fall back to lat/lng)
   */
  getCoordinates(narLocation: NARLocationRecord): Coordinates | null {
    // Primary: Convert BG_X/BG_Y
    if (narLocation.BG_X && narLocation.BG_Y) {
      const coords = this.convertCoordinates(narLocation.BG_X, narLocation.BG_Y);
      if (coords) return coords;
    }

    // Fallback: Use BG_LATITUDE/BG_LONGITUDE directly
    if (narLocation.BG_LATITUDE && narLocation.BG_LONGITUDE) {
      return {
        latitude: narLocation.BG_LATITUDE,
        longitude: narLocation.BG_LONGITUDE
      };
    }

    return null;
  }
}
```

### Conversion Examples

**Example 1: Toronto City Hall (round trip)**

```typescript
// Derive EPSG:3347 values from the known WGS84 point, then convert back
const [bgX, bgY] = proj4('WGS84', 'EPSG:3347', [-79.3832, 43.6532]);
const coords = convertCoordinates(bgX, bgY);
// Result: { latitude: ≈43.6532, longitude: ≈-79.3832 }
```

**Example 2: Parliament Hill, Ottawa (round trip)**

```typescript
const [bgX, bgY] = proj4('WGS84', 'EPSG:3347', [-75.7009, 45.4236]);
const coords = convertCoordinates(bgX, bgY);
// Result: { latitude: ≈45.4236, longitude: ≈-75.7009 }
```

**Example 3: Invalid Coordinates**

```typescript
const bgX = -1000; // Negative (invalid)
const bgY = 0;     // Zero (invalid)
const coords = convertCoordinates(bgX, bgY);
// Result: null
```

### Validation

**Canada Bounds Check:**

```typescript
const
isWithinCanada = (lat: number, lng: number): boolean => {
  return (
    lat >= 41.0 && lat <= 84.0 &&   // Latitude: Pelee Island to Alert
    lng >= -141.0 && lng <= -52.0   // Longitude: Yukon to Newfoundland
  );
};
```

**Precision Check:**

```typescript
// NAR coordinates should have 2-6 decimal places
const hasValidPrecision = (value: number): boolean => {
  const str = value.toString();
  const decimals = str.split('.')[1]?.length || 0;
  return decimals >= 2 && decimals <= 6;
};
```

## Multi-Part File Handling

### Large Province Processing

**Quebec (Province Code 24):**
- 6 Address files: Address_24_part_1.csv through Address_24_part_6.csv
- 1 Location file: Location_24.csv
- Total records: ~850,000

**Ontario (Province Code 35):**
- 3 Address files: Address_35_part_1.csv through Address_35_part_3.csv
- 1 Location file: Location_35.csv
- Total records: ~1,200,000

### Sequential File Reading

```typescript
// nar-import.service.ts
async processAddressFiles(provinceCode: string): Promise<Map<string, any[]>> {
  const addressMap = new Map<string, any[]>(); // LOC_GUID → address rows

  // Find all Address files for province
  const files = await fs.readdir(NAR_DATA_DIR);
  const addressFiles = files
    .filter(f => f.match(new RegExp(`^Address_${provinceCode}(?:_part_\\d+)?\\.csv$`)))
    .sort(); // Ensure part_1, part_2, ...
order logger.info(`Processing ${addressFiles.length} address files for province ${provinceCode}`); // Process each file sequentially for (const file of addressFiles) { logger.info(`Reading ${file}...`); const filePath = path.join(NAR_DATA_DIR, file); const stream = fs.createReadStream(filePath); const parser = stream.pipe(csvParser()); let rowCount = 0; for await (const row of parser) { const locGuid = row.LOC_GUID; if (!addressMap.has(locGuid)) { addressMap.set(locGuid, []); } addressMap.get(locGuid)!.push({ addrGuid: row.ADDR_GUID, civicNo: row.CIVIC_NO, streetName: row.OFFICIAL_STREET_NAME, postalCode: row.POSTAL_CODE, municipality: row.MUNICIPALITY }); rowCount++; if (rowCount % 10000 === 0) { logger.debug(`Processed ${rowCount} addresses from ${file}`); } } logger.info(`Completed ${file}: ${rowCount} addresses`); } logger.info(`Total unique locations: ${addressMap.size}`); return addressMap; } ``` ### Memory Management **Streaming Strategy:** ```typescript // Process files in chunks to avoid memory overflow async processInChunks( addressMap: Map, locationFile: string, batchSize: number = 500 ): Promise { const locationPath = path.join(NAR_DATA_DIR, locationFile); const stream = fs.createReadStream(locationPath); const parser = stream.pipe(csvParser()); let batch: LocationImport[] = []; let stats = { imported: 0, skipped: 0, errors: 0 }; for await (const row of parser) { const locGuid = row.LOC_GUID; const addresses = addressMap.get(locGuid); if (!addresses || addresses.length === 0) { stats.skipped++; continue; } // Apply filters if (!this.passesFilters(row, addresses)) { stats.skipped++; continue; } // Convert coordinates const coords = this.getCoordinates(row); if (!coords) { stats.errors++; continue; } batch.push({ location: row, addresses, coords }); // Import batch when full if (batch.length >= batchSize) { await this.importBatch(batch); stats.imported += batch.length; batch = []; } } // Import remaining if (batch.length > 0) { await 
this.importBatch(batch); stats.imported += batch.length; } return stats; } ``` **Batch Transaction:** ```typescript async importBatch(batch: LocationImport[]): Promise { await prisma.$transaction(async (tx) => { for (const item of batch) { // Upsert location const location = await tx.location.upsert({ where: { locGuid: item.location.LOC_GUID }, update: { address: this.formatAddress(item.addresses[0]), latitude: item.coords.latitude, longitude: item.coords.longitude, postalCode: item.addresses[0].postalCode, federalDistrict: item.location.FED_NUM, buildingUse: parseInt(item.location.BU_USE), municipality: item.location.MUNICIPALITY, geocodedAt: new Date() }, create: { locGuid: item.location.LOC_GUID, address: this.formatAddress(item.addresses[0]), latitude: item.coords.latitude, longitude: item.coords.longitude, postalCode: item.addresses[0].postalCode, federalDistrict: item.location.FED_NUM, buildingUse: parseInt(item.location.BU_USE), municipality: item.location.MUNICIPALITY, geocodeConfidence: 100, geocodeProvider: 'NAR', geocodedAt: new Date() } }); // Insert addresses for (const addr of item.addresses) { await tx.address.upsert({ where: { addrGuid: addr.addrGuid }, update: { locationId: location.id }, create: { addrGuid: addr.addrGuid, locationId: location.id, unitNumber: addr.civicNo } }); } } }); } ``` ## Code Examples ### LocationsPage - NAR Import Tab ```typescript // LocationsPage.tsx import React, { useEffect, useState } from 'react'; import { Tabs, Table, Button, Select, Input, Checkbox, Card, Progress, message } from 'antd'; import { UploadOutlined } from '@ant-design/icons'; import { api } from '@/lib/api'; const NARImportTab: React.FC = () => { const [datasets, setDatasets] = useState([]); const [selectedProvince, setSelectedProvince] = useState(null); const [filters, setFilters] = useState({ city: '', postalCodePrefix: '', cutId: null as number | null, residentialOnly: true }); const [importing, setImporting] = useState(false); const [progress, 
setProgress] = useState<any>(null);
  const [jobId, setJobId] = useState<string | null>(null);

  useEffect(() => { fetchDatasets(); }, []);

  useEffect(() => {
    if (jobId && importing) {
      const interval = setInterval(pollProgress, 2000);
      return () => clearInterval(interval);
    }
  }, [jobId, importing]);

  const fetchDatasets = async () => {
    try {
      const { data } = await api.get<{ datasets: NARDataset[] }>('/locations/nar/datasets');
      setDatasets(data.datasets);
    } catch (error) {
      message.error('Failed to load NAR datasets');
    }
  };

  const pollProgress = async () => {
    if (!jobId) return;
    try {
      const { data } = await api.get(`/locations/nar/import/${jobId}`);
      if (data.status === 'completed') {
        setImporting(false);
        setProgress(null);
        message.success(`Import complete! Imported ${data.result.imported} locations.`);
      } else if (data.status === 'failed') {
        setImporting(false);
        setProgress(null);
        message.error('Import failed. Check logs for details.');
      } else {
        setProgress(data.progress);
      }
    } catch (error) {
      message.error('Failed to fetch import progress');
    }
  };

  const startImport = async () => {
    if (!selectedProvince) {
      message.warning('Please select a province');
      return;
    }
    try {
      const { data } = await api.post('/locations/nar/import', {
        provinceCode: selectedProvince,
        ...filters
      });
      setJobId(data.jobId);
      setImporting(true);
      message.info('Import started...');
    } catch (error) {
      message.error('Failed to start import');
    }
  };

  const datasetColumns = [
    { title: 'Province', dataIndex: 'provinceName', key: 'name' },
    { title: 'Files', dataIndex: 'addressFileCount', key: 'files' },
    {
      title: 'Estimated Records', dataIndex: 'estimatedRecords', key: 'records',
      render: (val: number) => val.toLocaleString()
    },
    {
      title: 'Last Modified', dataIndex: 'lastModified', key: 'modified',
      render: (val: string) => new Date(val).toLocaleDateString()
    }
  ];

  return (
    <div>
      <Table
        dataSource={datasets}
        columns={datasetColumns}
        rowKey="provinceCode"
        onRow={(record) => ({
          onClick: () => setSelectedProvince(record.provinceCode),
          style: {
            cursor: 'pointer',
            backgroundColor: selectedProvince === record.provinceCode ? '#e6f7ff' : undefined
          }
        })}
      />

      {selectedProvince && (
        <Card title={datasets.find(d => d.provinceCode === selectedProvince)?.provinceName}>
          <Input
            placeholder="City (exact match, uppercase)"
            value={filters.city}
            onChange={e => setFilters({ ...filters, city: e.target.value.toUpperCase() })}
          />
          <Input
            placeholder="Postal code prefix (e.g. M5)"
            value={filters.postalCodePrefix}
            onChange={e => setFilters({ ...filters, postalCodePrefix: e.target.value.toUpperCase() })}
          />
          <Checkbox
            checked={filters.residentialOnly}
            onChange={e => setFilters({ ...filters, residentialOnly: e.target.checked })}
          >
            Residential Only
          </Checkbox>
          {/* Cut boundary <Select> wired to filters.cutId not shown in this excerpt */}
          <Button type="primary" icon={<UploadOutlined />} onClick={startImport} loading={importing}>
            Start Import
          </Button>
        </Card>
      )}

      {importing && progress && (
        <Card title="Import Progress">
          <Progress percent={progress.percent} />
          <div>Processed: {progress.processed.toLocaleString()} / {progress.total.toLocaleString()}</div>
          <div>Imported: {progress.imported.toLocaleString()}</div>
          <div>Skipped: {progress.skipped.toLocaleString()}</div>
          <div>Errors: {progress.errors.toLocaleString()}</div>
        </Card>
      )}
    </div>
  );
};
```

### NAR Import Service - Full Implementation

```typescript
// nar-import.service.ts
import fs from 'fs/promises';
import path from 'path';
import csvParser from 'csv-parser';
import proj4 from 'proj4';
import { prisma } from '@/config/database';
import { logger } from '@/utils/logger';

// Define EPSG:3347
proj4.defs('EPSG:3347',
  '+proj=lcc +lat_1=49 +lat_2=77 +lat_0=63.390675 ' +
  '+lon_0=-91.86666666666666 +x_0=6200000 +y_0=3000000 ' +
  '+ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs'
);

const NAR_DATA_DIR = process.env.NAR_DATA_DIR || '/data';
const BATCH_SIZE = parseInt(process.env.NAR_BATCH_SIZE || '500', 10);

interface NARAddressRecord {
  ADDR_GUID: string;
  LOC_GUID: string;
  CIVIC_NO: string;
  OFFICIAL_STREET_NAME: string;
  POSTAL_CODE: string;
  MUNICIPALITY: string;
}

interface NARLocationRecord {
  LOC_GUID: string;
  BG_LATITUDE?: number;
  BG_LONGITUDE?: number;
  BG_X?: number;
  BG_Y?: number;
  FED_NUM: string;
  BU_USE: string;
  MUNICIPALITY: string;
}

export class NARImportService {
  async importProvince(
    provinceCode: string,
    filters: {
      city?: string;
      postalCodePrefix?: string;
      cutId?: number;
      residentialOnly?: boolean;
    }
  ): Promise<{ total: number; imported: number; skipped: number; errors: number }> {
    logger.info(`Starting NAR import for province ${provinceCode}`, { filters });

    // Load address files into memory map
    const addressMap = await this.loadAddressFiles(provinceCode, filters);

    // Process location file and import
    const result = await this.processLocationFile(provinceCode, addressMap, filters);

    logger.info(`NAR import complete for province ${provinceCode}`, result);
    return result;
  }

  private async loadAddressFiles(
    provinceCode: string,
    filters: { city?: string; postalCodePrefix?: string }
  ): Promise<Map<string, NARAddressRecord[]>> {
    const addressMap = new Map<string, NARAddressRecord[]>();
    const files = await fs.readdir(NAR_DATA_DIR);
    const addressFiles = files
      .filter(f => f.match(new RegExp(`^Address_${provinceCode}(?:_part_\\d+)?\\.csv$`)))
      .sort();

    for (const file of addressFiles) {
      logger.info(`Reading ${file}...`);
      const filePath = path.join(NAR_DATA_DIR, file);
      const
stream = require('fs').createReadStream(filePath);
      const parser = stream.pipe(csvParser());

      for await (const row of parser) {
        // Apply filters
        if (filters.city && row.MUNICIPALITY !== filters.city) continue;
        if (filters.postalCodePrefix && !row.POSTAL_CODE.startsWith(filters.postalCodePrefix)) continue;

        const locGuid = row.LOC_GUID;
        if (!addressMap.has(locGuid)) {
          addressMap.set(locGuid, []);
        }
        addressMap.get(locGuid)!.push(row);
      }
    }

    logger.info(`Loaded ${addressMap.size} unique locations`);
    return addressMap;
  }

  private async processLocationFile(
    provinceCode: string,
    addressMap: Map<string, NARAddressRecord[]>,
    filters: { cutId?: number; residentialOnly?: boolean }
  ): Promise<{ total: number; imported: number; skipped: number; errors: number }> {
    const locationFile = `Location_${provinceCode}.csv`;
    const filePath = path.join(NAR_DATA_DIR, locationFile);
    const stream = require('fs').createReadStream(filePath);
    const parser = stream.pipe(csvParser());

    let batch: any[] = [];
    const stats = { imported: 0, skipped: 0, errors: 0, total: 0 };

    // Load the cut polygon once, not once per row
    const cut = filters.cutId
      ? await prisma.cut.findUnique({ where: { id: filters.cutId } })
      : null;

    for await (const row of parser) {
      stats.total++;

      const locGuid = row.LOC_GUID;
      const addresses = addressMap.get(locGuid);
      if (!addresses || addresses.length === 0) {
        stats.skipped++;
        continue;
      }

      // Residential filter
      if (filters.residentialOnly && parseInt(row.BU_USE, 10) !== 1) {
        stats.skipped++;
        continue;
      }

      // Convert coordinates
      const coords = this.getCoordinates(row);
      if (!coords) {
        stats.errors++;
        continue;
      }

      // Cut filter (if specified)
      if (cut && !this.isPointInPolygon([coords.longitude, coords.latitude], cut.geojson)) {
        stats.skipped++;
        continue;
      }

      batch.push({ location: row, addresses, coords });

      if (batch.length >= BATCH_SIZE) {
        await this.importBatch(batch);
        stats.imported += batch.length;
        batch = [];
      }
    }

    if (batch.length > 0) {
      await this.importBatch(batch);
      stats.imported += batch.length;
    }

    return stats;
  }

  private getCoordinates(row: NARLocationRecord): { latitude: number; longitude: number } | null {
    // Try BG_X/BG_Y
conversion if (row.BG_X && row.BG_Y) { try { const [lng, lat] = proj4('EPSG:3347', 'WGS84', [row.BG_X, row.BG_Y]); if (lat >= 41 && lat <= 84 && lng >= -141 && lng <= -52) { return { latitude: lat, longitude: lng }; } } catch (error) { logger.warn('Coordinate conversion failed:', error); } } // Fallback to BG_LATITUDE/BG_LONGITUDE if (row.BG_LATITUDE && row.BG_LONGITUDE) { return { latitude: row.BG_LATITUDE, longitude: row.BG_LONGITUDE }; } return null; } private async importBatch(batch: any[]): Promise { await prisma.$transaction(async (tx) => { for (const item of batch) { const location = await tx.location.upsert({ where: { locGuid: item.location.LOC_GUID }, update: { address: this.formatAddress(item.addresses[0]), latitude: item.coords.latitude, longitude: item.coords.longitude, postalCode: item.addresses[0].POSTAL_CODE, federalDistrict: item.location.FED_NUM, buildingUse: parseInt(item.location.BU_USE), municipality: item.location.MUNICIPALITY }, create: { locGuid: item.location.LOC_GUID, address: this.formatAddress(item.addresses[0]), latitude: item.coords.latitude, longitude: item.coords.longitude, postalCode: item.addresses[0].POSTAL_CODE, federalDistrict: item.location.FED_NUM, buildingUse: parseInt(item.location.BU_USE), municipality: item.location.MUNICIPALITY, geocodeConfidence: 100, geocodeProvider: 'NAR' } }); for (const addr of item.addresses) { await tx.address.upsert({ where: { addrGuid: addr.ADDR_GUID }, update: {}, create: { addrGuid: addr.ADDR_GUID, locationId: location.id, unitNumber: addr.CIVIC_NO } }); } } }); } private formatAddress(addr: NARAddressRecord): string { return `${addr.CIVIC_NO} ${addr.OFFICIAL_STREET_NAME}`.trim(); } private isPointInPolygon(point: [number, number], geojson: any): boolean { // Point-in-polygon implementation // (Same as in spatial.ts) return true; // Placeholder } } ``` ## Troubleshooting ### Problem: No datasets found **Symptoms:** - GET /api/locations/nar/datasets returns empty array - "No datasets available" 
  message in the admin UI

**Solutions:**

1. **Verify the NAR_DATA_DIR path:**
   ```bash
   echo $NAR_DATA_DIR
   ls -la /data
   ```

2. **Check the Docker volume mount:**
   ```yaml
   # docker-compose.yml
   services:
     api:
       volumes:
         - ./data:/data:ro
   ```

3. **Verify the file naming convention:**
   ```bash
   # Correct:
   Address_35_part_1.csv
   Location_35.csv

   # Incorrect:
   address_35.csv    # Lowercase
   Addresses_35.csv  # Plural
   Address35.csv     # No underscore
   ```

4. **Check file permissions:**
   ```bash
   chmod 644 /data/Address_*.csv
   chmod 644 /data/Location_*.csv
   ```

### Problem: Coordinate conversion errors

**Symptoms:**
- Many locations skipped during import
- "Converted coordinates outside Canada" warnings
- Null latitude/longitude in the database

**Solutions:**

1. **Verify BG_X/BG_Y values:**
   ```typescript
   // Approximate projected bounds for EPSG:3347 over Canada
   // (false easting 6,200,000m / false northing 3,000,000m):
   // BG_X: ~3,700,000 to 9,000,000
   // BG_Y: ~660,000 to 5,250,000
   console.log('BG_X:', narRecord.BG_X); // Should be 7 digits
   console.log('BG_Y:', narRecord.BG_Y); // Should be 6-7 digits
   ```

2. **Test with a known coordinate round-trip:**
   ```typescript
   // Toronto City Hall: project WGS84 -> EPSG:3347 and back
   const [x, y] = proj4('WGS84', 'EPSG:3347', [-79.3832, 43.6532]);
   const [lng, lat] = proj4('EPSG:3347', 'WGS84', [x, y]);
   console.log('Expected: 43.6532, -79.3832');
   console.log('Got:', lat, lng);
   ```

3. **Fall back to BG_LATITUDE/BG_LONGITUDE:**
   ```typescript
   // If BG_X/BG_Y are missing or invalid, use lat/lng directly
   if (!coords && narRecord.BG_LATITUDE && narRecord.BG_LONGITUDE) {
     coords = {
       latitude: narRecord.BG_LATITUDE,
       longitude: narRecord.BG_LONGITUDE
     };
   }
   ```

4. **Check the proj4 installation:**
   ```bash
   npm list proj4  # Ensure version 2.8.0+
   ```

### Problem: Import very slow (> 30 min for 100k records)

**Symptoms:**
- Import hangs on large provinces
- Memory usage grows over time
- Database connection timeouts

**Solutions:**

1. **Increase the batch size:**
   ```env
   NAR_BATCH_SIZE=1000  # Default: 500
   ```

2.
   **Use streaming instead of loading all addresses:**
   ```typescript
   // DON'T do this (loads everything into memory):
   const allAddresses = await readAllAddressFiles();

   // DO this (stream and process incrementally):
   for await (const addressBatch of streamAddressFiles()) {
     processBatch(addressBatch);
   }
   ```

3. **Optimize database indexes:**
   ```sql
   -- Mixed-case Prisma column names must be quoted in Postgres
   CREATE INDEX CONCURRENTLY idx_locations_loc_guid ON "Location"("locGuid");
   CREATE INDEX CONCURRENTLY idx_addresses_addr_guid ON "Address"("addrGuid");
   ```

4. **Disable geocoding during import:**
   ```typescript
   // Skip the geocoding service since NAR already has coordinates
   geocodeConfidence: 100,
   geocodeProvider: 'NAR'
   // No call to geocodingService.geocode()
   ```

5. **Use worker threads for parallel processing:**
   ```typescript
   import { Worker } from 'worker_threads';

   const workers = [];
   for (let i = 0; i < 4; i++) {
     const worker = new Worker('./nar-import-worker.js');
     workers.push(worker);
   }
   ```

### Problem: Duplicate LOC_GUID errors

**Symptoms:**
- Unique constraint violation on locGuid
- Import fails mid-process
- "Duplicate key value violates unique constraint" error

**Solutions:**

1. **Use UPSERT instead of INSERT:**
   ```typescript
   await prisma.location.upsert({
     where: { locGuid: narRecord.LOC_GUID },
     update: { /* update fields */ },
     create: { /* create fields */ }
   });
   ```

2. **Check for corrupt NAR files:**
   ```bash
   # Count unique LOC_GUIDs (column 2)
   cut -d, -f2 Address_35_part_1.csv | sort | uniq | wc -l

   # Check for duplicates
   cut -d, -f2 Address_35_part_1.csv | sort | uniq -d
   ```

3. **Clean up partial imports:**
   ```sql
   -- Delete locations from a failed import
   DELETE FROM "Location"
   WHERE "geocodeProvider" = 'NAR'
     AND "createdAt" > '2025-02-13';
   ```

4.
   **Implement transaction rollback on error:**
   ```typescript
   try {
     await prisma.$transaction(async (tx) => {
       // Import batch
     });
   } catch (error) {
     logger.error('Batch failed, rolling back:', error);
     // The transaction is rolled back automatically
   }
   ```

## Performance Considerations

### Import Speed

**Benchmarks:**

| Province | Records | Files | Time | Records/Second |
|----------|---------|-------|------|----------------|
| PEI (11) | 15,000 | 1 | 12s | 1,250 |
| Nova Scotia (12) | 85,000 | 1 | 1m 10s | 1,214 |
| Quebec (24) | 850,000 | 6 | 11m 20s | 1,250 |
| Ontario (35) | 1,200,000 | 3 | 14m 30s | 1,379 |

**Factors:**
- Batch size: 500 (optimal for most systems)
- Coordinate conversion: ~0.1ms per record
- Database write: ~0.5ms per location (depends on disk speed)
- Total overhead: ~0.7ms per record

### Memory Usage

**Peak Memory:**
- Address map (in-memory): ~200MB per 100k records
- CSV parser buffer: ~10MB
- Batch buffer: ~5MB (500 records)
- Total: ~220MB per 100k records

**Optimization:**
- Stream address files instead of loading them all
- Process the location file in chunks
- Clear the batch after each commit
- Limit concurrent transactions

### Database Load

**Transaction Rate:**
- 1 transaction per batch (500 records)
- ~2-3 transactions/second
- Low database CPU (~10-20%)
- Moderate disk I/O (sequential writes)

**Connection Pool:**
```prisma
// prisma/schema.prisma
datasource db {
  provider = "postgresql"
  url      = env("DATABASE_URL")
}
```
Note: Prisma sets the pool size through the connection string (e.g. appending `?connection_limit=10` to DATABASE_URL), not through a datasource block field.

## Related Documentation

### Backend Documentation

- **NAR Import Service:** `api/src/modules/map/locations/nar-import.service.ts`
  - File scanning
  - Streaming CSV parser
  - Coordinate conversion
  - Batch import
- **NAR Import Routes:** `api/src/modules/map/locations/nar-import.routes.ts`
  - Dataset discovery
  - Import job creation
  - Progress tracking
- **Locations Service:** `api/src/modules/map/locations/locations.service.ts`
  - Location CRUD
  - Geocoding integration

### Frontend Documentation

- **Locations Page:**
  `admin/src/pages/LocationsPage.tsx`
  - NAR Import tab
  - Dataset selection
  - Filter configuration
  - Progress monitoring

### Database Documentation

- **Location Model:** `api/prisma/schema.prisma`
  - NAR-specific fields
  - locGuid unique constraint
  - Federal district index
- **Address Model:** `api/prisma/schema.prisma`
  - addrGuid unique constraint
  - Location foreign key

### External Resources

- **Elections Canada NAR:** https://www.elections.ca/content.aspx?section=res&dir=cir/tech/nar&document=index&lang=e
- **EPSG:3347 Definition:** https://epsg.io/3347
- **Proj4 Documentation:** https://github.com/proj4js/proj4js
- **NAR Data Dictionary:** Elections Canada NAR Technical Documentation (PDF)
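## Appendix: Point-in-Polygon Sketch

The `isPointInPolygon` helper in the import service above is left as a placeholder (the real implementation lives in `spatial.ts`). For reference, a minimal ray-casting sketch is shown below. It assumes `cut.geojson` carries the coordinate layout of a GeoJSON `Polygon` (an array of linear rings in `[lng, lat]` order), tests only the outer ring, and ignores holes; the `Position` type and the Toronto example values are illustrative, not part of the actual codebase.

```typescript
type Position = [number, number]; // [lng, lat], GeoJSON order

// Ray-casting point-in-polygon test against the outer ring of a
// GeoJSON Polygon. Holes (inner rings) are ignored in this sketch.
function isPointInPolygon(
  point: Position,
  polygon: { coordinates: Position[][] }
): boolean {
  const [x, y] = point;
  const ring = polygon.coordinates[0];
  let inside = false;
  // Walk each edge (j -> i); toggle `inside` every time a horizontal
  // ray cast from the point crosses the edge.
  for (let i = 0, j = ring.length - 1; i < ring.length; j = i++) {
    const [xi, yi] = ring[i];
    const [xj, yj] = ring[j];
    const crosses =
      (yi > y) !== (yj > y) &&
      x < ((xj - xi) * (y - yi)) / (yj - yi) + xi;
    if (crosses) inside = !inside;
  }
  return inside;
}

// Example: a square roughly around downtown Toronto
const ring: Position[] = [
  [-79.5, 43.6], [-79.3, 43.6], [-79.3, 43.7], [-79.5, 43.7], [-79.5, 43.6],
];
const square = { coordinates: [ring] };

console.log(isPointInPolygon([-79.38, 43.65], square)); // true
console.log(isPointInPolygon([-79.0, 43.65], square));  // false
```

An odd number of edge crossings means the point is inside; this matches the `[coords.longitude, coords.latitude]` argument order used by the cut filter in `processLocationFile`.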