NAR Import System
Overview
The National Address Register (NAR) import system enables bulk import of Canadian electoral data from Elections Canada. The system supports the 2025 NAR format with server-side streaming import, coordinate projection conversion, and comprehensive filtering options.
Key Features:
- Server-side streaming import (handles large datasets)
- NAR 2025 format support (BG_X/BG_Y Lambert projection)
- Address + Location file joining on LOC_GUID
- Proj4 coordinate conversion (EPSG:3347 → WGS84)
- Province selector (13 provinces/territories)
- Filtering: city, postal code, cut boundary, residential-only
- Multi-part file handling (large provinces)
- Progress tracking and error reporting
- Import statistics and validation
Use Cases:
- Initial campaign database setup
- Electoral district targeting
- NAR data updates (new redistribution)
- Multi-region campaign expansion
- Address database verification
Architecture Highlights:
- Streaming CSV parser (avoids memory limits)
- File-based LOC_GUID join
- Real-time coordinate projection
- Point-in-polygon cut filtering
- Transaction batching (500 records/commit)
- Duplicate prevention via UPSERT
Architecture
flowchart TB
subgraph Admin Interface
Admin[Admin User]
LocationsPage[LocationsPage - NAR Tab]
end
subgraph API Layer
DatasetsAPI["/api/locations/nar/datasets"]
ImportAPI["/api/locations/nar/import"]
end
subgraph NAR Import Service
Scanner[File Scanner]
Reader[CSV Stream Reader]
Joiner[Address+Location Joiner]
Converter[Coordinate Converter]
Filter[Filter Pipeline]
Importer[Bulk Importer]
end
subgraph File System
DataDir[/data/NAR Files]
AddressFiles[Address_XX_part_*.csv]
LocationFiles[Location_XX.csv]
end
subgraph Database
LocationsDB[(Locations)]
AddressesDB[(Addresses)]
end
subgraph External Services
Proj4[Proj4 Library]
EPSG3347[EPSG:3347 Definition]
end
Admin --> LocationsPage
LocationsPage --> DatasetsAPI
LocationsPage --> ImportAPI
DatasetsAPI --> Scanner
Scanner --> DataDir
ImportAPI --> Reader
Reader --> AddressFiles
Reader --> LocationFiles
Reader --> Joiner
Joiner --> Converter
Converter --> Proj4
Proj4 --> EPSG3347
Converter --> Filter
Filter --> Importer
Importer --> LocationsDB
Importer --> AddressesDB
Data Flow:
1. Dataset Discovery:
   - Scan /data directory for NAR CSV files
   - Group by province code (10-62)
   - Identify multi-part Address files
   - Return available datasets
2. Import Initiation:
   - Admin selects province + filters
   - API creates import job
   - Begins streaming CSV files
3. File Processing:
   - Read Address files (all parts sequentially)
   - Read Location file (streamed after the Address files are loaded)
   - Join on LOC_GUID (in-memory map)
4. Coordinate Conversion:
   - Extract BG_X/BG_Y from Location file
   - Convert EPSG:3347 → WGS84 using Proj4
   - Fall back to BG_LATITUDE/BG_LONGITUDE if conversion fails
5. Filtering:
   - City filter (exact match on MUNICIPALITY)
   - Postal code filter (prefix match)
   - Cut filter (point-in-polygon)
   - Residential filter (BU_USE = 1)
6. Database Import:
   - UPSERT Locations by locGuid (prevents duplicates)
   - INSERT Addresses with foreign key
   - Batch commits (500 records)
   - Track progress and errors
NAR File Format
File Structure
Directory Layout:
/data/
├── Address_10.csv # Newfoundland
├── Address_11.csv # PEI
├── Address_12.csv # Nova Scotia
├── Address_13.csv # New Brunswick
├── Address_24_part_1.csv # Quebec (multi-part)
├── Address_24_part_2.csv
├── Address_24_part_3.csv
├── Address_24_part_4.csv
├── Address_24_part_5.csv
├── Address_24_part_6.csv
├── Address_35_part_1.csv # Ontario (multi-part)
├── Address_35_part_2.csv
├── ...
├── Location_10.csv
├── Location_11.csv
├── Location_12.csv
├── Location_13.csv
├── Location_24.csv
├── Location_35.csv
└── ...
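The directory layout above follows two strict file-name patterns, so they can be parsed mechanically. A minimal sketch (the regexes mirror the ones the dataset scanner uses; the `NARFileInfo` type is illustrative, not part of the actual service):

```typescript
// Parse a NAR file name into its components. Address files may carry an
// optional _part_N suffix; Location files never do.
type NARFileInfo =
  | { kind: 'address'; provinceCode: string; part: number | null }
  | { kind: 'location'; provinceCode: string };

function parseNARFileName(file: string): NARFileInfo | null {
  const address = file.match(/^Address_(\d+)(?:_part_(\d+))?\.csv$/);
  if (address) {
    return {
      kind: 'address',
      provinceCode: address[1],
      part: address[2] ? parseInt(address[2], 10) : null
    };
  }
  const location = file.match(/^Location_(\d+)\.csv$/);
  if (location) {
    return { kind: 'location', provinceCode: location[1] };
  }
  return null; // not a NAR file
}
```

Sorting multi-part files by name yields the correct part_1, part_2, ... order because part numbers stay single-digit in practice.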
Address File Schema
File: Address_XX_part_Y.csv
ADDR_GUID,LOC_GUID,CIVIC_NO,OFFICIAL_STREET_NAME,POSTAL_CODE,MUNICIPALITY,PROVINCE_CODE
{uuid},{uuid},123,MAIN ST,M5H2N2,TORONTO,35
{uuid},{uuid},125,MAIN ST,M5H2N2,TORONTO,35
{uuid},{uuid},127,MAIN ST,M5H2N2,TORONTO,35
Key Fields:
| Field | Type | Description | Example |
|---|---|---|---|
| ADDR_GUID | UUID | Unique address identifier | {12345678-...} |
| LOC_GUID | UUID | Location identifier (FK) | {87654321-...} |
| CIVIC_NO | String | Street number | 123, 123A, 123-125 |
| OFFICIAL_STREET_NAME | String | Street name (uppercase) | MAIN ST, YONGE ST |
| POSTAL_CODE | String | Canadian postal code (no space) | M5H2N2, K1A0B1 |
| MUNICIPALITY | String | City/town name | TORONTO, OTTAWA |
| PROVINCE_CODE | Integer | Province code (10-62) | 35 (Ontario) |
Record Count:
- Small provinces: 10k-50k addresses
- Medium provinces: 50k-200k addresses
- Large provinces: 200k-1M+ addresses (multi-part files)
Location File Schema
File: Location_XX.csv
LOC_GUID,BG_LATITUDE,BG_LONGITUDE,BG_X,BG_Y,FED_NUM,BU_USE,MUNICIPALITY
{uuid},43.6532,-79.3832,1234567.89,234567.89,35001,1,TORONTO
{uuid},43.6540,-79.3825,1234600.00,234600.00,35001,1,TORONTO
Key Fields:
| Field | Type | Description | Example |
|---|---|---|---|
| LOC_GUID | UUID | Unique location identifier | {87654321-...} |
| BG_LATITUDE | Float | Latitude (WGS84) | 43.6532 |
| BG_LONGITUDE | Float | Longitude (WGS84) | -79.3832 |
| BG_X | Float | X coord (EPSG:3347 Lambert) | 1234567.89 |
| BG_Y | Float | Y coord (EPSG:3347 Lambert) | 234567.89 |
| FED_NUM | String | Federal electoral district | 35001, 24050 |
| BU_USE | Integer | Building use code | 1 = Residential |
| MUNICIPALITY | String | City/town name | TORONTO |
Coordinate Systems:
- BG_LATITUDE/BG_LONGITUDE: WGS84 decimal degrees (EPSG:4326)
- BG_X/BG_Y: Statistics Canada Lambert Conformal Conic (EPSG:3347)
- 2025 NAR Change: Primary coordinates shifted from lat/lng to BG_X/BG_Y
Building Use Codes:
| Code | Description |
|---|---|
| 1 | Residential |
| 2 | Commercial |
| 3 | Industrial |
| 4 | Institutional |
| 5 | Parks/Recreation |
| 9 | Other |
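The code table above translates directly into a lookup map. A small sketch (helper names are illustrative):

```typescript
// Map NAR BU_USE codes to labels, per the table above. Unknown codes fall
// back to 'Other' so the importer never crashes on unexpected data.
const BUILDING_USE_LABELS: Record<number, string> = {
  1: 'Residential',
  2: 'Commercial',
  3: 'Industrial',
  4: 'Institutional',
  5: 'Parks/Recreation',
  9: 'Other'
};

function buildingUseLabel(code: number): string {
  return BUILDING_USE_LABELS[code] ?? 'Other';
}

// The residential-only import filter reduces to this single comparison.
function isResidential(code: number): boolean {
  return code === 1;
}
```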
Database Models
Location Model Extensions
model Location {
id Int @id @default(autoincrement())
address String
latitude Float?
longitude Float?
postalCode String?
province String?
// NAR-specific fields
locGuid String? @unique // NAR LOC_GUID (UUID)
federalDistrict String? // NAR FED_NUM
buildingUse Int? // NAR BU_USE code
municipality String? // NAR MUNICIPALITY
// Geocoding metadata (populated during import)
geocodeConfidence Int? @default(100) // NAR = high confidence
geocodeProvider String? @default("NAR")
geocodedAt DateTime?
addresses Address[]
createdAt DateTime @default(now())
updatedAt DateTime @updatedAt
@@index([locGuid])
@@index([federalDistrict])
@@index([buildingUse])
@@index([postalCode])
}
Address Model Extensions
model Address {
id Int @id @default(autoincrement())
locationId Int
location Location @relation(fields: [locationId], references: [id], onDelete: Cascade)
// NAR-specific fields
addrGuid String? @unique // NAR ADDR_GUID (UUID)
unitNumber String? // NAR CIVIC_NO (if multi-unit)
// Voter data (future)
firstName String?
lastName String?
supportLevel Int?
createdAt DateTime @default(now())
updatedAt DateTime @updatedAt
@@index([locationId])
@@index([addrGuid])
}
UPSERT Strategy:
// Prevent duplicates on re-import
const location = await prisma.location.upsert({
where: { locGuid: narRecord.LOC_GUID },
update: {
address: narRecord.addressString,
latitude: coords.latitude,
longitude: coords.longitude,
postalCode: narRecord.POSTAL_CODE,
province: provinceMap[narRecord.PROVINCE_CODE],
federalDistrict: narRecord.FED_NUM,
buildingUse: narRecord.BU_USE,
municipality: narRecord.MUNICIPALITY,
geocodeProvider: 'NAR',
geocodedAt: new Date()
},
create: {
locGuid: narRecord.LOC_GUID,
address: narRecord.addressString,
latitude: coords.latitude,
longitude: coords.longitude,
postalCode: narRecord.POSTAL_CODE,
province: provinceMap[narRecord.PROVINCE_CODE],
federalDistrict: narRecord.FED_NUM,
buildingUse: narRecord.BU_USE,
municipality: narRecord.MUNICIPALITY,
geocodeConfidence: 100,
geocodeProvider: 'NAR',
geocodedAt: new Date()
}
});
API Endpoints
GET /api/locations/nar/datasets
Scan NAR data directory and return available province datasets.
Authentication: Required (SUPER_ADMIN, MAP_ADMIN)
Response:
{
"datasets": [
{
"provinceCode": "10",
"provinceName": "Newfoundland and Labrador",
"addressFiles": ["Address_10.csv"],
"locationFile": "Location_10.csv",
"addressFileCount": 1,
"estimatedRecords": 15000,
"lastModified": "2025-01-15T00:00:00Z"
},
{
"provinceCode": "24",
"provinceName": "Quebec",
"addressFiles": [
"Address_24_part_1.csv",
"Address_24_part_2.csv",
"Address_24_part_3.csv",
"Address_24_part_4.csv",
"Address_24_part_5.csv",
"Address_24_part_6.csv"
],
"locationFile": "Location_24.csv",
"addressFileCount": 6,
"estimatedRecords": 850000,
"lastModified": "2025-01-20T00:00:00Z"
},
{
"provinceCode": "35",
"provinceName": "Ontario",
"addressFiles": [
"Address_35_part_1.csv",
"Address_35_part_2.csv",
"Address_35_part_3.csv"
],
"locationFile": "Location_35.csv",
"addressFileCount": 3,
"estimatedRecords": 1200000,
"lastModified": "2025-01-22T00:00:00Z"
}
],
"dataDir": "/data",
"totalDatasets": 13
}
Implementation:
// nar-import.service.ts
async scanDatasets(): Promise<NARDataset[]> {
const files = await fs.readdir(NAR_DATA_DIR);
// Group files by province code
const provinceGroups: Record<string, { address: string[], location: string }> = {};
files.forEach(file => {
const addressMatch = file.match(/^Address_(\d+)(?:_part_\d+)?\.csv$/);
const locationMatch = file.match(/^Location_(\d+)\.csv$/);
if (addressMatch) {
const code = addressMatch[1];
if (!provinceGroups[code]) provinceGroups[code] = { address: [], location: '' };
provinceGroups[code].address.push(file);
} else if (locationMatch) {
const code = locationMatch[1];
if (!provinceGroups[code]) provinceGroups[code] = { address: [], location: '' };
provinceGroups[code].location = file;
}
});
// Build dataset objects
const datasets: NARDataset[] = [];
for (const [code, group] of Object.entries(provinceGroups)) {
if (group.address.length === 0 || !group.location) continue;
const stats = await fs.stat(path.join(NAR_DATA_DIR, group.location));
datasets.push({
provinceCode: code,
provinceName: PROVINCE_NAMES[code],
addressFiles: group.address.sort(),
locationFile: group.location,
addressFileCount: group.address.length,
estimatedRecords: await this.estimateRecordCount(group.address),
lastModified: stats.mtime.toISOString()
});
}
return datasets.sort((a, b) => a.provinceCode.localeCompare(b.provinceCode));
}
POST /api/locations/nar/import
Start NAR import job with filters.
Authentication: Required (SUPER_ADMIN, MAP_ADMIN)
Request Body:
{
"provinceCode": "35",
"city": "TORONTO",
"postalCodePrefix": "M5",
"cutId": 42,
"residentialOnly": true
}
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| provinceCode | string | Yes | Province code (10-62) |
| city | string | No | Filter by MUNICIPALITY (exact match, uppercase) |
| postalCodePrefix | string | No | Filter by postal code prefix (e.g., "M5", "K1A") |
| cutId | number | No | Filter by cut boundary (point-in-polygon) |
| residentialOnly | boolean | No | Only import BU_USE = 1 (default: false) |
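The parameter rules in the table can be enforced with a small hand-rolled validator. This is a sketch, not the actual API implementation; the function and error messages are illustrative:

```typescript
// Validate and normalize a POST /api/locations/nar/import request body.
interface NARImportRequest {
  provinceCode: string;
  city?: string;
  postalCodePrefix?: string;
  cutId?: number;
  residentialOnly?: boolean;
}

const VALID_PROVINCE_CODES = new Set([
  '10', '11', '12', '13', '24', '35', '46', '47', '48', '59', '60', '61', '62'
]);

function validateImportRequest(
  body: any
): { ok: true; value: NARImportRequest } | { ok: false; error: string } {
  if (typeof body?.provinceCode !== 'string' || !VALID_PROVINCE_CODES.has(body.provinceCode)) {
    return { ok: false, error: 'provinceCode is required and must be a valid NAR code (10-62)' };
  }
  if (body.city !== undefined && typeof body.city !== 'string') {
    return { ok: false, error: 'city must be a string' };
  }
  // Postal prefixes are 2-3 characters: a letter followed by 1-2 more, e.g. "M5", "K1A"
  if (body.postalCodePrefix !== undefined && !/^[A-Z][0-9A-Z]{1,2}$/.test(body.postalCodePrefix)) {
    return { ok: false, error: 'postalCodePrefix must be 2-3 characters, e.g. "M5" or "K1A"' };
  }
  if (body.cutId !== undefined && !Number.isInteger(body.cutId)) {
    return { ok: false, error: 'cutId must be an integer' };
  }
  return {
    ok: true,
    value: {
      provinceCode: body.provinceCode,
      city: body.city?.toUpperCase(), // MUNICIPALITY is stored uppercase
      postalCodePrefix: body.postalCodePrefix,
      cutId: body.cutId,
      residentialOnly: body.residentialOnly ?? false // documented default
    }
  };
}
```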
Response:
{
"jobId": "nar-import-35-20250213-103000",
"status": "processing",
"provinceCode": "35",
"provinceName": "Ontario",
"filters": {
"city": "TORONTO",
"postalCodePrefix": "M5",
"cutId": 42,
"residentialOnly": true
},
"startedAt": "2025-02-13T10:30:00Z",
"estimatedCompletion": "2025-02-13T10:45:00Z"
}
GET /api/locations/nar/import/:jobId
Check import job progress.
Authentication: Required (SUPER_ADMIN, MAP_ADMIN)
Response (In Progress):
{
"jobId": "nar-import-35-20250213-103000",
"status": "processing",
"progress": {
"total": 1200000,
"processed": 600000,
"imported": 580000,
"skipped": 15000,
"errors": 5000,
"percent": 50.0
},
"currentFile": "Address_35_part_2.csv",
"startedAt": "2025-02-13T10:30:00Z",
"estimatedCompletion": "2025-02-13T10:45:00Z"
}
Response (Complete):
{
"jobId": "nar-import-35-20250213-103000",
"status": "completed",
"result": {
"total": 1200000,
"processed": 1200000,
"imported": 1150000,
"skipped": 45000,
"errors": 5000,
"percent": 100.0
},
"statistics": {
"locationsCreated": 800000,
"locationsUpdated": 350000,
"addressesCreated": 1150000,
"avgConfidence": 100,
"processingTime": "14m 32s"
},
"startedAt": "2025-02-13T10:30:00Z",
"completedAt": "2025-02-13T10:44:32Z"
}
Status Values:
- queued: Job created, waiting to start
- processing: Import in progress
- completed: Import finished successfully
- failed: Import failed with errors
- cancelled: Import cancelled by user
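Typed out, the status values form a small state machine; a polling client only needs to know which states are terminal. A sketch (the `isTerminal` helper is illustrative):

```typescript
// Import job statuses, mirroring the status values above.
type ImportStatus = 'queued' | 'processing' | 'completed' | 'failed' | 'cancelled';

// A polling loop stops once the job reaches a terminal state.
function isTerminal(status: ImportStatus): boolean {
  return status === 'completed' || status === 'failed' || status === 'cancelled';
}
```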
Configuration
Environment Variables
| Variable | Type | Default | Description |
|---|---|---|---|
| NAR_DATA_DIR | string | /data | Directory containing NAR CSV files |
| NAR_BATCH_SIZE | number | 500 | Records per database transaction |
| NAR_IMPORT_TIMEOUT | number | 3600000 | Import timeout in ms (1 hour) |
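Reading these variables with their documented defaults can be centralized in one helper, so the radix and fallback logic live in a single place. A sketch (`intFromEnv` and `NAR_CONFIG` are illustrative names, not the actual service code):

```typescript
// Parse an integer environment variable, falling back on missing/bad values.
function intFromEnv(name: string, fallback: number): number {
  const raw = process.env[name];
  const parsed = raw === undefined ? NaN : parseInt(raw, 10);
  return Number.isNaN(parsed) ? fallback : parsed;
}

// The three NAR settings with their documented defaults.
const NAR_CONFIG = {
  dataDir: process.env.NAR_DATA_DIR ?? '/data',
  batchSize: intFromEnv('NAR_BATCH_SIZE', 500),
  importTimeoutMs: intFromEnv('NAR_IMPORT_TIMEOUT', 3_600_000) // 1 hour
};
```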
Province Codes
Complete mapping of NAR province codes:
// nar-import.service.ts
const PROVINCE_NAMES: Record<string, string> = {
'10': 'Newfoundland and Labrador',
'11': 'Prince Edward Island',
'12': 'Nova Scotia',
'13': 'New Brunswick',
'24': 'Quebec',
'35': 'Ontario',
'46': 'Manitoba',
'47': 'Saskatchewan',
'48': 'Alberta',
'59': 'British Columbia',
'60': 'Yukon',
'61': 'Northwest Territories',
'62': 'Nunavut'
};
const PROVINCE_ABBREVIATIONS: Record<string, string> = {
'10': 'NL',
'11': 'PE',
'12': 'NS',
'13': 'NB',
'24': 'QC',
'35': 'ON',
'46': 'MB',
'47': 'SK',
'48': 'AB',
'59': 'BC',
'60': 'YT',
'61': 'NT',
'62': 'NU'
};
Coordinate Projection
EPSG:3347 Definition (Statistics Canada Lambert Conformal Conic):
import proj4 from 'proj4';
// Define EPSG:3347 projection
proj4.defs('EPSG:3347', '+proj=lcc +lat_1=49 +lat_2=77 +lat_0=63.390675 +lon_0=-91.86666666666666 +x_0=6200000 +y_0=3000000 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs');
// Convert function
const convertCoordinates = (bgX: number, bgY: number): [number, number] => {
// Input: [X, Y] in EPSG:3347 (meters)
// Output: [longitude, latitude] in WGS84 (degrees)
return proj4('EPSG:3347', 'WGS84', [bgX, bgY]);
};
Projection Parameters:
- Type: Lambert Conformal Conic
- Standard Parallels: 49°N, 77°N
- Central Meridian: -91.866667°
- Origin: 63.390675°N, -91.866667°W
- False Easting: 6,200,000 m
- False Northing: 3,000,000 m
- Ellipsoid: GRS80
- Units: Meters
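To build intuition for these parameters, here is a spherical Lambert Conformal Conic forward/inverse using the values above. This is a teaching sketch only: EPSG:3347 is defined on the GRS80 ellipsoid, so these results differ from proj4's by up to a few kilometres and must not replace the proj4 conversion in the importer.

```typescript
// Spherical LCC (2 standard parallels) with the EPSG:3347 parameters above.
const R = 6_371_000;                     // mean Earth radius (m), spherical model
const DEG = Math.PI / 180;
const phi1 = 49 * DEG, phi2 = 77 * DEG;  // standard parallels
const phi0 = 63.390675 * DEG;            // latitude of origin
const lam0 = -91.86666666666666 * DEG;   // central meridian
const FE = 6_200_000, FN = 3_000_000;    // false easting / northing (m)

const t = (phi: number) => Math.tan(Math.PI / 4 + phi / 2);
const n = Math.log(Math.cos(phi1) / Math.cos(phi2)) / Math.log(t(phi2) / t(phi1));
const F = (Math.cos(phi1) * Math.pow(t(phi1), n)) / n;
const rho0 = (R * F) / Math.pow(t(phi0), n);

// lat/lng in degrees → [x, y] in metres
function forward(lat: number, lng: number): [number, number] {
  const rho = (R * F) / Math.pow(t(lat * DEG), n);
  const theta = n * (lng * DEG - lam0);
  return [FE + rho * Math.sin(theta), FN + rho0 - rho * Math.cos(theta)];
}

// [x, y] in metres → [lat, lng] in degrees
function inverse(x: number, y: number): [number, number] {
  const dx = x - FE, dy = rho0 - (y - FN);
  const rho = Math.sqrt(dx * dx + dy * dy);
  const theta = Math.atan2(dx, dy);
  const lat = 2 * Math.atan(Math.pow((R * F) / rho, 1 / n)) - Math.PI / 2;
  return [lat / DEG, (lam0 + theta / n) / DEG];
}
```

Note what the false easting implies: every Canadian location east of the central meridian has an X above 6,200,000 m, and even the westernmost points stay in the millions, which is a useful sanity check on BG_X values.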
Example Conversion:
// Toronto City Hall (approximate EPSG:3347 input — with the 6,200,000 m
// false easting, valid BG_X values across Canada are in the millions of metres)
const bgX = 7220000; // EPSG:3347 X (approx.)
const bgY = 935000; // EPSG:3347 Y (approx.)
const [lng, lat] = proj4('EPSG:3347', 'WGS84', [bgX, bgY]);
// Result: approximately lng ≈ -79.38, lat ≈ 43.65
Import Workflow
Prepare NAR Files
Step 1: Download NAR Data
- Visit Elections Canada NAR portal: https://www.elections.ca/NAR
- Select "2025 National Address Register"
- Download province-specific CSV files
- Extract ZIP archives
Step 2: Upload Files to Server
# Create data directory if not exists
mkdir -p /path/to/data
# Upload files via SCP
scp Address_35_*.csv user@server:/path/to/data/
scp Location_35.csv user@server:/path/to/data/
# Or mount volume in Docker
# docker-compose.yml:
volumes:
- ./data:/data:ro
Step 3: Verify File Integrity
# Check file count
ls -l /path/to/data/Address_35_*.csv | wc -l
# Check Location file exists
ls -l /path/to/data/Location_35.csv
# Sample first few rows
head -5 /path/to/data/Address_35_part_1.csv
head -5 /path/to/data/Location_35.csv
Run Import via Admin UI
Step 1: Navigate to NAR Import Tab
- Log in as SUPER_ADMIN or MAP_ADMIN
- Click Map → Locations in sidebar
- Click NAR Import tab
- Available datasets load automatically
Step 2: Select Province
┌─────────────────────────────────────────┐
│ Available NAR Datasets │
├─────────────────────────────────────────┤
│ Province │ Files │ Records │
├──────────────────┼───────┼──────────────┤
│ Ontario (35) │ 3 │ 1,200,000 │
│ Quebec (24) │ 6 │ 850,000 │
│ Alberta (48) │ 2 │ 450,000 │
└──────────────────┴───────┴──────────────┘
[Select Province: Ontario ▼]
Step 3: Configure Filters (Optional)
Filters (Optional):
City: [TORONTO ]
Filter by exact municipality name (uppercase)
Postal Code Prefix: [M5 ]
Filter by postal code prefix (2-3 chars)
Cut Boundary: [Downtown Core ▼ ]
Only import locations within cut polygon
☑ Residential Only
Only import buildings with BU_USE = 1
Step 4: Review Import Summary
Import Summary:
Province: Ontario (35)
Files: Address_35_part_1.csv
Address_35_part_2.csv
Address_35_part_3.csv
Location_35.csv
Filters:
City: TORONTO
Postal Code: M5
Cut: Downtown Core
Residential Only: Yes
Estimated Records: ~50,000 (after filters)
Estimated Time: ~3 minutes
[Cancel] [Start Import]
Step 5: Monitor Progress
Import in Progress...
Current File: Address_35_part_2.csv
Progress: 600,000 / 1,200,000 (50%)
[████████████░░░░░░░░░░░░] 50%
Statistics:
Processed: 600,000
Imported: 580,000
Skipped: 15,000
Errors: 5,000
[Cancel Import]
Step 6: Review Results
Import Complete!
Final Statistics:
Total Processed: 1,200,000
Successfully Imported: 1,150,000
Skipped (Filters): 45,000
Errors: 5,000
Details:
Locations Created: 800,000
Locations Updated: 350,000
Addresses Created: 1,150,000
Processing Time: 14m 32s
Avg Records/Second: 1,375
[View Import Log] [Import Another Province] [Close]
Import via API
Step 1: Get Available Datasets
curl -X GET http://localhost:4000/api/locations/nar/datasets \
-H "Authorization: Bearer $TOKEN"
Step 2: Start Import
curl -X POST http://localhost:4000/api/locations/nar/import \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"provinceCode": "35",
"city": "TORONTO",
"postalCodePrefix": "M5",
"residentialOnly": true
}'
Step 3: Poll Job Status
JOB_ID="nar-import-35-20250213-103000"
while true; do
STATUS=$(curl -s -X GET \
http://localhost:4000/api/locations/nar/import/$JOB_ID \
-H "Authorization: Bearer $TOKEN" \
| jq -r '.status')
if [ "$STATUS" = "completed" ] || [ "$STATUS" = "failed" ]; then
break
fi
sleep 5
done
# Get final result
curl -X GET http://localhost:4000/api/locations/nar/import/$JOB_ID \
-H "Authorization: Bearer $TOKEN" | jq
Coordinate Conversion
Proj4 Integration
Installation:
npm install proj4
npm install -D @types/proj4 # TypeScript types are published separately
Service Implementation:
// nar-import.service.ts
import proj4 from 'proj4';
// Define EPSG:3347 (Statistics Canada Lambert)
proj4.defs('EPSG:3347',
'+proj=lcc +lat_1=49 +lat_2=77 +lat_0=63.390675 ' +
'+lon_0=-91.86666666666666 +x_0=6200000 +y_0=3000000 ' +
'+ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs'
);
interface Coordinates {
latitude: number;
longitude: number;
}
class NARImportService {
/**
* Convert NAR BG_X/BG_Y (EPSG:3347) to WGS84 lat/lng
*/
convertCoordinates(bgX: number, bgY: number): Coordinates | null {
try {
// Validate inputs
if (!bgX || !bgY || bgX < 0 || bgY < 0) {
logger.warn('Invalid BG_X/BG_Y coordinates:', { bgX, bgY });
return null;
}
// Convert: EPSG:3347 → WGS84
const [longitude, latitude] = proj4('EPSG:3347', 'WGS84', [bgX, bgY]);
// Validate output (Canada bounds)
if (
latitude < 41.0 || latitude > 84.0 || // Canada latitude range
longitude < -141.0 || longitude > -52.0 // Canada longitude range
) {
logger.warn('Converted coordinates outside Canada:', { latitude, longitude });
return null;
}
return { latitude, longitude };
} catch (error) {
logger.error('Coordinate conversion failed:', error);
return null;
}
}
/**
* Get coordinates from NAR record (try BG_X/BG_Y, fallback to lat/lng)
*/
getCoordinates(narLocation: NARLocationRecord): Coordinates | null {
// Primary: Convert BG_X/BG_Y
if (narLocation.BG_X && narLocation.BG_Y) {
const coords = this.convertCoordinates(narLocation.BG_X, narLocation.BG_Y);
if (coords) return coords;
}
// Fallback: Use BG_LATITUDE/BG_LONGITUDE directly
if (narLocation.BG_LATITUDE && narLocation.BG_LONGITUDE) {
return {
latitude: narLocation.BG_LATITUDE,
longitude: narLocation.BG_LONGITUDE
};
}
return null;
}
}
Conversion Examples
Example 1: Toronto City Hall (approximate EPSG:3347 input)
const bgX = 7220000; // approx. EPSG:3347 X (note the 6,200,000 m false easting)
const bgY = 935000; // approx. EPSG:3347 Y
const coords = convertCoordinates(bgX, bgY);
// Result: roughly { latitude: 43.65, longitude: -79.38 }
Example 2: Parliament Hill, Ottawa (approximate EPSG:3347 input)
const bgX = 7460000; // approx. EPSG:3347 X
const bgY = 1200000; // approx. EPSG:3347 Y
const coords = convertCoordinates(bgX, bgY);
// Result: roughly { latitude: 45.42, longitude: -75.70 }
Example 3: Invalid Coordinates
const bgX = -1000; // Negative (invalid)
const bgY = 0; // Zero (invalid)
const coords = convertCoordinates(bgX, bgY);
// Result: null
Validation
Canada Bounds Check:
const isWithinCanada = (lat: number, lng: number): boolean => {
return (
lat >= 41.0 && lat <= 84.0 && // Latitude: Pelee Island to Alert
lng >= -141.0 && lng <= -52.0 // Longitude: Yukon to Newfoundland
);
};
Precision Check:
// NAR coordinates should have 2-6 decimal places. Note: toString() drops
// trailing zeros (43.60 → "43.6"), so this is a heuristic, not a strict check.
const hasValidPrecision = (value: number): boolean => {
const str = value.toString();
const decimals = str.split('.')[1]?.length || 0;
return decimals >= 2 && decimals <= 6;
};
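The bounds and precision checks compose into a single validation gate for imported coordinates. A sketch (the combined helper name is illustrative):

```typescript
// Canada bounds check (Pelee Island to Alert; Yukon to Newfoundland).
const isWithinCanada = (lat: number, lng: number): boolean =>
  lat >= 41.0 && lat <= 84.0 && lng >= -141.0 && lng <= -52.0;

// Heuristic precision check: 2-6 decimal places (toString drops trailing zeros).
const hasValidPrecision = (value: number): boolean => {
  const decimals = value.toString().split('.')[1]?.length ?? 0;
  return decimals >= 2 && decimals <= 6;
};

// A coordinate pair passes only if it is inside Canada and plausibly precise.
function isPlausibleNARCoordinate(lat: number, lng: number): boolean {
  return isWithinCanada(lat, lng) && hasValidPrecision(lat) && hasValidPrecision(lng);
}
```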
Multi-Part File Handling
Large Province Processing
Quebec (Province Code 24):
- 6 Address files: Address_24_part_1.csv through Address_24_part_6.csv
- 1 Location file: Location_24.csv
- Total records: ~850,000
Ontario (Province Code 35):
- 3 Address files: Address_35_part_1.csv through Address_35_part_3.csv
- 1 Location file: Location_35.csv
- Total records: ~1,200,000
Sequential File Reading
// nar-import.service.ts
async processAddressFiles(provinceCode: string): Promise<Map<string, AddressRecord[]>> {
const addressMap = new Map<string, AddressRecord[]>();
// Find all Address files for province
const files = await fs.readdir(NAR_DATA_DIR);
const addressFiles = files
.filter(f => f.match(new RegExp(`^Address_${provinceCode}(?:_part_\\d+)?\\.csv$`)))
.sort(); // Ensure part_1, part_2, ... order
logger.info(`Processing ${addressFiles.length} address files for province ${provinceCode}`);
// Process each file sequentially
for (const file of addressFiles) {
logger.info(`Reading ${file}...`);
const filePath = path.join(NAR_DATA_DIR, file);
const stream = createReadStream(filePath); // from 'fs' — fs/promises has no createReadStream
const parser = stream.pipe(csvParser());
let rowCount = 0;
for await (const row of parser) {
const locGuid = row.LOC_GUID;
if (!addressMap.has(locGuid)) {
addressMap.set(locGuid, []);
}
addressMap.get(locGuid)!.push({
addrGuid: row.ADDR_GUID,
civicNo: row.CIVIC_NO,
streetName: row.OFFICIAL_STREET_NAME,
postalCode: row.POSTAL_CODE,
municipality: row.MUNICIPALITY
});
rowCount++;
if (rowCount % 10000 === 0) {
logger.debug(`Processed ${rowCount} addresses from ${file}`);
}
}
logger.info(`Completed ${file}: ${rowCount} addresses`);
}
logger.info(`Total unique locations: ${addressMap.size}`);
return addressMap;
}
Memory Management
Streaming Strategy:
// Process files in chunks to avoid memory overflow
async processInChunks(
addressMap: Map<string, AddressRecord[]>,
locationFile: string,
batchSize: number = 500
): Promise<ImportResult> {
const locationPath = path.join(NAR_DATA_DIR, locationFile);
const stream = fs.createReadStream(locationPath);
const parser = stream.pipe(csvParser());
let batch: LocationImport[] = [];
let stats = { imported: 0, skipped: 0, errors: 0 };
for await (const row of parser) {
const locGuid = row.LOC_GUID;
const addresses = addressMap.get(locGuid);
if (!addresses || addresses.length === 0) {
stats.skipped++;
continue;
}
// Apply filters
if (!this.passesFilters(row, addresses)) {
stats.skipped++;
continue;
}
// Convert coordinates
const coords = this.getCoordinates(row);
if (!coords) {
stats.errors++;
continue;
}
batch.push({ location: row, addresses, coords });
// Import batch when full
if (batch.length >= batchSize) {
await this.importBatch(batch);
stats.imported += batch.length;
batch = [];
}
}
// Import remaining
if (batch.length > 0) {
await this.importBatch(batch);
stats.imported += batch.length;
}
return stats;
}
Batch Transaction:
async importBatch(batch: LocationImport[]): Promise<void> {
await prisma.$transaction(async (tx) => {
for (const item of batch) {
// Upsert location
const location = await tx.location.upsert({
where: { locGuid: item.location.LOC_GUID },
update: {
address: this.formatAddress(item.addresses[0]),
latitude: item.coords.latitude,
longitude: item.coords.longitude,
postalCode: item.addresses[0].postalCode,
federalDistrict: item.location.FED_NUM,
buildingUse: parseInt(item.location.BU_USE),
municipality: item.location.MUNICIPALITY,
geocodedAt: new Date()
},
create: {
locGuid: item.location.LOC_GUID,
address: this.formatAddress(item.addresses[0]),
latitude: item.coords.latitude,
longitude: item.coords.longitude,
postalCode: item.addresses[0].postalCode,
federalDistrict: item.location.FED_NUM,
buildingUse: parseInt(item.location.BU_USE),
municipality: item.location.MUNICIPALITY,
geocodeConfidence: 100,
geocodeProvider: 'NAR',
geocodedAt: new Date()
}
});
// Insert addresses
for (const addr of item.addresses) {
await tx.address.upsert({
where: { addrGuid: addr.addrGuid },
update: { locationId: location.id },
create: {
addrGuid: addr.addrGuid,
locationId: location.id,
unitNumber: addr.civicNo
}
});
}
}
});
}
Code Examples
LocationsPage - NAR Import Tab
// LocationsPage.tsx
import React, { useEffect, useState } from 'react';
import { Tabs, Table, Button, Select, Input, Checkbox, Card, Progress, message } from 'antd';
import { UploadOutlined } from '@ant-design/icons';
import { api } from '@/lib/api';
const NARImportTab: React.FC = () => {
const [datasets, setDatasets] = useState<NARDataset[]>([]);
const [selectedProvince, setSelectedProvince] = useState<string | null>(null);
const [filters, setFilters] = useState({
city: '',
postalCodePrefix: '',
cutId: null as number | null,
residentialOnly: true
});
const [importing, setImporting] = useState(false);
const [progress, setProgress] = useState<ImportProgress | null>(null);
const [jobId, setJobId] = useState<string | null>(null);
useEffect(() => {
fetchDatasets();
}, []);
useEffect(() => {
if (jobId && importing) {
const interval = setInterval(pollProgress, 2000);
return () => clearInterval(interval);
}
}, [jobId, importing]);
const fetchDatasets = async () => {
try {
const { data } = await api.get<{ datasets: NARDataset[] }>('/locations/nar/datasets');
setDatasets(data.datasets);
} catch (error) {
message.error('Failed to load NAR datasets');
}
};
const pollProgress = async () => {
if (!jobId) return;
try {
const { data } = await api.get(`/locations/nar/import/${jobId}`);
if (data.status === 'completed') {
setImporting(false);
setProgress(null);
message.success(`Import complete! Imported ${data.result.imported} locations.`);
} else if (data.status === 'failed') {
setImporting(false);
setProgress(null);
message.error('Import failed. Check logs for details.');
} else {
setProgress(data.progress);
}
} catch (error) {
message.error('Failed to fetch import progress');
}
};
const startImport = async () => {
if (!selectedProvince) {
message.warning('Please select a province');
return;
}
try {
const { data } = await api.post('/locations/nar/import', {
provinceCode: selectedProvince,
...filters
});
setJobId(data.jobId);
setImporting(true);
message.info('Import started...');
} catch (error) {
message.error('Failed to start import');
}
};
const datasetColumns = [
{ title: 'Province', dataIndex: 'provinceName', key: 'name' },
{ title: 'Files', dataIndex: 'addressFileCount', key: 'files' },
{ title: 'Estimated Records', dataIndex: 'estimatedRecords', key: 'records',
render: (val: number) => val.toLocaleString() },
{ title: 'Last Modified', dataIndex: 'lastModified', key: 'modified',
render: (val: string) => new Date(val).toLocaleDateString() }
];
return (
<div>
<Card title="Available NAR Datasets" style={{ marginBottom: 24 }}>
<Table
dataSource={datasets}
columns={datasetColumns}
rowKey="provinceCode"
pagination={false}
onRow={(record) => ({
onClick: () => setSelectedProvince(record.provinceCode),
style: {
cursor: 'pointer',
backgroundColor: selectedProvince === record.provinceCode ? '#e6f7ff' : undefined
}
})}
/>
</Card>
{selectedProvince && (
<Card title="Import Configuration">
<div style={{ marginBottom: 16 }}>
<label>Province: </label>
<strong>{datasets.find(d => d.provinceCode === selectedProvince)?.provinceName}</strong>
</div>
<div style={{ marginBottom: 16 }}>
<label>City (Optional): </label>
<Input
style={{ width: 300 }}
placeholder="TORONTO"
value={filters.city}
onChange={e => setFilters({ ...filters, city: e.target.value.toUpperCase() })}
/>
</div>
<div style={{ marginBottom: 16 }}>
<label>Postal Code Prefix (Optional): </label>
<Input
style={{ width: 200 }}
placeholder="M5"
value={filters.postalCodePrefix}
onChange={e => setFilters({ ...filters, postalCodePrefix: e.target.value.toUpperCase() })}
/>
</div>
<div style={{ marginBottom: 16 }}>
<Checkbox
checked={filters.residentialOnly}
onChange={e => setFilters({ ...filters, residentialOnly: e.target.checked })}
>
Residential Only
</Checkbox>
</div>
<Button
type="primary"
icon={<UploadOutlined />}
onClick={startImport}
loading={importing}
disabled={importing}
>
Start Import
</Button>
</Card>
)}
{importing && progress && (
<Card title="Import Progress" style={{ marginTop: 24 }}>
<Progress percent={progress.percent} status="active" />
<div style={{ marginTop: 16 }}>
<p>Processed: {progress.processed.toLocaleString()} / {progress.total.toLocaleString()}</p>
<p>Imported: {progress.imported.toLocaleString()}</p>
<p>Skipped: {progress.skipped.toLocaleString()}</p>
<p>Errors: {progress.errors.toLocaleString()}</p>
</div>
</Card>
)}
</div>
);
};
NAR Import Service - Full Implementation
// nar-import.service.ts
import fs from 'fs/promises';
import { createReadStream } from 'fs'; // streaming API lives on 'fs', not 'fs/promises'
import path from 'path';
import csvParser from 'csv-parser';
import proj4 from 'proj4';
import { prisma } from '@/config/database';
import { logger } from '@/utils/logger';
// Define EPSG:3347
proj4.defs('EPSG:3347',
'+proj=lcc +lat_1=49 +lat_2=77 +lat_0=63.390675 ' +
'+lon_0=-91.86666666666666 +x_0=6200000 +y_0=3000000 ' +
'+ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs'
);
const NAR_DATA_DIR = process.env.NAR_DATA_DIR || '/data';
const BATCH_SIZE = parseInt(process.env.NAR_BATCH_SIZE || '500', 10);
interface NARAddressRecord {
ADDR_GUID: string;
LOC_GUID: string;
CIVIC_NO: string;
OFFICIAL_STREET_NAME: string;
POSTAL_CODE: string;
MUNICIPALITY: string;
}
interface NARLocationRecord {
LOC_GUID: string;
BG_LATITUDE?: number;
BG_LONGITUDE?: number;
BG_X?: number;
BG_Y?: number;
FED_NUM: string;
BU_USE: string;
MUNICIPALITY: string;
}
export class NARImportService {
async importProvince(
provinceCode: string,
filters: {
city?: string;
postalCodePrefix?: string;
cutId?: number;
residentialOnly?: boolean;
}
): Promise<ImportResult> {
logger.info(`Starting NAR import for province ${provinceCode}`, { filters });
// Load address files into memory map
const addressMap = await this.loadAddressFiles(provinceCode, filters);
// Process location file and import
const result = await this.processLocationFile(provinceCode, addressMap, filters);
logger.info(`NAR import complete for province ${provinceCode}`, result);
return result;
}
private async loadAddressFiles(
provinceCode: string,
filters: { city?: string; postalCodePrefix?: string }
): Promise<Map<string, NARAddressRecord[]>> {
const addressMap = new Map<string, NARAddressRecord[]>();
const files = await fs.readdir(NAR_DATA_DIR);
const addressFiles = files
.filter(f => f.match(new RegExp(`^Address_${provinceCode}(?:_part_\\d+)?\\.csv$`)))
.sort();
for (const file of addressFiles) {
logger.info(`Reading ${file}...`);
const filePath = path.join(NAR_DATA_DIR, file);
const stream = createReadStream(filePath);
const parser = stream.pipe(csvParser());
for await (const row of parser) {
// Apply filters
if (filters.city && row.MUNICIPALITY !== filters.city) continue;
if (filters.postalCodePrefix && !row.POSTAL_CODE.startsWith(filters.postalCodePrefix)) continue;
const locGuid = row.LOC_GUID;
if (!addressMap.has(locGuid)) {
addressMap.set(locGuid, []);
}
addressMap.get(locGuid)!.push(row);
}
}
logger.info(`Loaded ${addressMap.size} unique locations`);
return addressMap;
}
private async processLocationFile(
provinceCode: string,
addressMap: Map<string, NARAddressRecord[]>,
filters: { cutId?: number; residentialOnly?: boolean }
): Promise<ImportResult> {
const locationFile = `Location_${provinceCode}.csv`;
const filePath = path.join(NAR_DATA_DIR, locationFile);
const stream = require('fs').createReadStream(filePath);
const parser = stream.pipe(csvParser());
// Fetch the cut geometry once up front, rather than querying it per row
const cut = filters.cutId
? await prisma.cut.findUnique({ where: { id: filters.cutId } })
: null;
let batch: any[] = [];
const stats = { imported: 0, skipped: 0, errors: 0, total: 0 };
for await (const row of parser) {
stats.total++;
const locGuid = row.LOC_GUID;
const addresses = addressMap.get(locGuid);
if (!addresses || addresses.length === 0) {
stats.skipped++;
continue;
}
// Residential filter (NAR building-use code 1 = residential)
if (filters.residentialOnly && parseInt(row.BU_USE, 10) !== 1) {
stats.skipped++;
continue;
}
// Convert coordinates
const coords = this.getCoordinates(row);
if (!coords) {
stats.errors++;
continue;
}
// Cut filter (if specified)
if (cut && !this.isPointInPolygon([coords.longitude, coords.latitude], cut.geojson)) {
stats.skipped++;
continue;
}
batch.push({ location: row, addresses, coords });
if (batch.length >= BATCH_SIZE) {
await this.importBatch(batch);
stats.imported += batch.length;
batch = [];
}
}
if (batch.length > 0) {
await this.importBatch(batch);
stats.imported += batch.length;
}
return stats;
}
private getCoordinates(row: NARLocationRecord): { latitude: number; longitude: number } | null {
// csv-parser yields string values, so coerce before projecting
const x = Number(row.BG_X);
const y = Number(row.BG_Y);
// Try BG_X/BG_Y conversion (EPSG:3347 must be registered via proj4.defs at startup)
if (x && y) {
try {
const [lng, lat] = proj4('EPSG:3347', 'WGS84', [x, y]);
// Sanity check: converted point must fall inside Canada's bounding box
if (lat >= 41 && lat <= 84 && lng >= -141 && lng <= -52) {
return { latitude: lat, longitude: lng };
}
} catch (error) {
logger.warn('Coordinate conversion failed:', error);
}
}
// Fall back to BG_LATITUDE/BG_LONGITUDE when present
const fallbackLat = Number(row.BG_LATITUDE);
const fallbackLng = Number(row.BG_LONGITUDE);
if (fallbackLat && fallbackLng) {
return { latitude: fallbackLat, longitude: fallbackLng };
}
return null;
}
private async importBatch(batch: any[]): Promise<void> {
await prisma.$transaction(async (tx) => {
for (const item of batch) {
const location = await tx.location.upsert({
where: { locGuid: item.location.LOC_GUID },
update: {
address: this.formatAddress(item.addresses[0]),
latitude: item.coords.latitude,
longitude: item.coords.longitude,
postalCode: item.addresses[0].POSTAL_CODE,
federalDistrict: item.location.FED_NUM,
buildingUse: parseInt(item.location.BU_USE),
municipality: item.location.MUNICIPALITY
},
create: {
locGuid: item.location.LOC_GUID,
address: this.formatAddress(item.addresses[0]),
latitude: item.coords.latitude,
longitude: item.coords.longitude,
postalCode: item.addresses[0].POSTAL_CODE,
federalDistrict: item.location.FED_NUM,
buildingUse: parseInt(item.location.BU_USE),
municipality: item.location.MUNICIPALITY,
geocodeConfidence: 100,
geocodeProvider: 'NAR'
}
});
for (const addr of item.addresses) {
await tx.address.upsert({
where: { addrGuid: addr.ADDR_GUID },
update: {},
create: {
addrGuid: addr.ADDR_GUID,
locationId: location.id,
unitNumber: addr.CIVIC_NO
}
});
}
}
});
}
private formatAddress(addr: NARAddressRecord): string {
return `${addr.CIVIC_NO} ${addr.OFFICIAL_STREET_NAME}`.trim();
}
private isPointInPolygon(point: [number, number], geojson: any): boolean {
// Delegates to the shared ray-casting implementation in spatial.ts.
// Shown as a placeholder here; returning true admits every point.
return true;
}
}
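The `isPointInPolygon` placeholder above defers to `spatial.ts`, which is not shown here. For reference, a minimal ray-casting sketch that tests a point against a GeoJSON Polygon's outer ring (holes and MultiPolygons ignored; this is an illustrative standalone version, not the actual `spatial.ts` code) could look like:

```typescript
type Position = [number, number];

// Ray-casting point-in-polygon test against a single closed ring.
// A ray cast east from the point toggles "inside" each time it
// crosses an edge of the ring.
function pointInRing([x, y]: Position, ring: Position[]): boolean {
  let inside = false;
  for (let i = 0, j = ring.length - 1; i < ring.length; j = i++) {
    const [xi, yi] = ring[i];
    const [xj, yj] = ring[j];
    if ((yi > y) !== (yj > y) && x < ((xj - xi) * (y - yi)) / (yj - yi) + xi) {
      inside = !inside;
    }
  }
  return inside;
}
```

For a GeoJSON `Polygon`, the outer ring is `geojson.coordinates[0]`; supporting `MultiPolygon` cut geometries would mean testing each polygon's outer ring in turn.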
Troubleshooting
Problem: No datasets found
Symptoms:
- GET /api/locations/nar/datasets returns empty array
- "No datasets available" message in admin
Solutions:
- Verify NAR_DATA_DIR path:
echo $NAR_DATA_DIR
ls -la /data
- Check Docker volume mount:
# docker-compose.yml
services:
api:
volumes:
- ./data:/data:ro
- Verify file naming convention:
# Correct:
Address_35_part_1.csv
Location_35.csv
# Incorrect:
address_35.csv # Lowercase
Addresses_35.csv # Plural
Address35.csv # No underscore
- Check file permissions:
chmod 644 /data/Address_*.csv
chmod 644 /data/Location_*.csv
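The scanner matches these names with a case-sensitive pattern (the same one `loadAddressFiles` builds per province). A quick check for Ontario (province code 35) shows why the "incorrect" names above are skipped:

```typescript
// Same shape of pattern loadAddressFiles builds for province code "35":
// optional _part_N suffix, exact case, .csv extension.
const pattern = /^Address_35(?:_part_\d+)?\.csv$/;

const accepted = ['Address_35.csv', 'Address_35_part_1.csv'];
const rejected = ['address_35.csv', 'Addresses_35.csv', 'Address35.csv'];

const allAccepted = accepted.every((f) => pattern.test(f));
const allRejected = rejected.every((f) => !pattern.test(f));
```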
Problem: Coordinate conversion errors
Symptoms:
- Many locations skipped during import
- "Converted coordinates outside Canada" warnings
- Null latitude/longitude in database
Solutions:
- Verify BG_X/BG_Y values:
// EPSG:3347 uses a false easting of 6,200,000 m and a false northing of
// 3,000,000 m, so plausible BG_X/BG_Y values are in the millions of metres.
// Six-digit values that look like UTM eastings usually indicate data in
// the wrong source projection.
console.log('BG_X:', narRecord.BG_X);
console.log('BG_Y:', narRecord.BG_Y);
- Test with known coordinates:
// Round-trip a known WGS84 point (Toronto City Hall) through the
// registered projection and confirm the inverse matches:
const [x, y] = proj4('WGS84', 'EPSG:3347', [-79.3832, 43.6532]);
const [lng, lat] = proj4('EPSG:3347', 'WGS84', [x, y]);
console.log('Expected: 43.6532, -79.3832');
console.log('Got:', lat, lng);
- Fallback to BG_LATITUDE/BG_LONGITUDE:
// If BG_X/BG_Y missing or invalid, use lat/lng directly
if (!coords && narRecord.BG_LATITUDE && narRecord.BG_LONGITUDE) {
coords = {
latitude: narRecord.BG_LATITUDE,
longitude: narRecord.BG_LONGITUDE
};
}
- Check proj4 definition:
npm list proj4
# Ensure version 2.8.0+
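Note that proj4 only ships with a handful of built-in projections, so EPSG:3347 (Statistics Canada Lambert, NAD83) must be registered once at startup before any conversion. A sketch of that registration, using the definition string published at epsg.io/3347:

```typescript
import proj4 from 'proj4';

// Register EPSG:3347 so proj4('EPSG:3347', 'WGS84', [...]) can resolve it.
// Definition string from epsg.io/3347 (Statistics Canada Lambert, NAD83).
proj4.defs(
  'EPSG:3347',
  '+proj=lcc +lat_1=49 +lat_2=77 +lat_0=63.390675 +lon_0=-91.86666666666666 ' +
  '+x_0=6200000 +y_0=3000000 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs'
);
```

If the definition is missing, proj4 throws on the first conversion attempt, which surfaces as the coordinate conversion warnings described above.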
Problem: Import very slow (> 30min for 100k records)
Symptoms:
- Import hangs on large provinces
- Memory usage grows over time
- Database connection timeouts
Solutions:
- Increase batch size:
NAR_BATCH_SIZE=1000 # Default: 500
- Use streaming instead of loading all addresses:
// DON'T do this (loads all into memory):
const allAddresses = await readAllAddressFiles();
// DO this (stream and process incrementally):
for await (const addressBatch of streamAddressFiles()) {
processBatch(addressBatch);
}
- Optimize database indexes (note the quoted camelCase identifiers; unquoted names are folded to lowercase by PostgreSQL and will not match Prisma's columns):
CREATE INDEX CONCURRENTLY idx_locations_loc_guid ON "Location"("locGuid");
CREATE INDEX CONCURRENTLY idx_addresses_addr_guid ON "Address"("addrGuid");
- Disable geocoding during import:
// Skip geocoding service since NAR already has coordinates
geocodeConfidence: 100,
geocodeProvider: 'NAR'
// No call to geocodingService.geocode()
- Use worker threads for parallel processing:
import { Worker } from 'worker_threads';
const workers = [];
for (let i = 0; i < 4; i++) {
const worker = new Worker('./nar-import-worker.js');
workers.push(worker);
}
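The `streamAddressFiles()` helper shown above is hypothetical; its batching half can be sketched as a generic async generator that works over any async-iterable row source, such as a csv-parser stream:

```typescript
// Groups rows from any async-iterable source into fixed-size batches,
// so only one batch is resident in memory at a time.
async function* batched<T>(rows: AsyncIterable<T>, size: number): AsyncGenerator<T[]> {
  let batch: T[] = [];
  for await (const row of rows) {
    batch.push(row);
    if (batch.length >= size) {
      yield batch;
      batch = [];
    }
  }
  if (batch.length > 0) yield batch; // flush the final partial batch
}
```

Usage would then be `for await (const addressBatch of batched(parser, 500)) { ... }`, replacing the load-everything `addressMap` approach for provinces too large to hold in memory.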
Problem: Duplicate LOC_GUID errors
Symptoms:
- Unique constraint violation on locGuid
- Import fails mid-process
- "Duplicate key value violates unique constraint" error
Solutions:
- Use UPSERT instead of INSERT:
await prisma.location.upsert({
where: { locGuid: narRecord.LOC_GUID },
update: { /* update fields */ },
create: { /* create fields */ }
});
- Check for corrupt NAR files:
# Count unique LOC_GUIDs
cut -d, -f2 Address_35_part_1.csv | sort | uniq | wc -l
# Check for duplicates
cut -d, -f2 Address_35_part_1.csv | sort | uniq -d
- Clean up partial imports:
-- Delete locations from failed import
DELETE FROM "Location" WHERE "geocodeProvider" = 'NAR' AND "createdAt" > '2025-02-13';
- Implement transaction rollback on error:
try {
await prisma.$transaction(async (tx) => {
// Import batch
});
} catch (error) {
logger.error('Batch failed, rolling back:', error);
// Transaction automatically rolled back
}
Performance Considerations
Import Speed
Benchmarks:
| Province | Records | Files | Time | Records/Second |
|---|---|---|---|---|
| PEI (11) | 15,000 | 1 | 12s | 1,250 |
| Nova Scotia (12) | 85,000 | 1 | 1m 10s | 1,214 |
| Quebec (24) | 850,000 | 6 | 11m 20s | 1,250 |
| Ontario (35) | 1,200,000 | 3 | 14m 30s | 1,379 |
Factors:
- Batch size: 500 (optimal for most systems)
- Coordinate conversion: ~0.1ms per record
- Database write: ~0.5ms per location (depends on disk speed)
- Total overhead: ~0.7ms per record
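As a sanity check, these per-record costs line up with the benchmark table. A back-of-envelope sketch (the 0.1 ms for parsing and filtering is an assumed remainder, not a measured figure):

```typescript
// Rough single-threaded throughput ceiling from the per-record costs above.
const conversionMs = 0.1;  // coordinate conversion
const dbWriteMs = 0.5;     // database write (amortized per location)
const parseFilterMs = 0.1; // assumed: CSV parsing + filtering remainder
const perRecordMs = conversionMs + dbWriteMs + parseFilterMs; // ~0.7 ms
const recordsPerSecond = Math.round(1000 / perRecordMs);
// ~1,400 records/s, consistent with the observed 1,200-1,400 records/s
```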
Memory Usage
Peak Memory:
- Address map (in-memory): ~200MB per 100k records
- CSV parser buffer: ~10MB
- Batch buffer: ~5MB (500 records)
- Total: ~220MB per 100k records
Optimization:
- Stream address files instead of loading all
- Process location file in chunks
- Clear batch after each commit
- Limit concurrent transactions
Database Load
Transaction Rate:
- 1 transaction per batch (500 records)
- ~2-3 transactions/second
- Low database CPU (~10-20%)
- Moderate disk I/O (sequential writes)
Connection Pool:
# Prisma reads the pool size from the connection string,
# not from a field in schema.prisma:
DATABASE_URL="postgresql://user:password@host:5432/db?connection_limit=10"
Related Documentation
Backend Documentation
- NAR Import Service: api/src/modules/map/locations/nar-import.service.ts
  - File scanning
  - Streaming CSV parser
  - Coordinate conversion
  - Batch import
- NAR Import Routes: api/src/modules/map/locations/nar-import.routes.ts
  - Dataset discovery
  - Import job creation
  - Progress tracking
- Locations Service: api/src/modules/map/locations/locations.service.ts
  - Location CRUD
  - Geocoding integration
Frontend Documentation
- Locations Page: admin/src/pages/LocationsPage.tsx
  - NAR Import tab
  - Dataset selection
  - Filter configuration
  - Progress monitoring
Database Documentation
- Location Model: api/prisma/schema.prisma
  - NAR-specific fields
  - locGuid unique constraint
  - Federal district index
- Address Model: api/prisma/schema.prisma
  - addrGuid unique constraint
  - Location foreign key
External Resources
- Elections Canada NAR: https://www.elections.ca/content.aspx?section=res&dir=cir/tech/nar&document=index&lang=e
- EPSG:3347 Definition: https://epsg.io/3347
- Proj4 Documentation: https://github.com/proj4js/proj4js
- NAR Data Dictionary: Elections Canada NAR Technical Documentation (PDF)