NAR Import System¶
Overview¶
The National Address Register (NAR) import system enables bulk import of Canadian electoral data from Elections Canada. The system supports the 2025 NAR format with server-side streaming import, coordinate projection conversion, and comprehensive filtering options.
Key Features:
- Server-side streaming import (handles large datasets)
- NAR 2025 format support (BG_X/BG_Y Lambert projection)
- Address + Location file joining on LOC_GUID
- Proj4 coordinate conversion (EPSG:3347 → WGS84)
- Province selector (13 provinces/territories)
- Filtering: city, postal code, cut boundary, residential-only
- Multi-part file handling (large provinces)
- Progress tracking and error reporting
- Import statistics and validation
Use Cases:
- Initial campaign database setup
- Electoral district targeting
- NAR data updates (new redistribution)
- Multi-region campaign expansion
- Address database verification
Architecture Highlights:
- Streaming CSV parser (avoids memory limits)
- File-based LOC_GUID join
- Real-time coordinate projection
- Point-in-polygon cut filtering
- Transaction batching (500 records/commit)
- Duplicate prevention via UPSERT
Architecture¶
flowchart TB
subgraph Admin Interface
Admin[Admin User]
LocationsPage[LocationsPage - NAR Tab]
end
subgraph API Layer
DatasetsAPI["/api/locations/nar/datasets"]
ImportAPI["/api/locations/nar/import"]
end
subgraph NAR Import Service
Scanner[File Scanner]
Reader[CSV Stream Reader]
Joiner[Address+Location Joiner]
Converter[Coordinate Converter]
Filter[Filter Pipeline]
Importer[Bulk Importer]
end
subgraph File System
DataDir[/data/NAR Files]
AddressFiles[Address_XX_part_*.csv]
LocationFiles[Location_XX.csv]
end
subgraph Database
LocationsDB[(Locations)]
AddressesDB[(Addresses)]
end
subgraph External Services
Proj4[Proj4 Library]
EPSG3347[EPSG:3347 Definition]
end
Admin --> LocationsPage
LocationsPage --> DatasetsAPI
LocationsPage --> ImportAPI
DatasetsAPI --> Scanner
Scanner --> DataDir
ImportAPI --> Reader
Reader --> AddressFiles
Reader --> LocationFiles
Reader --> Joiner
Joiner --> Converter
Converter --> Proj4
Proj4 --> EPSG3347
Converter --> Filter
Filter --> Importer
Importer --> LocationsDB
Importer --> AddressesDB
Data Flow:
1. Dataset Discovery:
   - Scan /data directory for NAR CSV files
   - Group by province code (10-62)
   - Identify multi-part Address files
   - Return available datasets
2. Import Initiation:
   - Admin selects province + filters
   - API creates import job
   - Begins streaming CSV files
3. File Processing:
   - Read Address files (all parts sequentially)
   - Read Location file (streamed once the address map is built)
   - Join on LOC_GUID (in-memory map)
4. Coordinate Conversion:
   - Extract BG_X/BG_Y from Location file
   - Convert EPSG:3347 → WGS84 using Proj4
   - Fall back to BG_LATITUDE/BG_LONGITUDE if conversion fails
5. Filtering:
   - City filter (exact match on MUNICIPALITY)
   - Postal code filter (prefix match)
   - Cut filter (point-in-polygon)
   - Residential filter (BU_USE = 1)
6. Database Import:
   - UPSERT Locations by locGuid (prevents duplicates)
   - INSERT Addresses with foreign key
   - Batch commits (500 records)
   - Track progress and errors
NAR File Format¶
File Structure¶
Directory Layout:
/data/
├── Address_10.csv # Newfoundland
├── Address_11.csv # PEI
├── Address_12.csv # Nova Scotia
├── Address_13.csv # New Brunswick
├── Address_24_part_1.csv # Quebec (multi-part)
├── Address_24_part_2.csv
├── Address_24_part_3.csv
├── Address_24_part_4.csv
├── Address_24_part_5.csv
├── Address_24_part_6.csv
├── Address_35_part_1.csv # Ontario (multi-part)
├── Address_35_part_2.csv
├── ...
├── Location_10.csv
├── Location_11.csv
├── Location_12.csv
├── Location_13.csv
├── Location_24.csv
├── Location_35.csv
└── ...
Address File Schema¶
File: Address_XX_part_Y.csv
ADDR_GUID,LOC_GUID,CIVIC_NO,OFFICIAL_STREET_NAME,POSTAL_CODE,MUNICIPALITY,PROVINCE_CODE
{uuid},{uuid},123,MAIN ST,M5H2N2,TORONTO,35
{uuid},{uuid},125,MAIN ST,M5H2N2,TORONTO,35
{uuid},{uuid},127,MAIN ST,M5H2N2,TORONTO,35
Key Fields:
| Field | Type | Description | Example |
|---|---|---|---|
| ADDR_GUID | UUID | Unique address identifier | {12345678-...} |
| LOC_GUID | UUID | Location identifier (FK) | {87654321-...} |
| CIVIC_NO | String | Street number | 123, 123A, 123-125 |
| OFFICIAL_STREET_NAME | String | Street name (uppercase) | MAIN ST, YONGE ST |
| POSTAL_CODE | String | Canadian postal code (no space) | M5H2N2, K1A0B1 |
| MUNICIPALITY | String | City/town name | TORONTO, OTTAWA |
| PROVINCE_CODE | Integer | Province code (10-62) | 35 (Ontario) |
Record Count:
- Small provinces: 10k-50k addresses
- Medium provinces: 50k-200k addresses
- Large provinces: 200k-1M+ addresses (multi-part files)
Location File Schema¶
File: Location_XX.csv
LOC_GUID,BG_LATITUDE,BG_LONGITUDE,BG_X,BG_Y,FED_NUM,BU_USE,MUNICIPALITY
{uuid},43.6532,-79.3832,1234567.89,234567.89,35001,1,TORONTO
{uuid},43.6540,-79.3825,1234600.00,234600.00,35001,1,TORONTO
Key Fields:
| Field | Type | Description | Example |
|---|---|---|---|
| LOC_GUID | UUID | Unique location identifier | {87654321-...} |
| BG_LATITUDE | Float | Latitude (WGS84) | 43.6532 |
| BG_LONGITUDE | Float | Longitude (WGS84) | -79.3832 |
| BG_X | Float | X coord (EPSG:3347 Lambert) | 1234567.89 |
| BG_Y | Float | Y coord (EPSG:3347 Lambert) | 234567.89 |
| FED_NUM | String | Federal electoral district | 35001, 24050 |
| BU_USE | Integer | Building use code | 1 = Residential |
| MUNICIPALITY | String | City/town name | TORONTO |
Coordinate Systems:
- BG_LATITUDE/BG_LONGITUDE: WGS84 decimal degrees (EPSG:4326)
- BG_X/BG_Y: Statistics Canada Lambert Conformal Conic (EPSG:3347)
- 2025 NAR Change: Primary coordinates shifted from lat/lng to BG_X/BG_Y
Building Use Codes:
| Code | Description |
|---|---|
| 1 | Residential |
| 2 | Commercial |
| 3 | Industrial |
| 4 | Institutional |
| 5 | Parks/Recreation |
| 9 | Other |
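For filter code, the table above can be captured as a simple lookup; a minimal sketch (the `BUILDING_USE` map and `isResidential` helper are illustrative names, not part of the service API):

```typescript
// Building use codes from the NAR Location file (BU_USE column).
const BUILDING_USE: Record<number, string> = {
  1: 'Residential',
  2: 'Commercial',
  3: 'Industrial',
  4: 'Institutional',
  5: 'Parks/Recreation',
  9: 'Other',
};

// BU_USE arrives as a string column in the CSV, so parse before comparing.
const isResidential = (buUse: string): boolean => parseInt(buUse, 10) === 1;
```

The `residentialOnly` filter later on this page mirrors the `isResidential` check applied to each Location row.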
Database Models¶
Location Model Extensions¶
model Location {
id Int @id @default(autoincrement())
address String
latitude Float?
longitude Float?
postalCode String?
province String?
// NAR-specific fields
locGuid String? @unique // NAR LOC_GUID (UUID)
federalDistrict String? // NAR FED_NUM
buildingUse Int? // NAR BU_USE code
municipality String? // NAR MUNICIPALITY
// Geocoding metadata (populated during import)
geocodeConfidence Int? @default(100) // NAR = high confidence
geocodeProvider String? @default("NAR")
geocodedAt DateTime?
addresses Address[]
createdAt DateTime @default(now())
updatedAt DateTime @updatedAt
@@index([locGuid])
@@index([federalDistrict])
@@index([buildingUse])
@@index([postalCode])
}
Address Model Extensions¶
model Address {
id Int @id @default(autoincrement())
locationId Int
location Location @relation(fields: [locationId], references: [id], onDelete: Cascade)
// NAR-specific fields
addrGuid String? @unique // NAR ADDR_GUID (UUID)
unitNumber String? // NAR CIVIC_NO (if multi-unit)
// Voter data (future)
firstName String?
lastName String?
supportLevel Int?
createdAt DateTime @default(now())
updatedAt DateTime @updatedAt
@@index([locationId])
@@index([addrGuid])
}
UPSERT Strategy:
// Prevent duplicates on re-import
const location = await prisma.location.upsert({
where: { locGuid: narRecord.LOC_GUID },
update: {
address: narRecord.addressString,
latitude: coords.latitude,
longitude: coords.longitude,
postalCode: narRecord.POSTAL_CODE,
province: provinceMap[narRecord.PROVINCE_CODE],
federalDistrict: narRecord.FED_NUM,
buildingUse: narRecord.BU_USE,
municipality: narRecord.MUNICIPALITY,
geocodeProvider: 'NAR',
geocodedAt: new Date()
},
create: {
locGuid: narRecord.LOC_GUID,
address: narRecord.addressString,
latitude: coords.latitude,
longitude: coords.longitude,
postalCode: narRecord.POSTAL_CODE,
province: provinceMap[narRecord.PROVINCE_CODE],
federalDistrict: narRecord.FED_NUM,
buildingUse: narRecord.BU_USE,
municipality: narRecord.MUNICIPALITY,
geocodeConfidence: 100,
geocodeProvider: 'NAR',
geocodedAt: new Date()
}
});
API Endpoints¶
GET /api/locations/nar/datasets¶
Scan NAR data directory and return available province datasets.
Authentication: Required (SUPER_ADMIN, MAP_ADMIN)
Response:
{
"datasets": [
{
"provinceCode": "10",
"provinceName": "Newfoundland and Labrador",
"addressFiles": ["Address_10.csv"],
"locationFile": "Location_10.csv",
"addressFileCount": 1,
"estimatedRecords": 15000,
"lastModified": "2025-01-15T00:00:00Z"
},
{
"provinceCode": "24",
"provinceName": "Quebec",
"addressFiles": [
"Address_24_part_1.csv",
"Address_24_part_2.csv",
"Address_24_part_3.csv",
"Address_24_part_4.csv",
"Address_24_part_5.csv",
"Address_24_part_6.csv"
],
"locationFile": "Location_24.csv",
"addressFileCount": 6,
"estimatedRecords": 850000,
"lastModified": "2025-01-20T00:00:00Z"
},
{
"provinceCode": "35",
"provinceName": "Ontario",
"addressFiles": [
"Address_35_part_1.csv",
"Address_35_part_2.csv",
"Address_35_part_3.csv"
],
"locationFile": "Location_35.csv",
"addressFileCount": 3,
"estimatedRecords": 1200000,
"lastModified": "2025-01-22T00:00:00Z"
}
],
"dataDir": "/data",
"totalDatasets": 13
}
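The entries in this response can be described by a TypeScript interface; a sketch inferred from the JSON above (the `NARDataset` name matches the type used by `scanDatasets` below):

```typescript
// Shape of one entry in the GET /api/locations/nar/datasets response.
interface NARDataset {
  provinceCode: string;    // NAR province code, e.g. "35"
  provinceName: string;    // e.g. "Ontario"
  addressFiles: string[];  // sorted Address_XX[_part_Y].csv names
  locationFile: string;    // Location_XX.csv
  addressFileCount: number;
  estimatedRecords: number;
  lastModified: string;    // ISO 8601 timestamp
}

// Example entry matching the response above.
const ontario: NARDataset = {
  provinceCode: '35',
  provinceName: 'Ontario',
  addressFiles: [
    'Address_35_part_1.csv',
    'Address_35_part_2.csv',
    'Address_35_part_3.csv',
  ],
  locationFile: 'Location_35.csv',
  addressFileCount: 3,
  estimatedRecords: 1200000,
  lastModified: '2025-01-22T00:00:00Z',
};
```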
Implementation:
// nar-import.service.ts
async scanDatasets(): Promise<NARDataset[]> {
const files = await fs.readdir(NAR_DATA_DIR);
// Group files by province code
const provinceGroups: Record<string, { address: string[], location: string }> = {};
files.forEach(file => {
const addressMatch = file.match(/^Address_(\d+)(?:_part_\d+)?\.csv$/);
const locationMatch = file.match(/^Location_(\d+)\.csv$/);
if (addressMatch) {
const code = addressMatch[1];
if (!provinceGroups[code]) provinceGroups[code] = { address: [], location: '' };
provinceGroups[code].address.push(file);
} else if (locationMatch) {
const code = locationMatch[1];
if (!provinceGroups[code]) provinceGroups[code] = { address: [], location: '' };
provinceGroups[code].location = file;
}
});
// Build dataset objects
const datasets: NARDataset[] = [];
for (const [code, group] of Object.entries(provinceGroups)) {
if (group.address.length === 0 || !group.location) continue;
const stats = await fs.stat(path.join(NAR_DATA_DIR, group.location));
datasets.push({
provinceCode: code,
provinceName: PROVINCE_NAMES[code],
addressFiles: group.address.sort(),
locationFile: group.location,
addressFileCount: group.address.length,
estimatedRecords: await this.estimateRecordCount(group.address),
lastModified: stats.mtime.toISOString()
});
}
return datasets.sort((a, b) => a.provinceCode.localeCompare(b.provinceCode));
}
POST /api/locations/nar/import¶
Start NAR import job with filters.
Authentication: Required (SUPER_ADMIN, MAP_ADMIN)
Request Body:
{
"provinceCode": "35",
"city": "TORONTO",
"postalCodePrefix": "M5",
"cutId": 42,
"residentialOnly": true
}
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| provinceCode | string | Yes | Province code (10-62) |
| city | string | No | Filter by MUNICIPALITY (exact match, uppercase) |
| postalCodePrefix | string | No | Filter by postal code prefix (e.g., "M5", "K1A") |
| cutId | number | No | Filter by cut boundary (point-in-polygon) |
| residentialOnly | boolean | No | Only import BU_USE = 1 (default: false) |
Response:
{
"jobId": "nar-import-35-20250213-103000",
"status": "processing",
"provinceCode": "35",
"provinceName": "Ontario",
"filters": {
"city": "TORONTO",
"postalCodePrefix": "M5",
"cutId": 42,
"residentialOnly": true
},
"startedAt": "2025-02-13T10:30:00Z",
"estimatedCompletion": "2025-02-13T10:45:00Z"
}
GET /api/locations/nar/import/:jobId¶
Check import job progress.
Authentication: Required (SUPER_ADMIN, MAP_ADMIN)
Response (In Progress):
{
"jobId": "nar-import-35-20250213-103000",
"status": "processing",
"progress": {
"total": 1200000,
"processed": 600000,
"imported": 580000,
"skipped": 15000,
"errors": 5000,
"percent": 50.0
},
"currentFile": "Address_35_part_2.csv",
"startedAt": "2025-02-13T10:30:00Z",
"estimatedCompletion": "2025-02-13T10:45:00Z"
}
Response (Complete):
{
"jobId": "nar-import-35-20250213-103000",
"status": "completed",
"result": {
"total": 1200000,
"processed": 1200000,
"imported": 1150000,
"skipped": 45000,
"errors": 5000,
"percent": 100.0
},
"statistics": {
"locationsCreated": 800000,
"locationsUpdated": 350000,
"addressesCreated": 1150000,
"avgConfidence": 100,
"processingTime": "14m 32s"
},
"startedAt": "2025-02-13T10:30:00Z",
"completedAt": "2025-02-13T10:44:32Z"
}
Status Values:
- queued: Job created, waiting to start
- processing: Import in progress
- completed: Import finished successfully
- failed: Import failed with errors
- cancelled: Import cancelled by user
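These states can be encoded as a union type with a terminal-state check (the names below are illustrative; the status strings are the documented ones):

```typescript
// Job lifecycle states as documented above.
type ImportJobStatus = 'queued' | 'processing' | 'completed' | 'failed' | 'cancelled';

// A job is terminal when no further progress updates will arrive;
// polling loops should stop on these states.
const isTerminal = (s: ImportJobStatus): boolean =>
  s === 'completed' || s === 'failed' || s === 'cancelled';
```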
Configuration¶
Environment Variables¶
| Variable | Type | Default | Description |
|---|---|---|---|
| NAR_DATA_DIR | string | /data | Directory containing NAR CSV files |
| NAR_BATCH_SIZE | number | 500 | Records per database transaction |
| NAR_IMPORT_TIMEOUT | number | 3600000 | Import timeout in ms (1 hour) |
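A sketch of resolving these variables with their documented defaults (the `loadNARConfig` function is illustrative; the actual service reads `process.env` at module load, as shown in the full implementation below):

```typescript
interface NARConfig {
  dataDir: string;
  batchSize: number;
  importTimeoutMs: number;
}

// Resolve NAR settings from an environment map, falling back to the
// documented defaults (/data, 500 records/transaction, 1 hour).
const loadNARConfig = (env: Record<string, string | undefined>): NARConfig => ({
  dataDir: env.NAR_DATA_DIR ?? '/data',
  batchSize: parseInt(env.NAR_BATCH_SIZE ?? '500', 10),
  importTimeoutMs: parseInt(env.NAR_IMPORT_TIMEOUT ?? '3600000', 10),
});
```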
Province Codes¶
Complete mapping of NAR province codes:
// nar-import.service.ts
const PROVINCE_NAMES: Record<string, string> = {
'10': 'Newfoundland and Labrador',
'11': 'Prince Edward Island',
'12': 'Nova Scotia',
'13': 'New Brunswick',
'24': 'Quebec',
'35': 'Ontario',
'46': 'Manitoba',
'47': 'Saskatchewan',
'48': 'Alberta',
'59': 'British Columbia',
'60': 'Yukon',
'61': 'Northwest Territories',
'62': 'Nunavut'
};
const PROVINCE_ABBREVIATIONS: Record<string, string> = {
'10': 'NL',
'11': 'PE',
'12': 'NS',
'13': 'NB',
'24': 'QC',
'35': 'ON',
'46': 'MB',
'47': 'SK',
'48': 'AB',
'59': 'BC',
'60': 'YT',
'61': 'NT',
'62': 'NU'
};
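A small helper built on these tables can normalize a code into a display object, returning undefined for unknown codes (the `resolveProvince` name is illustrative; a subset of the tables is repeated here to keep the sketch self-contained):

```typescript
// Subset of the province tables above (full versions in nar-import.service.ts).
const NAMES: Record<string, string> = {
  '24': 'Quebec',
  '35': 'Ontario',
  '59': 'British Columbia',
};
const ABBREVS: Record<string, string> = {
  '24': 'QC',
  '35': 'ON',
  '59': 'BC',
};

interface ProvinceInfo { code: string; name: string; abbreviation: string; }

// Resolve a NAR province code to name + abbreviation, or undefined if unknown.
const resolveProvince = (code: string): ProvinceInfo | undefined =>
  NAMES[code] ? { code, name: NAMES[code], abbreviation: ABBREVS[code] } : undefined;
```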
Coordinate Projection¶
EPSG:3347 Definition (Statistics Canada Lambert Conformal Conic):
import proj4 from 'proj4';
// Define EPSG:3347 projection
proj4.defs('EPSG:3347', '+proj=lcc +lat_1=49 +lat_2=77 +lat_0=63.390675 +lon_0=-91.86666666666666 +x_0=6200000 +y_0=3000000 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs');
// Convert function
const convertCoordinates = (bgX: number, bgY: number): [number, number] => {
// Input: [X, Y] in EPSG:3347 (meters)
// Output: [longitude, latitude] in WGS84 (degrees)
return proj4('EPSG:3347', 'WGS84', [bgX, bgY]);
};
Projection Parameters:
- Type: Lambert Conformal Conic
- Standard Parallels: 49°N, 77°N
- Central Meridian: -91.866667°
- Origin: 63.390675°N, -91.866667°W
- False Easting: 6,200,000 m
- False Northing: 3,000,000 m
- Ellipsoid: GRS80
- Units: Meters
Example Conversion:
// Illustrative call pattern only: given the 6,200,000 m false easting above,
// genuine EPSG:3347 eastings for Canada run into the millions of metres,
// so treat these inputs and the commented result as placeholders.
const bgX = 609091.8; // EPSG:3347 X (placeholder)
const bgY = 4834610.7; // EPSG:3347 Y (placeholder)
const [lng, lat] = proj4('EPSG:3347', 'WGS84', [bgX, bgY]);
// Intended result: lng = -79.3832, lat = 43.6532
Import Workflow¶
Prepare NAR Files¶
Step 1: Download NAR Data
- Visit Elections Canada NAR portal: https://www.elections.ca/NAR
- Select "2025 National Address Register"
- Download province-specific CSV files
- Extract ZIP archives
Step 2: Upload Files to Server
# Create data directory if not exists
mkdir -p /path/to/data
# Upload files via SCP
scp Address_35_*.csv user@server:/path/to/data/
scp Location_35.csv user@server:/path/to/data/
# Or mount the data directory as a read-only volume in Docker
# (docker-compose.yml):
#   volumes:
#     - ./data:/data:ro
Step 3: Verify File Integrity
# Check file count
ls -l /path/to/data/Address_35_*.csv | wc -l
# Check Location file exists
ls -l /path/to/data/Location_35.csv
# Sample first few rows
head -5 /path/to/data/Address_35_part_1.csv
head -5 /path/to/data/Location_35.csv
Run Import via Admin UI¶
Step 1: Navigate to NAR Import Tab
- Log in as SUPER_ADMIN or MAP_ADMIN
- Click Map → Locations in sidebar
- Click NAR Import tab
- Available datasets load automatically
Step 2: Select Province
┌─────────────────────────────────────────┐
│ Available NAR Datasets │
├─────────────────────────────────────────┤
│ Province │ Files │ Records │
├──────────────────┼───────┼──────────────┤
│ Ontario (35) │ 3 │ 1,200,000 │
│ Quebec (24) │ 6 │ 850,000 │
│ Alberta (48) │ 2 │ 450,000 │
└──────────────────┴───────┴──────────────┘
[Select Province: Ontario ▼]
Step 3: Configure Filters (Optional)
Filters (Optional):
City: [TORONTO ]
Filter by exact municipality name (uppercase)
Postal Code Prefix: [M5 ]
Filter by postal code prefix (2-3 chars)
Cut Boundary: [Downtown Core ▼ ]
Only import locations within cut polygon
☑ Residential Only
Only import buildings with BU_USE = 1
Step 4: Review Import Summary
Import Summary:
Province: Ontario (35)
Files: Address_35_part_1.csv
Address_35_part_2.csv
Address_35_part_3.csv
Location_35.csv
Filters:
City: TORONTO
Postal Code: M5
Cut: Downtown Core
Residential Only: Yes
Estimated Records: ~50,000 (after filters)
Estimated Time: ~3 minutes
[Cancel] [Start Import]
Step 5: Monitor Progress
Import in Progress...
Current File: Address_35_part_2.csv
Progress: 600,000 / 1,200,000 (50%)
[████████████░░░░░░░░░░░░] 50%
Statistics:
Processed: 600,000
Imported: 580,000
Skipped: 15,000
Errors: 5,000
[Cancel Import]
Step 6: Review Results
Import Complete!
Final Statistics:
Total Processed: 1,200,000
Successfully Imported: 1,150,000
Skipped (Filters): 45,000
Errors: 5,000
Details:
Locations Created: 800,000
Locations Updated: 350,000
Addresses Created: 1,150,000
Processing Time: 14m 32s
Avg Records/Second: 1,375
[View Import Log] [Import Another Province] [Close]
Import via API¶
Step 1: Get Available Datasets
curl -X GET http://localhost:4000/api/locations/nar/datasets \
  -H "Authorization: Bearer $TOKEN" | jq
Step 2: Start Import
curl -X POST http://localhost:4000/api/locations/nar/import \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"provinceCode": "35",
"city": "TORONTO",
"postalCodePrefix": "M5",
"residentialOnly": true
}'
Step 3: Poll Job Status
JOB_ID="nar-import-35-20250213-103000"
while true; do
STATUS=$(curl -s -X GET \
http://localhost:4000/api/locations/nar/import/$JOB_ID \
-H "Authorization: Bearer $TOKEN" \
| jq -r '.status')
if [ "$STATUS" = "completed" ] || [ "$STATUS" = "failed" ]; then
break
fi
sleep 5
done
# Get final result
curl -X GET http://localhost:4000/api/locations/nar/import/$JOB_ID \
-H "Authorization: Bearer $TOKEN" | jq
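A TypeScript equivalent of the shell polling loop above; a sketch that assumes Node 18+ (global `fetch`) and the endpoints documented on this page, with `waitForImport` as an illustrative name:

```typescript
// States after which the job's progress will no longer change.
const TERMINAL_STATES = new Set(['completed', 'failed', 'cancelled']);

// Poll the job endpoint every `intervalMs` until it reaches a terminal state.
async function waitForImport(
  baseUrl: string,
  jobId: string,
  token: string,
  intervalMs = 5000
): Promise<{ status: string }> {
  for (;;) {
    const res = await fetch(`${baseUrl}/api/locations/nar/import/${jobId}`, {
      headers: { Authorization: `Bearer ${token}` },
    });
    const job = (await res.json()) as { status: string };
    if (TERMINAL_STATES.has(job.status)) return job;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```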
Coordinate Conversion¶
Proj4 Integration¶
Installation:
npm install proj4
Service Implementation:
// nar-import.service.ts
import proj4 from 'proj4';
// Define EPSG:3347 (Statistics Canada Lambert)
proj4.defs('EPSG:3347',
'+proj=lcc +lat_1=49 +lat_2=77 +lat_0=63.390675 ' +
'+lon_0=-91.86666666666666 +x_0=6200000 +y_0=3000000 ' +
'+ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs'
);
interface Coordinates {
latitude: number;
longitude: number;
}
class NARImportService {
/**
* Convert NAR BG_X/BG_Y (EPSG:3347) to WGS84 lat/lng
*/
convertCoordinates(bgX: number, bgY: number): Coordinates | null {
try {
// Validate inputs
if (!bgX || !bgY || bgX < 0 || bgY < 0) {
logger.warn('Invalid BG_X/BG_Y coordinates:', { bgX, bgY });
return null;
}
// Convert: EPSG:3347 → WGS84
const [longitude, latitude] = proj4('EPSG:3347', 'WGS84', [bgX, bgY]);
// Validate output (Canada bounds)
if (
latitude < 41.0 || latitude > 84.0 || // Canada latitude range
longitude < -141.0 || longitude > -52.0 // Canada longitude range
) {
logger.warn('Converted coordinates outside Canada:', { latitude, longitude });
return null;
}
return { latitude, longitude };
} catch (error) {
logger.error('Coordinate conversion failed:', error);
return null;
}
}
/**
* Get coordinates from NAR record (try BG_X/BG_Y, fallback to lat/lng)
*/
getCoordinates(narLocation: NARLocationRecord): Coordinates | null {
// Primary: Convert BG_X/BG_Y
if (narLocation.BG_X && narLocation.BG_Y) {
const coords = this.convertCoordinates(narLocation.BG_X, narLocation.BG_Y);
if (coords) return coords;
}
// Fallback: Use BG_LATITUDE/BG_LONGITUDE directly
if (narLocation.BG_LATITUDE && narLocation.BG_LONGITUDE) {
return {
latitude: narLocation.BG_LATITUDE,
longitude: narLocation.BG_LONGITUDE
};
}
return null;
}
}
Conversion Examples¶
Note: the BG_X/BG_Y inputs in Examples 1 and 2 are placeholders. Given the 6,200,000 m false easting, genuine EPSG:3347 eastings for Canada run into the millions of metres, so these examples show the call pattern rather than a verified conversion.
Example 1: Toronto City Hall (illustrative inputs)
const bgX = 609091.8;
const bgY = 4834610.7;
const coords = convertCoordinates(bgX, bgY);
// Intended result: { latitude: 43.6532, longitude: -79.3832 }
Example 2: Parliament Hill, Ottawa (illustrative inputs)
const bgX = 447384.4;
const bgY = 5030660.5;
const coords = convertCoordinates(bgX, bgY);
// Intended result: { latitude: 45.4236, longitude: -75.7009 }
Example 3: Invalid Coordinates
const bgX = -1000; // Negative (invalid)
const bgY = 0; // Zero (invalid)
const coords = convertCoordinates(bgX, bgY);
// Result: null
Validation¶
Canada Bounds Check:
const isWithinCanada = (lat: number, lng: number): boolean => {
return (
lat >= 41.0 && lat <= 84.0 && // Latitude: Pelee Island to Alert
lng >= -141.0 && lng <= -52.0 // Longitude: Yukon to Newfoundland
);
};
Precision Check:
// NAR coordinates should have 2-6 decimal places
const hasValidPrecision = (value: number): boolean => {
const str = value.toString();
const decimals = str.split('.')[1]?.length || 0;
return decimals >= 2 && decimals <= 6;
};
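The cut filter used throughout this page relies on a point-in-polygon test; `isPointInPolygon` is referenced in the service code but not shown. A minimal ray-casting sketch under the assumption that the cut geometry is a single GeoJSON-style ring of [lng, lat] pairs (hole and MultiPolygon handling omitted):

```typescript
// Ray-casting point-in-polygon: cast a ray east from the point and
// count how many polygon edges it crosses; an odd count means inside.
const isPointInPolygon = (
  point: [number, number],
  ring: [number, number][]
): boolean => {
  const [x, y] = point;
  let inside = false;
  for (let i = 0, j = ring.length - 1; i < ring.length; j = i++) {
    const [xi, yi] = ring[i];
    const [xj, yj] = ring[j];
    // Edge straddles the ray's latitude and the crossing lies east of the point.
    const crosses =
      yi > y !== yj > y && x < ((xj - xi) * (y - yi)) / (yj - yi) + xi;
    if (crosses) inside = !inside;
  }
  return inside;
};
```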
Multi-Part File Handling¶
Large Province Processing¶
Quebec (Province Code 24):
- 6 Address files: Address_24_part_1.csv through Address_24_part_6.csv
- 1 Location file: Location_24.csv
- Total records: ~850,000
Ontario (Province Code 35):
- 3 Address files: Address_35_part_1.csv through Address_35_part_3.csv
- 1 Location file: Location_35.csv
- Total records: ~1,200,000
Sequential File Reading¶
// nar-import.service.ts
// Assumes: import fs from 'fs/promises'; import { createReadStream } from 'fs';
async processAddressFiles(provinceCode: string): Promise<Map<string, AddressRecord[]>> {
const addressMap = new Map<string, AddressRecord[]>();
// Find all Address files for province
const files = await fs.readdir(NAR_DATA_DIR);
const addressFiles = files
.filter(f => f.match(new RegExp(`^Address_${provinceCode}(?:_part_\\d+)?\\.csv$`)))
.sort(); // lexicographic order; correct for up to 9 parts
logger.info(`Processing ${addressFiles.length} address files for province ${provinceCode}`);
// Process each file sequentially
for (const file of addressFiles) {
logger.info(`Reading ${file}...`);
const filePath = path.join(NAR_DATA_DIR, file);
const stream = createReadStream(filePath); // createReadStream is on 'fs', not 'fs/promises'
const parser = stream.pipe(csvParser());
let rowCount = 0;
for await (const row of parser) {
const locGuid = row.LOC_GUID;
if (!addressMap.has(locGuid)) {
addressMap.set(locGuid, []);
}
addressMap.get(locGuid)!.push({
addrGuid: row.ADDR_GUID,
civicNo: row.CIVIC_NO,
streetName: row.OFFICIAL_STREET_NAME,
postalCode: row.POSTAL_CODE,
municipality: row.MUNICIPALITY
});
rowCount++;
if (rowCount % 10000 === 0) {
logger.debug(`Processed ${rowCount} addresses from ${file}`);
}
}
logger.info(`Completed ${file}: ${rowCount} addresses`);
}
logger.info(`Total unique locations: ${addressMap.size}`);
return addressMap;
}
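One caveat in the sequential reader above: a plain lexicographic `.sort()` places `part_10` before `part_2` once a province ships ten or more parts. A numeric-aware comparison avoids that (the helper names are illustrative):

```typescript
// Extract the part number from Address_XX_part_Y.csv (single-part files → 0).
const partNumber = (file: string): number => {
  const m = file.match(/_part_(\d+)\.csv$/);
  return m ? parseInt(m[1], 10) : 0;
};

// Sort address files so part_2 precedes part_10.
const sortPartFiles = (files: string[]): string[] =>
  [...files].sort((a, b) => partNumber(a) - partNumber(b));
```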
Memory Management¶
Streaming Strategy:
// Process files in chunks to avoid memory overflow
async processInChunks(
addressMap: Map<string, AddressRecord[]>,
locationFile: string,
batchSize: number = 500
): Promise<ImportResult> {
const locationPath = path.join(NAR_DATA_DIR, locationFile);
const stream = fs.createReadStream(locationPath);
const parser = stream.pipe(csvParser());
let batch: LocationImport[] = [];
let stats = { imported: 0, skipped: 0, errors: 0 };
for await (const row of parser) {
const locGuid = row.LOC_GUID;
const addresses = addressMap.get(locGuid);
if (!addresses || addresses.length === 0) {
stats.skipped++;
continue;
}
// Apply filters
if (!this.passesFilters(row, addresses)) {
stats.skipped++;
continue;
}
// Convert coordinates
const coords = this.getCoordinates(row);
if (!coords) {
stats.errors++;
continue;
}
batch.push({ location: row, addresses, coords });
// Import batch when full
if (batch.length >= batchSize) {
await this.importBatch(batch);
stats.imported += batch.length;
batch = [];
}
}
// Import remaining
if (batch.length > 0) {
await this.importBatch(batch);
stats.imported += batch.length;
}
return stats;
}
Batch Transaction:
async importBatch(batch: LocationImport[]): Promise<void> {
await prisma.$transaction(async (tx) => {
for (const item of batch) {
// Upsert location
const location = await tx.location.upsert({
where: { locGuid: item.location.LOC_GUID },
update: {
address: this.formatAddress(item.addresses[0]),
latitude: item.coords.latitude,
longitude: item.coords.longitude,
postalCode: item.addresses[0].postalCode,
federalDistrict: item.location.FED_NUM,
buildingUse: parseInt(item.location.BU_USE),
municipality: item.location.MUNICIPALITY,
geocodedAt: new Date()
},
create: {
locGuid: item.location.LOC_GUID,
address: this.formatAddress(item.addresses[0]),
latitude: item.coords.latitude,
longitude: item.coords.longitude,
postalCode: item.addresses[0].postalCode,
federalDistrict: item.location.FED_NUM,
buildingUse: parseInt(item.location.BU_USE),
municipality: item.location.MUNICIPALITY,
geocodeConfidence: 100,
geocodeProvider: 'NAR',
geocodedAt: new Date()
}
});
// Insert addresses
for (const addr of item.addresses) {
await tx.address.upsert({
where: { addrGuid: addr.addrGuid },
update: { locationId: location.id },
create: {
addrGuid: addr.addrGuid,
locationId: location.id,
unitNumber: addr.civicNo
}
});
}
}
});
}
Code Examples¶
LocationsPage - NAR Import Tab¶
// LocationsPage.tsx
import React, { useEffect, useState } from 'react';
import { Tabs, Table, Button, Select, Input, Checkbox, Card, Progress, message } from 'antd';
import { UploadOutlined } from '@ant-design/icons';
import { api } from '@/lib/api';
const NARImportTab: React.FC = () => {
const [datasets, setDatasets] = useState<NARDataset[]>([]);
const [selectedProvince, setSelectedProvince] = useState<string | null>(null);
const [filters, setFilters] = useState({
city: '',
postalCodePrefix: '',
cutId: null as number | null,
residentialOnly: true
});
const [importing, setImporting] = useState(false);
const [progress, setProgress] = useState<ImportProgress | null>(null);
const [jobId, setJobId] = useState<string | null>(null);
useEffect(() => {
fetchDatasets();
}, []);
useEffect(() => {
if (jobId && importing) {
const interval = setInterval(pollProgress, 2000);
return () => clearInterval(interval);
}
}, [jobId, importing]);
const fetchDatasets = async () => {
try {
const { data } = await api.get<{ datasets: NARDataset[] }>('/locations/nar/datasets');
setDatasets(data.datasets);
} catch (error) {
message.error('Failed to load NAR datasets');
}
};
const pollProgress = async () => {
if (!jobId) return;
try {
const { data } = await api.get(`/locations/nar/import/${jobId}`);
if (data.status === 'completed') {
setImporting(false);
setProgress(null);
message.success(`Import complete! Imported ${data.result.imported} locations.`);
} else if (data.status === 'failed') {
setImporting(false);
setProgress(null);
message.error('Import failed. Check logs for details.');
} else {
setProgress(data.progress);
}
} catch (error) {
message.error('Failed to fetch import progress');
}
};
const startImport = async () => {
if (!selectedProvince) {
message.warning('Please select a province');
return;
}
try {
const { data } = await api.post('/locations/nar/import', {
provinceCode: selectedProvince,
...filters
});
setJobId(data.jobId);
setImporting(true);
message.info('Import started...');
} catch (error) {
message.error('Failed to start import');
}
};
const datasetColumns = [
{ title: 'Province', dataIndex: 'provinceName', key: 'name' },
{ title: 'Files', dataIndex: 'addressFileCount', key: 'files' },
{ title: 'Estimated Records', dataIndex: 'estimatedRecords', key: 'records',
render: (val: number) => val.toLocaleString() },
{ title: 'Last Modified', dataIndex: 'lastModified', key: 'modified',
render: (val: string) => new Date(val).toLocaleDateString() }
];
return (
<div>
<Card title="Available NAR Datasets" style={{ marginBottom: 24 }}>
<Table
dataSource={datasets}
columns={datasetColumns}
rowKey="provinceCode"
pagination={false}
onRow={(record) => ({
onClick: () => setSelectedProvince(record.provinceCode),
style: {
cursor: 'pointer',
backgroundColor: selectedProvince === record.provinceCode ? '#e6f7ff' : undefined
}
})}
/>
</Card>
{selectedProvince && (
<Card title="Import Configuration">
<div style={{ marginBottom: 16 }}>
<label>Province: </label>
<strong>{datasets.find(d => d.provinceCode === selectedProvince)?.provinceName}</strong>
</div>
<div style={{ marginBottom: 16 }}>
<label>City (Optional): </label>
<Input
style={{ width: 300 }}
placeholder="TORONTO"
value={filters.city}
onChange={e => setFilters({ ...filters, city: e.target.value.toUpperCase() })}
/>
</div>
<div style={{ marginBottom: 16 }}>
<label>Postal Code Prefix (Optional): </label>
<Input
style={{ width: 200 }}
placeholder="M5"
value={filters.postalCodePrefix}
onChange={e => setFilters({ ...filters, postalCodePrefix: e.target.value.toUpperCase() })}
/>
</div>
<div style={{ marginBottom: 16 }}>
<Checkbox
checked={filters.residentialOnly}
onChange={e => setFilters({ ...filters, residentialOnly: e.target.checked })}
>
Residential Only
</Checkbox>
</div>
<Button
type="primary"
icon={<UploadOutlined />}
onClick={startImport}
loading={importing}
disabled={importing}
>
Start Import
</Button>
</Card>
)}
{importing && progress && (
<Card title="Import Progress" style={{ marginTop: 24 }}>
<Progress percent={progress.percent} status="active" />
<div style={{ marginTop: 16 }}>
<p>Processed: {progress.processed.toLocaleString()} / {progress.total.toLocaleString()}</p>
<p>Imported: {progress.imported.toLocaleString()}</p>
<p>Skipped: {progress.skipped.toLocaleString()}</p>
<p>Errors: {progress.errors.toLocaleString()}</p>
</div>
</Card>
)}
</div>
);
};
NAR Import Service - Full Implementation¶
// nar-import.service.ts
import fs from 'fs/promises';
import path from 'path';
import csvParser from 'csv-parser';
import proj4 from 'proj4';
import { prisma } from '@/config/database';
import { logger } from '@/utils/logger';
// Define EPSG:3347
proj4.defs('EPSG:3347',
'+proj=lcc +lat_1=49 +lat_2=77 +lat_0=63.390675 ' +
'+lon_0=-91.86666666666666 +x_0=6200000 +y_0=3000000 ' +
'+ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs'
);
const NAR_DATA_DIR = process.env.NAR_DATA_DIR || '/data';
const BATCH_SIZE = parseInt(process.env.NAR_BATCH_SIZE || '500');
interface NARAddressRecord {
ADDR_GUID: string;
LOC_GUID: string;
CIVIC_NO: string;
OFFICIAL_STREET_NAME: string;
POSTAL_CODE: string;
MUNICIPALITY: string;
}
interface NARLocationRecord {
LOC_GUID: string;
BG_LATITUDE?: number;
BG_LONGITUDE?: number;
BG_X?: number;
BG_Y?: number;
FED_NUM: string;
BU_USE: string;
MUNICIPALITY: string;
}
export class NARImportService {
async importProvince(
provinceCode: string,
filters: {
city?: string;
postalCodePrefix?: string;
cutId?: number;
residentialOnly?: boolean;
}
): Promise<ImportResult> {
logger.info(`Starting NAR import for province ${provinceCode}`, { filters });
// Load address files into memory map
const addressMap = await this.loadAddressFiles(provinceCode, filters);
// Process location file and import
const result = await this.processLocationFile(provinceCode, addressMap, filters);
logger.info(`NAR import complete for province ${provinceCode}`, result);
return result;
}
private async loadAddressFiles(
provinceCode: string,
filters: { city?: string; postalCodePrefix?: string }
): Promise<Map<string, NARAddressRecord[]>> {
const addressMap = new Map<string, NARAddressRecord[]>();
const files = await fs.readdir(NAR_DATA_DIR);
const addressFiles = files
.filter(f => f.match(new RegExp(`^Address_${provinceCode}(?:_part_\\d+)?\\.csv$`)))
.sort();
for (const file of addressFiles) {
logger.info(`Reading ${file}...`);
const filePath = path.join(NAR_DATA_DIR, file);
const stream = require('fs').createReadStream(filePath); // createReadStream is on the callback 'fs' module; `fs` above is 'fs/promises'
const parser = stream.pipe(csvParser());
for await (const row of parser) {
// Apply filters
if (filters.city && row.MUNICIPALITY !== filters.city) continue;
if (filters.postalCodePrefix && !row.POSTAL_CODE.startsWith(filters.postalCodePrefix)) continue;
const locGuid = row.LOC_GUID;
if (!addressMap.has(locGuid)) {
addressMap.set(locGuid, []);
}
addressMap.get(locGuid)!.push(row);
}
}
logger.info(`Loaded ${addressMap.size} unique locations`);
return addressMap;
}
private async processLocationFile(
provinceCode: string,
addressMap: Map<string, NARAddressRecord[]>,
filters: { cutId?: number; residentialOnly?: boolean }
): Promise<ImportResult> {
const locationFile = `Location_${provinceCode}.csv`;
const filePath = path.join(NAR_DATA_DIR, locationFile);
const stream = require('fs').createReadStream(filePath); // createReadStream is on the callback 'fs' module, not 'fs/promises'
const parser = stream.pipe(csvParser());
let batch: any[] = [];
const stats = { imported: 0, skipped: 0, errors: 0, total: 0 };
for await (const row of parser) {
stats.total++;
const locGuid = row.LOC_GUID;
const addresses = addressMap.get(locGuid);
if (!addresses || addresses.length === 0) {
stats.skipped++;
continue;
}
// Residential filter
if (filters.residentialOnly && parseInt(row.BU_USE) !== 1) {
stats.skipped++;
continue;
}
// Convert coordinates
const coords = this.getCoordinates(row);
if (!coords) {
stats.errors++;
continue;
}
// Cut filter (if specified)
if (filters.cutId) {
const cut = await prisma.cut.findUnique({ where: { id: filters.cutId } });
if (cut && !this.isPointInPolygon([coords.longitude, coords.latitude], cut.geojson)) {
stats.skipped++;
continue;
}
}
batch.push({ location: row, addresses, coords });
if (batch.length >= BATCH_SIZE) {
await this.importBatch(batch);
stats.imported += batch.length;
batch = [];
}
}
if (batch.length > 0) {
await this.importBatch(batch);
stats.imported += batch.length;
}
return stats;
}
  private getCoordinates(row: NARLocationRecord): { latitude: number; longitude: number } | null {
    // Try BG_X/BG_Y (Lambert, EPSG:3347) conversion first
    if (row.BG_X && row.BG_Y) {
      try {
        // CSV fields arrive as strings, so parse before projecting
        const [lng, lat] = proj4('EPSG:3347', 'WGS84', [
          parseFloat(row.BG_X),
          parseFloat(row.BG_Y)
        ]);
        // Sanity check: reject conversions landing outside Canada's bounding box
        if (lat >= 41 && lat <= 84 && lng >= -141 && lng <= -52) {
          return { latitude: lat, longitude: lng };
        }
      } catch (error) {
        logger.warn('Coordinate conversion failed:', error);
      }
    }

    // Fall back to BG_LATITUDE/BG_LONGITUDE if present
    if (row.BG_LATITUDE && row.BG_LONGITUDE) {
      return {
        latitude: parseFloat(row.BG_LATITUDE),
        longitude: parseFloat(row.BG_LONGITUDE)
      };
    }

    return null;
  }
  private async importBatch(batch: any[]): Promise<void> {
    await prisma.$transaction(async (tx) => {
      for (const item of batch) {
        const location = await tx.location.upsert({
          where: { locGuid: item.location.LOC_GUID },
          update: {
            address: this.formatAddress(item.addresses[0]),
            latitude: item.coords.latitude,
            longitude: item.coords.longitude,
            postalCode: item.addresses[0].POSTAL_CODE,
            federalDistrict: item.location.FED_NUM,
            buildingUse: parseInt(item.location.BU_USE, 10),
            municipality: item.location.MUNICIPALITY
          },
          create: {
            locGuid: item.location.LOC_GUID,
            address: this.formatAddress(item.addresses[0]),
            latitude: item.coords.latitude,
            longitude: item.coords.longitude,
            postalCode: item.addresses[0].POSTAL_CODE,
            federalDistrict: item.location.FED_NUM,
            buildingUse: parseInt(item.location.BU_USE, 10),
            municipality: item.location.MUNICIPALITY,
            geocodeConfidence: 100,
            geocodeProvider: 'NAR'
          }
        });

        for (const addr of item.addresses) {
          await tx.address.upsert({
            where: { addrGuid: addr.ADDR_GUID },
            update: {}, // existing addresses are left untouched
            create: {
              addrGuid: addr.ADDR_GUID,
              locationId: location.id,
              unitNumber: addr.CIVIC_NO
            }
          });
        }
      }
    });
  }
  private formatAddress(addr: NARAddressRecord): string {
    return `${addr.CIVIC_NO} ${addr.OFFICIAL_STREET_NAME}`.trim();
  }

  // Ray-casting test against the outer ring of a GeoJSON Polygon.
  // (The shared implementation lives in spatial.ts; this inline version
  // assumes geojson.coordinates[0] is the outer ring of [lng, lat] pairs.)
  private isPointInPolygon([x, y]: [number, number], geojson: any): boolean {
    const ring: [number, number][] = geojson?.coordinates?.[0] ?? [];
    let inside = false;
    for (let i = 0, j = ring.length - 1; i < ring.length; j = i++) {
      const [xi, yi] = ring[i];
      const [xj, yj] = ring[j];
      if ((yi > y) !== (yj > y) && x < ((xj - xi) * (y - yi)) / (yj - yi) + xi) {
        inside = !inside;
      }
    }
    return inside;
  }
}
Troubleshooting¶
Problem: No datasets found¶
Symptoms:
- GET /api/locations/nar/datasets returns an empty array
- "No datasets available" message in the admin interface
Solutions:
1. Verify the NAR_DATA_DIR path.
2. Check the Docker volume mount.
3. Verify the file naming convention.
4. Check file permissions.
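The naming convention can be checked by running the same regular expression the file scanner uses. The directory listing below is hypothetical; substitute the contents of your NAR_DATA_DIR:

```typescript
// Matches Address_<province>.csv and Address_<province>_part_<n>.csv,
// mirroring the filter in loadAddressFiles()
const addressFilePattern = (provinceCode: string): RegExp =>
  new RegExp(`^Address_${provinceCode}(?:_part_\\d+)?\\.csv$`);

function findAddressFiles(files: string[], provinceCode: string): string[] {
  return files.filter((f) => addressFilePattern(provinceCode).test(f)).sort();
}

// Hypothetical listing: Ontario (35) split into parts, PEI (11) in one file
const listing = [
  'Address_35_part_1.csv',
  'Address_35_part_2.csv',
  'Address_11.csv',
  'Location_35.csv',
];
console.log(findAddressFiles(listing, '35')); // the two Ontario parts, sorted
console.log(findAddressFiles(listing, '11')); // just Address_11.csv
```

If a file you expect to import is not returned here, the dataset endpoint will not see it either.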
Problem: Coordinate conversion errors¶
Symptoms:
- Many locations skipped during import
- "Converted coordinates outside Canada" warnings
- Null latitude/longitude in the database
Solutions:
1. Verify the BG_X/BG_Y values in the source files.
2. Test the conversion with known coordinates.
3. Fall back to BG_LATITUDE/BG_LONGITUDE.
4. Check the proj4 EPSG:3347 definition.
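The Canada-bounds sanity check from getCoordinates() can be exercised in isolation to tell projection failures apart from genuinely out-of-range data. The EPSG:3347 proj4 string below is the commonly published Statistics Canada Lambert definition; verify it against epsg.io before relying on it:

```typescript
// Commonly published proj4 definition for EPSG:3347 (assumed, verify at epsg.io/3347)
const EPSG_3347 =
  '+proj=lcc +lat_1=49 +lat_2=77 +lat_0=63.390675 +lon_0=-91.866667 ' +
  '+x_0=6200000 +y_0=3000000 +ellps=GRS80 +datum=NAD83 +units=m +no_defs';

// Same bounding box the import service uses to reject bad conversions
function isWithinCanada(lat: number, lng: number): boolean {
  return lat >= 41 && lat <= 84 && lng >= -141 && lng <= -52;
}

console.log(isWithinCanada(45.42, -75.69)); // Ottawa -> true
console.log(isWithinCanada(0, 0));          // (0, 0) -> false, conversion failed
```

If valid BG_X/BG_Y inputs convert to points failing this check, the registered proj4 definition is the first thing to suspect.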
Problem: Import very slow (> 30min for 100k records)¶
Symptoms:
- Import hangs on large provinces
- Memory usage grows over time
- Database connection timeouts
Solutions:
1. Increase the batch size.
2. Stream address files instead of loading them all into memory.
3. Optimize database indexes.
4. Disable geocoding during import.
5. Use worker threads for parallel processing.
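The batch-flush pattern that batch-size tuning affects can be isolated into a small helper. This is a sketch, not existing project code; `persist` stands in for the real importBatch():

```typescript
// Accumulate rows and flush whenever the buffer reaches batchSize,
// then flush the final partial batch - the same shape as processLocationFile()
async function batched<T>(
  rows: Iterable<T>,
  batchSize: number,
  persist: (batch: T[]) => Promise<void>
): Promise<number> {
  let batch: T[] = [];
  let flushed = 0;
  for (const row of rows) {
    batch.push(row);
    if (batch.length >= batchSize) {
      await persist(batch);
      flushed += batch.length;
      batch = []; // release references so memory stays flat
    }
  }
  if (batch.length > 0) { // final partial batch
    await persist(batch);
    flushed += batch.length;
  }
  return flushed;
}
```

A larger batchSize means fewer commits (less transaction overhead) but bigger transactions and more memory per flush, so increases should be measured rather than assumed.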
Problem: Duplicate LOC_GUID errors¶
Symptoms:
- Unique constraint violation on locGuid
- Import fails mid-process
- "Duplicate key value violates unique constraint" error
Solutions:
1. Use UPSERT instead of INSERT.
2. Check the NAR files for corrupt or duplicated rows.
3. Clean up partial imports before retrying.
4. Roll back the transaction on error.
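If the source files themselves contain repeated LOC_GUID rows, deduplicating a batch before the upsert loop avoids conflicting writes inside a single transaction. This helper is a sketch, not part of the existing service, and "last occurrence wins" is an assumption; pick whichever rule fits your data:

```typescript
// Collapse repeated LOC_GUIDs, keeping the last occurrence of each
function dedupeByLocGuid<T extends { LOC_GUID: string }>(rows: T[]): T[] {
  const byGuid = new Map<string, T>();
  for (const row of rows) byGuid.set(row.LOC_GUID, row); // last one wins
  return [...byGuid.values()];
}

// Hypothetical rows with a repeated GUID
const rows = [
  { LOC_GUID: 'a', MUNICIPALITY: 'Ottawa' },
  { LOC_GUID: 'b', MUNICIPALITY: 'Gatineau' },
  { LOC_GUID: 'a', MUNICIPALITY: 'Ottawa (revised)' },
];
console.log(dedupeByLocGuid(rows).length); // 2
```

Running this over each batch before importBatch() guarantees a given locGuid is written at most once per transaction.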
Performance Considerations¶
Import Speed¶
Benchmarks:
| Province | Records | Files | Time | Records/Second |
|---|---|---|---|---|
| PEI (11) | 15,000 | 1 | 12s | 1,250 |
| Nova Scotia (12) | 85,000 | 1 | 1m 10s | 1,214 |
| Quebec (24) | 850,000 | 6 | 11m 20s | 1,250 |
| Ontario (35) | 1,200,000 | 3 | 14m 30s | 1,379 |
Factors:
- Batch size: 500 (optimal for most systems)
- Coordinate conversion: ~0.1ms per record
- Database write: ~0.5ms per location (depends on disk speed)
- Total overhead: ~0.7ms per record
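These per-record costs can be cross-checked against the benchmark table: ~0.7ms per record implies roughly 1,400 records/second, consistent with the measured 1,200-1,400 range:

```typescript
// Estimate total import time from a per-record cost in milliseconds
function estimateSeconds(records: number, msPerRecord: number): number {
  return (records * msPerRecord) / 1000;
}

// Quebec: 850,000 records at 0.7ms/record
console.log(estimateSeconds(850_000, 0.7)); // 595 seconds, ~10min vs the measured 11m 20s
console.log(Math.round(1000 / 0.7));        // ~1429 records/second throughput ceiling
```

The gap between the estimate and the measured time is file I/O and parsing overhead not captured in the three per-record factors.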
Memory Usage¶
Peak Memory:
- Address map (in-memory): ~200MB per 100k records
- CSV parser buffer: ~10MB
- Batch buffer: ~5MB (500 records)
- Total: ~220MB per 100k records
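The per-record figure can be sanity-checked with quick arithmetic: ~200MB per 100k records is about 2KB of address-map overhead per location, so a 1.2M-record province like Ontario would need roughly 2.4GB if held entirely in memory, which is why streaming is listed below as an optimization:

```typescript
// Back-of-envelope memory cost per location in the address map
const bytesPerRecord = (200 * 1024 * 1024) / 100_000;
console.log(Math.round(bytesPerRecord)); // ~2097 bytes, i.e. about 2KB per record

// Projected peak for Ontario's 1.2M records, in GB
console.log(((1_200_000 * bytesPerRecord) / 1024 ** 3).toFixed(1)); // ~2.3GB
```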
Optimization:
- Stream address files instead of loading them all
- Process the location file in chunks
- Clear the batch after each commit
- Limit concurrent transactions
Database Load¶
Transaction Rate:
- 1 transaction per batch (500 records)
- ~2-3 transactions/second
- Low database CPU (~10-20%)
- Moderate disk I/O (sequential writes)
Connection Pool:
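Prisma reads its pool size from the datasource URL. The values below are illustrative only, with placeholder credentials, host, and database name, not the project's actual settings:

```
# Illustrative Prisma pool settings - keep connection_limit below your
# PostgreSQL max_connections, leaving headroom for the app's other queries
DATABASE_URL="postgresql://user:password@db:5432/app?connection_limit=10&pool_timeout=20"
```

At one transaction per 500-record batch and 2-3 transactions/second, the import itself needs only a few connections; a modest pool leaves the rest of the application able to serve requests during a long import.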
Related Documentation¶
Backend Documentation¶
- NAR Import Service: api/src/modules/map/locations/nar-import.service.ts
  - File scanning
  - Streaming CSV parser
  - Coordinate conversion
  - Batch import
- NAR Import Routes: api/src/modules/map/locations/nar-import.routes.ts
  - Dataset discovery
  - Import job creation
  - Progress tracking
- Locations Service: api/src/modules/map/locations/locations.service.ts
  - Location CRUD
  - Geocoding integration
Frontend Documentation¶
- Locations Page: admin/src/pages/LocationsPage.tsx
  - NAR Import tab
  - Dataset selection
  - Filter configuration
  - Progress monitoring
Database Documentation¶
- Location Model: api/prisma/schema.prisma
  - NAR-specific fields
  - locGuid unique constraint
  - Federal district index
- Address Model: api/prisma/schema.prisma
  - addrGuid unique constraint
  - Location foreign key
External Resources¶
- Elections Canada NAR: https://www.elections.ca/content.aspx?section=res&dir=cir/tech/nar&document=index&lang=e
- EPSG:3347 Definition: https://epsg.io/3347
- Proj4 Documentation: https://github.com/proj4js/proj4js
- NAR Data Dictionary: Elections Canada NAR Technical Documentation (PDF)