
NAR Import System

Overview

The National Address Register (NAR) import system enables bulk import of Canadian electoral data from Elections Canada. The system supports the 2025 NAR format with server-side streaming import, coordinate projection conversion, and comprehensive filtering options.

Key Features:

  • Server-side streaming import (handles large datasets)
  • NAR 2025 format support (BG_X/BG_Y Lambert projection)
  • Address + Location file joining on LOC_GUID
  • Proj4 coordinate conversion (EPSG:3347 → WGS84)
  • Province selector (13 provinces/territories)
  • Filtering: city, postal code, cut boundary, residential-only
  • Multi-part file handling (large provinces)
  • Progress tracking and error reporting
  • Import statistics and validation

Use Cases:

  • Initial campaign database setup
  • Electoral district targeting
  • NAR data updates (new redistribution)
  • Multi-region campaign expansion
  • Address database verification

Architecture Highlights:

  • Streaming CSV parser (avoids memory limits)
  • File-based LOC_GUID join
  • Real-time coordinate projection
  • Point-in-polygon cut filtering
  • Transaction batching (500 records/commit)
  • Duplicate prevention via UPSERT

Architecture

flowchart TB
    subgraph Admin Interface
        Admin[Admin User]
        LocationsPage[LocationsPage - NAR Tab]
    end

    subgraph API Layer
        DatasetsAPI["/api/locations/nar/datasets"]
        ImportAPI["/api/locations/nar/import"]
    end

    subgraph NAR Import Service
        Scanner[File Scanner]
        Reader[CSV Stream Reader]
        Joiner[Address+Location Joiner]
        Converter[Coordinate Converter]
        Filter[Filter Pipeline]
        Importer[Bulk Importer]
    end

    subgraph File System
        DataDir[/data/NAR Files]
        AddressFiles[Address_XX_part_*.csv]
        LocationFiles[Location_XX.csv]
    end

    subgraph Database
        LocationsDB[(Locations)]
        AddressesDB[(Addresses)]
    end

    subgraph External Services
        Proj4[Proj4 Library]
        EPSG3347[EPSG:3347 Definition]
    end

    Admin --> LocationsPage
    LocationsPage --> DatasetsAPI
    LocationsPage --> ImportAPI

    DatasetsAPI --> Scanner
    Scanner --> DataDir

    ImportAPI --> Reader
    Reader --> AddressFiles
    Reader --> LocationFiles

    Reader --> Joiner
    Joiner --> Converter
    Converter --> Proj4
    Proj4 --> EPSG3347

    Converter --> Filter
    Filter --> Importer
    Importer --> LocationsDB
    Importer --> AddressesDB

Data Flow:

  1. Dataset Discovery:

    • Scan /data directory for NAR CSV files
    • Group by province code (10-62)
    • Identify multi-part Address files
    • Return available datasets
  2. Import Initiation:

    • Admin selects province + filters
    • API creates import job
    • Begins streaming CSV files
  3. File Processing:

    • Read Address files (all parts sequentially)
    • Read Location file (parallel)
    • Join on LOC_GUID (in-memory map)
  4. Coordinate Conversion:

    • Extract BG_X/BG_Y from Location file
    • Convert EPSG:3347 → WGS84 using Proj4
    • Fallback to BG_LATITUDE/BG_LONGITUDE if conversion fails
  5. Filtering:

    • City filter (exact match on MUNICIPALITY)
    • Postal code filter (prefix match)
    • Cut filter (point-in-polygon)
    • Residential filter (BU_USE = 1)
  6. Database Import:

    • UPSERT Locations by locGuid (prevent duplicates)
    • INSERT Addresses with foreign key
    • Batch commits (500 records)
    • Track progress and errors

NAR File Format

File Structure

Directory Layout:

/data/
├── Address_10.csv                  # Newfoundland
├── Address_11.csv                  # PEI
├── Address_12.csv                  # Nova Scotia
├── Address_13.csv                  # New Brunswick
├── Address_24_part_1.csv           # Quebec (multi-part)
├── Address_24_part_2.csv
├── Address_24_part_3.csv
├── Address_24_part_4.csv
├── Address_24_part_5.csv
├── Address_24_part_6.csv
├── Address_35_part_1.csv           # Ontario (multi-part)
├── Address_35_part_2.csv
├── ...
├── Location_10.csv
├── Location_11.csv
├── Location_12.csv
├── Location_13.csv
├── Location_24.csv
├── Location_35.csv
└── ...

Address File Schema

File: Address_XX_part_Y.csv

ADDR_GUID,LOC_GUID,CIVIC_NO,OFFICIAL_STREET_NAME,POSTAL_CODE,MUNICIPALITY,PROVINCE_CODE
{uuid},{uuid},123,MAIN ST,M5H2N2,TORONTO,35
{uuid},{uuid},125,MAIN ST,M5H2N2,TORONTO,35
{uuid},{uuid},127,MAIN ST,M5H2N2,TORONTO,35

Key Fields:

Field Type Description Example
ADDR_GUID UUID Unique address identifier {12345678-...}
LOC_GUID UUID Location identifier (FK) {87654321-...}
CIVIC_NO String Street number 123, 123A, 123-125
OFFICIAL_STREET_NAME String Street name (uppercase) MAIN ST, YONGE ST
POSTAL_CODE String Canadian postal code (no space) M5H2N2, K1A0B1
MUNICIPALITY String City/town name TORONTO, OTTAWA
PROVINCE_CODE Integer Province code (10-62) 35 (Ontario)

Record Count:

  • Small provinces: 10k-50k addresses
  • Medium provinces: 50k-200k addresses
  • Large provinces: 200k-1M+ addresses (multi-part files)
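A minimal per-row validation sketch for this schema (the helper name and regexes are assumptions; the postal-code pattern is deliberately loose and does not enforce the letters Canada Post excludes):

```typescript
// Braced GUID as shown in the sample rows, e.g. {12345678-...}
const GUID_RE = /^\{[0-9A-Fa-f]{8}(-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12}\}$/;
// NAR stores postal codes without the space (e.g. M5H2N2)
const POSTAL_RE = /^[A-Z]\d[A-Z]\d[A-Z]\d$/;

// Reject rows with malformed GUIDs, a missing civic number, or a bad
// postal code before attempting the LOC_GUID join.
function isValidAddressRow(row: Record<string, string>): boolean {
  return GUID_RE.test(row.ADDR_GUID)
    && GUID_RE.test(row.LOC_GUID)
    && !!row.CIVIC_NO
    && POSTAL_RE.test(row.POSTAL_CODE);
}
```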

Location File Schema

File: Location_XX.csv

LOC_GUID,BG_LATITUDE,BG_LONGITUDE,BG_X,BG_Y,FED_NUM,BU_USE,MUNICIPALITY
{uuid},43.6532,-79.3832,1234567.89,234567.89,35001,1,TORONTO
{uuid},43.6540,-79.3825,1234600.00,234600.00,35001,1,TORONTO

Key Fields:

Field Type Description Example
LOC_GUID UUID Unique location identifier {87654321-...}
BG_LATITUDE Float Latitude (WGS84) 43.6532
BG_LONGITUDE Float Longitude (WGS84) -79.3832
BG_X Float X coord (EPSG:3347 Lambert) 1234567.89
BG_Y Float Y coord (EPSG:3347 Lambert) 234567.89
FED_NUM String Federal electoral district 35001, 24050
BU_USE Integer Building use code 1 = Residential
MUNICIPALITY String City/town name TORONTO

Coordinate Systems:

  • BG_LATITUDE/BG_LONGITUDE: WGS84 decimal degrees (EPSG:4326)
  • BG_X/BG_Y: Statistics Canada Lambert Conformal Conic (EPSG:3347)
  • 2025 NAR Change: Primary coordinates shifted from lat/lng to BG_X/BG_Y

Building Use Codes:

Code Description
1 Residential
2 Commercial
3 Industrial
4 Institutional
5 Parks/Recreation
9 Other
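The table above as a lookup map, together with the predicate behind the residential-only filter (the constant and function names are illustrative):

```typescript
// Building use codes from the NAR 2025 format table.
const BUILDING_USE: Record<number, string> = {
  1: 'Residential', 2: 'Commercial', 3: 'Industrial',
  4: 'Institutional', 5: 'Parks/Recreation', 9: 'Other',
};

// BU_USE arrives as a CSV string, so parse before comparing.
const isResidential = (buUse: string): boolean => parseInt(buUse, 10) === 1;
```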

Database Models

Location Model Extensions

model Location {
  id          Int      @id @default(autoincrement())
  address     String
  latitude    Float?
  longitude   Float?
  postalCode  String?
  province    String?

  // NAR-specific fields
  locGuid           String?  @unique  // NAR LOC_GUID (UUID)
  federalDistrict   String?           // NAR FED_NUM
  buildingUse       Int?              // NAR BU_USE code
  municipality      String?           // NAR MUNICIPALITY

  // Geocoding metadata (populated during import)
  geocodeConfidence Int?     @default(100)  // NAR = high confidence
  geocodeProvider   String?  @default("NAR")
  geocodedAt        DateTime?

  addresses   Address[]

  createdAt   DateTime @default(now())
  updatedAt   DateTime @updatedAt

  @@index([locGuid])
  @@index([federalDistrict])
  @@index([buildingUse])
  @@index([postalCode])
}

Address Model Extensions

model Address {
  id         Int      @id @default(autoincrement())
  locationId Int
  location   Location @relation(fields: [locationId], references: [id], onDelete: Cascade)

  // NAR-specific fields
  addrGuid    String?  @unique  // NAR ADDR_GUID (UUID)
  unitNumber  String?           // NAR CIVIC_NO (if multi-unit)

  // Voter data (future)
  firstName    String?
  lastName     String?
  supportLevel Int?

  createdAt DateTime @default(now())
  updatedAt DateTime @updatedAt

  @@index([locationId])
  @@index([addrGuid])
}

UPSERT Strategy:

// Prevent duplicates on re-import
const location = await prisma.location.upsert({
  where: { locGuid: narRecord.LOC_GUID },
  update: {
    address: narRecord.addressString,
    latitude: coords.latitude,
    longitude: coords.longitude,
    postalCode: narRecord.POSTAL_CODE,
    province: provinceMap[narRecord.PROVINCE_CODE],
    federalDistrict: narRecord.FED_NUM,
    buildingUse: narRecord.BU_USE,
    municipality: narRecord.MUNICIPALITY,
    geocodeProvider: 'NAR',
    geocodedAt: new Date()
  },
  create: {
    locGuid: narRecord.LOC_GUID,
    address: narRecord.addressString,
    latitude: coords.latitude,
    longitude: coords.longitude,
    postalCode: narRecord.POSTAL_CODE,
    province: provinceMap[narRecord.PROVINCE_CODE],
    federalDistrict: narRecord.FED_NUM,
    buildingUse: narRecord.BU_USE,
    municipality: narRecord.MUNICIPALITY,
    geocodeConfidence: 100,
    geocodeProvider: 'NAR',
    geocodedAt: new Date()
  }
});

API Endpoints

GET /api/locations/nar/datasets

Scan NAR data directory and return available province datasets.

Authentication: Required (SUPER_ADMIN, MAP_ADMIN)

Response:

{
  "datasets": [
    {
      "provinceCode": "10",
      "provinceName": "Newfoundland and Labrador",
      "addressFiles": ["Address_10.csv"],
      "locationFile": "Location_10.csv",
      "addressFileCount": 1,
      "estimatedRecords": 15000,
      "lastModified": "2025-01-15T00:00:00Z"
    },
    {
      "provinceCode": "24",
      "provinceName": "Quebec",
      "addressFiles": [
        "Address_24_part_1.csv",
        "Address_24_part_2.csv",
        "Address_24_part_3.csv",
        "Address_24_part_4.csv",
        "Address_24_part_5.csv",
        "Address_24_part_6.csv"
      ],
      "locationFile": "Location_24.csv",
      "addressFileCount": 6,
      "estimatedRecords": 850000,
      "lastModified": "2025-01-20T00:00:00Z"
    },
    {
      "provinceCode": "35",
      "provinceName": "Ontario",
      "addressFiles": [
        "Address_35_part_1.csv",
        "Address_35_part_2.csv",
        "Address_35_part_3.csv"
      ],
      "locationFile": "Location_35.csv",
      "addressFileCount": 3,
      "estimatedRecords": 1200000,
      "lastModified": "2025-01-22T00:00:00Z"
    }
  ],
  "dataDir": "/data",
  "totalDatasets": 13
}

Implementation:

// nar-import.service.ts

async scanDatasets(): Promise<NARDataset[]> {
  const files = await fs.readdir(NAR_DATA_DIR);

  // Group files by province code
  const provinceGroups: Record<string, { address: string[], location: string }> = {};

  files.forEach(file => {
    const addressMatch = file.match(/^Address_(\d+)(?:_part_\d+)?\.csv$/);
    const locationMatch = file.match(/^Location_(\d+)\.csv$/);

    if (addressMatch) {
      const code = addressMatch[1];
      if (!provinceGroups[code]) provinceGroups[code] = { address: [], location: '' };
      provinceGroups[code].address.push(file);
    } else if (locationMatch) {
      const code = locationMatch[1];
      if (!provinceGroups[code]) provinceGroups[code] = { address: [], location: '' };
      provinceGroups[code].location = file;
    }
  });

  // Build dataset objects
  const datasets: NARDataset[] = [];

  for (const [code, group] of Object.entries(provinceGroups)) {
    if (group.address.length === 0 || !group.location) continue;

    const stats = await fs.stat(path.join(NAR_DATA_DIR, group.location));

    datasets.push({
      provinceCode: code,
      provinceName: PROVINCE_NAMES[code],
      addressFiles: group.address.sort(),
      locationFile: group.location,
      addressFileCount: group.address.length,
      estimatedRecords: await this.estimateRecordCount(group.address),
      lastModified: stats.mtime.toISOString()
    });
  }

  return datasets.sort((a, b) => a.provinceCode.localeCompare(b.provinceCode));
}
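`estimateRecordCount` is referenced above but not shown. One plausible implementation, sketched here under the assumption that rows are mostly ASCII (so characters ≈ bytes), extrapolates from file size and a sampled prefix instead of reading every file in full:

```typescript
import fs from 'fs/promises';
import path from 'path';

// Pure core: estimate data-row count from total byte size and a sampled
// prefix of the file.
function estimateFromSample(fileSize: number, sample: string): number {
  const completeLines = sample.split('\n').length - 1;
  if (completeLines === 0) return 0;
  const avgLine = sample.length / completeLines;
  return Math.max(Math.round(fileSize / avgLine) - 1, 0); // minus header row
}

// Async wrapper: sample the first 64 KiB of each address file.
async function estimateRecordCount(addressFiles: string[], dataDir: string): Promise<number> {
  let total = 0;
  for (const file of addressFiles) {
    const filePath = path.join(dataDir, file);
    const { size } = await fs.stat(filePath);
    const handle = await fs.open(filePath, 'r');
    try {
      const buf = Buffer.alloc(Math.min(size, 64 * 1024));
      await handle.read(buf, 0, buf.length, 0);
      total += estimateFromSample(size, buf.toString('utf8'));
    } finally {
      await handle.close();
    }
  }
  return total;
}
```

An estimate is enough here, since the exact count emerges during the streaming import anyway.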

POST /api/locations/nar/import

Start NAR import job with filters.

Authentication: Required (SUPER_ADMIN, MAP_ADMIN)

Request Body:

{
  "provinceCode": "35",
  "city": "TORONTO",
  "postalCodePrefix": "M5",
  "cutId": 42,
  "residentialOnly": true
}

Parameters:

Parameter Type Required Description
provinceCode string Yes Province code (10-62)
city string No Filter by MUNICIPALITY (exact match, uppercase)
postalCodePrefix string No Filter by postal code prefix (e.g., "M5", "K1A")
cutId number No Filter by cut boundary (point-in-polygon)
residentialOnly boolean No Only import BU_USE = 1 (default: false)
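The `cutId` filter's point-in-polygon test can be implemented with the standard ray-casting algorithm. This sketch assumes the cut boundary is a single ring of `[lng, lat]` vertices:

```typescript
// Ray casting: a point is inside when a horizontal ray from it crosses
// the polygon boundary an odd number of times.
function pointInPolygon(lng: number, lat: number, ring: number[][]): boolean {
  let inside = false;
  for (let i = 0, j = ring.length - 1; i < ring.length; j = i++) {
    const [xi, yi] = ring[i];
    const [xj, yj] = ring[j];
    if ((yi > lat) !== (yj > lat) &&
        lng < ((xj - xi) * (lat - yi)) / (yj - yi) + xi) {
      inside = !inside;
    }
  }
  return inside;
}
```

Multi-ring cuts (holes, islands) would need one such test per ring; this sketch covers the single-ring case only.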

Response:

{
  "jobId": "nar-import-35-20250213-103000",
  "status": "processing",
  "provinceCode": "35",
  "provinceName": "Ontario",
  "filters": {
    "city": "TORONTO",
    "postalCodePrefix": "M5",
    "cutId": 42,
    "residentialOnly": true
  },
  "startedAt": "2025-02-13T10:30:00Z",
  "estimatedCompletion": "2025-02-13T10:45:00Z"
}

GET /api/locations/nar/import/:jobId

Check import job progress.

Authentication: Required (SUPER_ADMIN, MAP_ADMIN)

Response (In Progress):

{
  "jobId": "nar-import-35-20250213-103000",
  "status": "processing",
  "progress": {
    "total": 1200000,
    "processed": 600000,
    "imported": 580000,
    "skipped": 15000,
    "errors": 5000,
    "percent": 50.0
  },
  "currentFile": "Address_35_part_2.csv",
  "startedAt": "2025-02-13T10:30:00Z",
  "estimatedCompletion": "2025-02-13T10:45:00Z"
}

Response (Complete):

{
  "jobId": "nar-import-35-20250213-103000",
  "status": "completed",
  "result": {
    "total": 1200000,
    "processed": 1200000,
    "imported": 1150000,
    "skipped": 45000,
    "errors": 5000,
    "percent": 100.0
  },
  "statistics": {
    "locationsCreated": 800000,
    "locationsUpdated": 350000,
    "addressesCreated": 1150000,
    "avgConfidence": 100,
    "processingTime": "14m 32s"
  },
  "startedAt": "2025-02-13T10:30:00Z",
  "completedAt": "2025-02-13T10:44:32Z"
}

Status Values:

  • queued: Job created, waiting to start
  • processing: Import in progress
  • completed: Import finished successfully
  • failed: Import failed with errors
  • cancelled: Import cancelled by user
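The five states above, typed as a union, with a hypothetical helper for deciding when to stop polling:

```typescript
// Status values returned by GET /api/locations/nar/import/:jobId
type ImportJobStatus = 'queued' | 'processing' | 'completed' | 'failed' | 'cancelled';

// A job in any of these states will not change again, so polling can stop.
const isTerminal = (s: ImportJobStatus): boolean =>
  s === 'completed' || s === 'failed' || s === 'cancelled';
```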

Configuration

Environment Variables

Variable Type Default Description
NAR_DATA_DIR string /data Directory containing NAR CSV files
NAR_BATCH_SIZE number 500 Records per database transaction
NAR_IMPORT_TIMEOUT number 3600000 Import timeout in ms (1 hour)

Province Codes

Complete mapping of NAR province codes:

// nar-import.service.ts

const PROVINCE_NAMES: Record<string, string> = {
  '10': 'Newfoundland and Labrador',
  '11': 'Prince Edward Island',
  '12': 'Nova Scotia',
  '13': 'New Brunswick',
  '24': 'Quebec',
  '35': 'Ontario',
  '46': 'Manitoba',
  '47': 'Saskatchewan',
  '48': 'Alberta',
  '59': 'British Columbia',
  '60': 'Yukon',
  '61': 'Northwest Territories',
  '62': 'Nunavut'
};

const PROVINCE_ABBREVIATIONS: Record<string, string> = {
  '10': 'NL',
  '11': 'PE',
  '12': 'NS',
  '13': 'NB',
  '24': 'QC',
  '35': 'ON',
  '46': 'MB',
  '47': 'SK',
  '48': 'AB',
  '59': 'BC',
  '60': 'YT',
  '61': 'NT',
  '62': 'NU'
};

Coordinate Projection

EPSG:3347 Definition (Statistics Canada Lambert Conformal Conic):

import proj4 from 'proj4';

// Define EPSG:3347 projection
proj4.defs('EPSG:3347', '+proj=lcc +lat_1=49 +lat_2=77 +lat_0=63.390675 +lon_0=-91.86666666666666 +x_0=6200000 +y_0=3000000 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs');

// Convert function
const convertCoordinates = (bgX: number, bgY: number): [number, number] => {
  // Input: [X, Y] in EPSG:3347 (meters)
  // Output: [longitude, latitude] in WGS84 (degrees)
  return proj4('EPSG:3347', 'WGS84', [bgX, bgY]);
};

Projection Parameters:

  • Type: Lambert Conformal Conic
  • Standard Parallels: 49°N, 77°N
  • Central Meridian: -91.866667°
  • Origin: 63.390675°N, -91.866667°W
  • False Easting: 6,200,000 m
  • False Northing: 3,000,000 m
  • Ellipsoid: GRS80
  • Units: Meters

Example Conversion:

// Sketch of a conversion call. The input values below are illustrative
// placeholders: valid EPSG:3347 X values exceed the 6,200,000 m false
// easting, and Y values are offsets from the 3,000,000 m false northing.
const bgX = 7200000.0;  // EPSG:3347 X
const bgY = 900000.0;   // EPSG:3347 Y

const [lng, lat] = proj4('EPSG:3347', 'WGS84', [bgX, bgY]);
// [lng, lat] in WGS84 decimal degrees

Import Workflow

Prepare NAR Files

Step 1: Download NAR Data

  1. Visit Elections Canada NAR portal: https://www.elections.ca/NAR
  2. Select "2025 National Address Register"
  3. Download province-specific CSV files
  4. Extract ZIP archives

Step 2: Upload Files to Server

# Create data directory if not exists
mkdir -p /path/to/data

# Upload files via SCP
scp Address_35_*.csv user@server:/path/to/data/
scp Location_35.csv user@server:/path/to/data/

# Or mount volume in Docker
# docker-compose.yml:
volumes:
  - ./data:/data:ro

Step 3: Verify File Integrity

# Check file count
ls -l /path/to/data/Address_35_*.csv | wc -l

# Check Location file exists
ls -l /path/to/data/Location_35.csv

# Sample first few rows
head -5 /path/to/data/Address_35_part_1.csv
head -5 /path/to/data/Location_35.csv

Run Import via Admin UI

Step 1: Navigate to NAR Import Tab

  1. Log in as SUPER_ADMIN or MAP_ADMIN
  2. Click MapLocations in sidebar
  3. Click NAR Import tab
  4. Available datasets load automatically

Step 2: Select Province

┌─────────────────────────────────────────┐
│ Available NAR Datasets                  │
├─────────────────────────────────────────┤
│ Province         │ Files │ Records      │
├──────────────────┼───────┼──────────────┤
│ Ontario (35)     │   3   │ 1,200,000    │
│ Quebec (24)      │   6   │   850,000    │
│ Alberta (48)     │   2   │   450,000    │
└──────────────────┴───────┴──────────────┘

[Select Province: Ontario ▼]

Step 3: Configure Filters (Optional)

Filters (Optional):

City:                [TORONTO          ]
  Filter by exact municipality name (uppercase)

Postal Code Prefix:  [M5               ]
  Filter by postal code prefix (2-3 chars)

Cut Boundary:        [Downtown Core ▼  ]
  Only import locations within cut polygon

☑ Residential Only
  Only import buildings with BU_USE = 1

Step 4: Review Import Summary

Import Summary:

Province:      Ontario (35)
Files:         Address_35_part_1.csv
               Address_35_part_2.csv
               Address_35_part_3.csv
               Location_35.csv

Filters:
  City:               TORONTO
  Postal Code:        M5
  Cut:                Downtown Core
  Residential Only:   Yes

Estimated Records:  ~50,000 (after filters)
Estimated Time:     ~3 minutes

[Cancel] [Start Import]

Step 5: Monitor Progress

Import in Progress...

Current File: Address_35_part_2.csv
Progress: 600,000 / 1,200,000 (50%)

[████████████░░░░░░░░░░░░] 50%

Statistics:
  Processed:  600,000
  Imported:   580,000
  Skipped:    15,000
  Errors:     5,000

[Cancel Import]

Step 6: Review Results

Import Complete!

Final Statistics:
  Total Processed:     1,200,000
  Successfully Imported: 1,150,000
  Skipped (Filters):      45,000
  Errors:                  5,000

Details:
  Locations Created:    800,000
  Locations Updated:    350,000
  Addresses Created:  1,150,000

  Processing Time:      14m 32s
  Avg Records/Second:   1,375

[View Import Log] [Import Another Province] [Close]

Import via API

Step 1: Get Available Datasets

curl -X GET http://localhost:4000/api/locations/nar/datasets \
  -H "Authorization: Bearer $TOKEN"

Step 2: Start Import

curl -X POST http://localhost:4000/api/locations/nar/import \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "provinceCode": "35",
    "city": "TORONTO",
    "postalCodePrefix": "M5",
    "residentialOnly": true
  }'

Step 3: Poll Job Status

JOB_ID="nar-import-35-20250213-103000"

while true; do
  STATUS=$(curl -s -X GET \
    http://localhost:4000/api/locations/nar/import/$JOB_ID \
    -H "Authorization: Bearer $TOKEN" \
    | jq -r '.status')

  if [ "$STATUS" = "completed" ] || [ "$STATUS" = "failed" ]; then
    break
  fi

  sleep 5
done

# Get final result
curl -X GET http://localhost:4000/api/locations/nar/import/$JOB_ID \
  -H "Authorization: Bearer $TOKEN" | jq

Coordinate Conversion

Proj4 Integration

Installation:

npm install proj4
npm install -D @types/proj4   # TypeScript definitions are published separately

Service Implementation:

// nar-import.service.ts

import proj4 from 'proj4';

// Define EPSG:3347 (Statistics Canada Lambert)
proj4.defs('EPSG:3347',
  '+proj=lcc +lat_1=49 +lat_2=77 +lat_0=63.390675 ' +
  '+lon_0=-91.86666666666666 +x_0=6200000 +y_0=3000000 ' +
  '+ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs'
);

interface Coordinates {
  latitude: number;
  longitude: number;
}

class NARImportService {
  /**
   * Convert NAR BG_X/BG_Y (EPSG:3347) to WGS84 lat/lng
   */
  convertCoordinates(bgX: number, bgY: number): Coordinates | null {
    try {
      // Validate inputs (EPSG:3347 coordinates are positive metres)
      if (!Number.isFinite(bgX) || !Number.isFinite(bgY) || bgX <= 0 || bgY <= 0) {
        logger.warn('Invalid BG_X/BG_Y coordinates:', { bgX, bgY });
        return null;
      }

      // Convert: EPSG:3347 → WGS84
      const [longitude, latitude] = proj4('EPSG:3347', 'WGS84', [bgX, bgY]);

      // Validate output (Canada bounds)
      if (
        latitude < 41.0 || latitude > 84.0 ||   // Canada latitude range
        longitude < -141.0 || longitude > -52.0 // Canada longitude range
      ) {
        logger.warn('Converted coordinates outside Canada:', { latitude, longitude });
        return null;
      }

      return { latitude, longitude };
    } catch (error) {
      logger.error('Coordinate conversion failed:', error);
      return null;
    }
  }

  /**
   * Get coordinates from NAR record (try BG_X/BG_Y, fallback to lat/lng)
   */
  getCoordinates(narLocation: NARLocationRecord): Coordinates | null {
    // Primary: Convert BG_X/BG_Y
    if (narLocation.BG_X && narLocation.BG_Y) {
      const coords = this.convertCoordinates(narLocation.BG_X, narLocation.BG_Y);
      if (coords) return coords;
    }

    // Fallback: Use BG_LATITUDE/BG_LONGITUDE directly
    if (narLocation.BG_LATITUDE && narLocation.BG_LONGITUDE) {
      return {
        latitude: narLocation.BG_LATITUDE,
        longitude: narLocation.BG_LONGITUDE
      };
    }

    return null;
  }
}

Conversion Examples

Example 1: Valid Coordinates

// Illustrative placeholder inputs (not verified survey coordinates);
// valid EPSG:3347 X values exceed the 6,200,000 m false easting
const bgX = 7200000.0;
const bgY = 900000.0;

const coords = convertCoordinates(bgX, bgY);
// Result: a WGS84 { latitude, longitude } pair, or null if the converted
// point falls outside the Canada bounds check

Example 2: Invalid Coordinates

const bgX = -1000;  // Negative (invalid)
const bgY = 0;      // Zero (invalid)

const coords = convertCoordinates(bgX, bgY);
// Result: null

Validation

Canada Bounds Check:

const isWithinCanada = (lat: number, lng: number): boolean => {
  return (
    lat >= 41.0 && lat <= 84.0 &&     // Latitude: Pelee Island to Alert
    lng >= -141.0 && lng <= -52.0     // Longitude: Yukon to Newfoundland
  );
};

Precision Check:

// NAR coordinates should have 2-6 decimal places
const hasValidPrecision = (value: number): boolean => {
  const str = value.toString();
  const decimals = str.split('.')[1]?.length || 0;
  return decimals >= 2 && decimals <= 6;
};
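The bounds check can be composed with a finiteness check into one guard applied before persisting a converted pair (names are illustrative):

```typescript
// Canada bounding box used by the import validation.
const isWithinCanada = (lat: number, lng: number): boolean =>
  lat >= 41.0 && lat <= 84.0 && lng >= -141.0 && lng <= -52.0;

// Composite guard: reject NaN/Infinity from a failed projection as well
// as points outside Canada.
const isUsableCoordinate = (lat: number, lng: number): boolean =>
  Number.isFinite(lat) && Number.isFinite(lng) && isWithinCanada(lat, lng);
```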

Multi-Part File Handling

Large Province Processing

Quebec (Province Code 24):

  • 6 Address files: Address_24_part_1.csv through Address_24_part_6.csv
  • 1 Location file: Location_24.csv
  • Total records: ~850,000

Ontario (Province Code 35):

  • 3 Address files: Address_35_part_1.csv through Address_35_part_3.csv
  • 1 Location file: Location_35.csv
  • Total records: ~1,200,000

Sequential File Reading

// nar-import.service.ts

async processAddressFiles(provinceCode: string): Promise<Map<string, AddressRecord[]>> {
  const addressMap = new Map<string, AddressRecord[]>();

  // Find all Address files for province
  const files = await fs.readdir(NAR_DATA_DIR);
  const addressFiles = files
    .filter(f => f.match(new RegExp(`^Address_${provinceCode}(?:_part_\\d+)?\\.csv$`)))
    // Numeric-aware sort so part_2 precedes part_10 (plain .sort() is lexicographic)
    .sort((a, b) => a.localeCompare(b, undefined, { numeric: true }));

  logger.info(`Processing ${addressFiles.length} address files for province ${provinceCode}`);

  // Process each file sequentially
  for (const file of addressFiles) {
    logger.info(`Reading ${file}...`);

    const filePath = path.join(NAR_DATA_DIR, file);
    const stream = createReadStream(filePath); // from 'fs'; fs/promises has no createReadStream
    const parser = stream.pipe(csvParser());

    let rowCount = 0;

    for await (const row of parser) {
      const locGuid = row.LOC_GUID;

      if (!addressMap.has(locGuid)) {
        addressMap.set(locGuid, []);
      }

      addressMap.get(locGuid)!.push({
        addrGuid: row.ADDR_GUID,
        civicNo: row.CIVIC_NO,
        streetName: row.OFFICIAL_STREET_NAME,
        postalCode: row.POSTAL_CODE,
        municipality: row.MUNICIPALITY
      });

      rowCount++;

      if (rowCount % 10000 === 0) {
        logger.debug(`Processed ${rowCount} addresses from ${file}`);
      }
    }

    logger.info(`Completed ${file}: ${rowCount} addresses`);
  }

  logger.info(`Total unique locations: ${addressMap.size}`);
  return addressMap;
}

Memory Management

Streaming Strategy:

// Process files in chunks to avoid memory overflow
async processInChunks(
  addressMap: Map<string, AddressRecord[]>,
  locationFile: string,
  batchSize: number = 500
): Promise<ImportResult> {
  const locationPath = path.join(NAR_DATA_DIR, locationFile);
  const stream = createReadStream(locationPath); // from 'fs'
  const parser = stream.pipe(csvParser());

  let batch: LocationImport[] = [];
  let stats = { imported: 0, skipped: 0, errors: 0 };

  for await (const row of parser) {
    const locGuid = row.LOC_GUID;
    const addresses = addressMap.get(locGuid);

    if (!addresses || addresses.length === 0) {
      stats.skipped++;
      continue;
    }

    // Apply filters
    if (!this.passesFilters(row, addresses)) {
      stats.skipped++;
      continue;
    }

    // Convert coordinates
    const coords = this.getCoordinates(row);
    if (!coords) {
      stats.errors++;
      continue;
    }

    batch.push({ location: row, addresses, coords });

    // Import batch when full
    if (batch.length >= batchSize) {
      await this.importBatch(batch);
      stats.imported += batch.length;
      batch = [];
    }
  }

  // Import remaining
  if (batch.length > 0) {
    await this.importBatch(batch);
    stats.imported += batch.length;
  }

  return stats;
}

Batch Transaction:

async importBatch(batch: LocationImport[]): Promise<void> {
  await prisma.$transaction(async (tx) => {
    for (const item of batch) {
      // Upsert location
      const location = await tx.location.upsert({
        where: { locGuid: item.location.LOC_GUID },
        update: {
          address: this.formatAddress(item.addresses[0]),
          latitude: item.coords.latitude,
          longitude: item.coords.longitude,
          postalCode: item.addresses[0].postalCode,
          federalDistrict: item.location.FED_NUM,
          buildingUse: parseInt(item.location.BU_USE, 10),
          municipality: item.location.MUNICIPALITY,
          geocodedAt: new Date()
        },
        create: {
          locGuid: item.location.LOC_GUID,
          address: this.formatAddress(item.addresses[0]),
          latitude: item.coords.latitude,
          longitude: item.coords.longitude,
          postalCode: item.addresses[0].postalCode,
          federalDistrict: item.location.FED_NUM,
          buildingUse: parseInt(item.location.BU_USE, 10),
          municipality: item.location.MUNICIPALITY,
          geocodeConfidence: 100,
          geocodeProvider: 'NAR',
          geocodedAt: new Date()
        }
      });

      // Insert addresses
      for (const addr of item.addresses) {
        await tx.address.upsert({
          where: { addrGuid: addr.addrGuid },
          update: { locationId: location.id },
          create: {
            addrGuid: addr.addrGuid,
            locationId: location.id,
            unitNumber: addr.civicNo
          }
        });
      }
    }
  });
}

Code Examples

LocationsPage - NAR Import Tab

// LocationsPage.tsx

import React, { useEffect, useState } from 'react';
import { Tabs, Table, Button, Select, Input, Checkbox, Card, Progress, message } from 'antd';
import { UploadOutlined } from '@ant-design/icons';
import { api } from '@/lib/api';

const NARImportTab: React.FC = () => {
  const [datasets, setDatasets] = useState<NARDataset[]>([]);
  const [selectedProvince, setSelectedProvince] = useState<string | null>(null);
  const [filters, setFilters] = useState({
    city: '',
    postalCodePrefix: '',
    cutId: null as number | null,
    residentialOnly: true
  });
  const [importing, setImporting] = useState(false);
  const [progress, setProgress] = useState<ImportProgress | null>(null);
  const [jobId, setJobId] = useState<string | null>(null);

  useEffect(() => {
    fetchDatasets();
  }, []);

  useEffect(() => {
    if (jobId && importing) {
      const interval = setInterval(pollProgress, 2000);
      return () => clearInterval(interval);
    }
  }, [jobId, importing]);

  const fetchDatasets = async () => {
    try {
      const { data } = await api.get<{ datasets: NARDataset[] }>('/locations/nar/datasets');
      setDatasets(data.datasets);
    } catch (error) {
      message.error('Failed to load NAR datasets');
    }
  };

  const pollProgress = async () => {
    if (!jobId) return;

    try {
      const { data } = await api.get(`/locations/nar/import/${jobId}`);

      if (data.status === 'completed') {
        setImporting(false);
        setProgress(null);
        message.success(`Import complete! Imported ${data.result.imported} locations.`);
      } else if (data.status === 'failed') {
        setImporting(false);
        setProgress(null);
        message.error('Import failed. Check logs for details.');
      } else {
        setProgress(data.progress);
      }
    } catch (error) {
      message.error('Failed to fetch import progress');
    }
  };

  const startImport = async () => {
    if (!selectedProvince) {
      message.warning('Please select a province');
      return;
    }

    try {
      const { data } = await api.post('/locations/nar/import', {
        provinceCode: selectedProvince,
        ...filters
      });

      setJobId(data.jobId);
      setImporting(true);
      message.info('Import started...');
    } catch (error) {
      message.error('Failed to start import');
    }
  };

  const datasetColumns = [
    { title: 'Province', dataIndex: 'provinceName', key: 'name' },
    { title: 'Files', dataIndex: 'addressFileCount', key: 'files' },
    { title: 'Estimated Records', dataIndex: 'estimatedRecords', key: 'records',
      render: (val: number) => val.toLocaleString() },
    { title: 'Last Modified', dataIndex: 'lastModified', key: 'modified',
      render: (val: string) => new Date(val).toLocaleDateString() }
  ];

  return (
    <div>
      <Card title="Available NAR Datasets" style={{ marginBottom: 24 }}>
        <Table
          dataSource={datasets}
          columns={datasetColumns}
          rowKey="provinceCode"
          pagination={false}
          onRow={(record) => ({
            onClick: () => setSelectedProvince(record.provinceCode),
            style: {
              cursor: 'pointer',
              backgroundColor: selectedProvince === record.provinceCode ? '#e6f7ff' : undefined
            }
          })}
        />
      </Card>

      {selectedProvince && (
        <Card title="Import Configuration">
          <div style={{ marginBottom: 16 }}>
            <label>Province: </label>
            <strong>{datasets.find(d => d.provinceCode === selectedProvince)?.provinceName}</strong>
          </div>

          <div style={{ marginBottom: 16 }}>
            <label>City (Optional): </label>
            <Input
              style={{ width: 300 }}
              placeholder="TORONTO"
              value={filters.city}
              onChange={e => setFilters({ ...filters, city: e.target.value.toUpperCase() })}
            />
          </div>

          <div style={{ marginBottom: 16 }}>
            <label>Postal Code Prefix (Optional): </label>
            <Input
              style={{ width: 200 }}
              placeholder="M5"
              value={filters.postalCodePrefix}
              onChange={e => setFilters({ ...filters, postalCodePrefix: e.target.value.toUpperCase() })}
            />
          </div>

          <div style={{ marginBottom: 16 }}>
            <Checkbox
              checked={filters.residentialOnly}
              onChange={e => setFilters({ ...filters, residentialOnly: e.target.checked })}
            >
              Residential Only
            </Checkbox>
          </div>

          <Button
            type="primary"
            icon={<UploadOutlined />}
            onClick={startImport}
            loading={importing}
            disabled={importing}
          >
            Start Import
          </Button>
        </Card>
      )}

      {importing && progress && (
        <Card title="Import Progress" style={{ marginTop: 24 }}>
          <Progress percent={progress.percent} status="active" />
          <div style={{ marginTop: 16 }}>
            <p>Processed: {progress.processed.toLocaleString()} / {progress.total.toLocaleString()}</p>
            <p>Imported: {progress.imported.toLocaleString()}</p>
            <p>Skipped: {progress.skipped.toLocaleString()}</p>
            <p>Errors: {progress.errors.toLocaleString()}</p>
          </div>
        </Card>
      )}
    </div>
  );
};

NAR Import Service - Full Implementation

// nar-import.service.ts

import fs from 'fs/promises';
import { createReadStream } from 'fs';
import path from 'path';
import csvParser from 'csv-parser';
import proj4 from 'proj4';
import { prisma } from '@/config/database';
import { logger } from '@/utils/logger';

// Define EPSG:3347
proj4.defs('EPSG:3347',
  '+proj=lcc +lat_1=49 +lat_2=77 +lat_0=63.390675 ' +
  '+lon_0=-91.86666666666666 +x_0=6200000 +y_0=3000000 ' +
  '+ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs'
);

const NAR_DATA_DIR = process.env.NAR_DATA_DIR || '/data';
const BATCH_SIZE = parseInt(process.env.NAR_BATCH_SIZE || '500', 10);

interface NARAddressRecord {
  ADDR_GUID: string;
  LOC_GUID: string;
  CIVIC_NO: string;
  OFFICIAL_STREET_NAME: string;
  POSTAL_CODE: string;
  MUNICIPALITY: string;
}

interface NARLocationRecord {
  LOC_GUID: string;
  BG_LATITUDE?: number;
  BG_LONGITUDE?: number;
  BG_X?: number;
  BG_Y?: number;
  FED_NUM: string;
  BU_USE: string;
  MUNICIPALITY: string;
}

export class NARImportService {
  async importProvince(
    provinceCode: string,
    filters: {
      city?: string;
      postalCodePrefix?: string;
      cutId?: number;
      residentialOnly?: boolean;
    }
  ): Promise<ImportResult> {
    logger.info(`Starting NAR import for province ${provinceCode}`, { filters });

    // Load address files into memory map
    const addressMap = await this.loadAddressFiles(provinceCode, filters);

    // Process location file and import
    const result = await this.processLocationFile(provinceCode, addressMap, filters);

    logger.info(`NAR import complete for province ${provinceCode}`, result);
    return result;
  }

  private async loadAddressFiles(
    provinceCode: string,
    filters: { city?: string; postalCodePrefix?: string }
  ): Promise<Map<string, NARAddressRecord[]>> {
    const addressMap = new Map<string, NARAddressRecord[]>();

    const files = await fs.readdir(NAR_DATA_DIR);
    const addressFiles = files
      .filter(f => f.match(new RegExp(`^Address_${provinceCode}(?:_part_\\d+)?\\.csv$`)))
      .sort();

    for (const file of addressFiles) {
      logger.info(`Reading ${file}...`);
      const filePath = path.join(NAR_DATA_DIR, file);
      const stream = createReadStream(filePath);
      const parser = stream.pipe(csvParser());

      for await (const row of parser) {
        // Apply filters
        if (filters.city && row.MUNICIPALITY !== filters.city) continue;
        if (filters.postalCodePrefix && !row.POSTAL_CODE.startsWith(filters.postalCodePrefix)) continue;

        const locGuid = row.LOC_GUID;
        if (!addressMap.has(locGuid)) {
          addressMap.set(locGuid, []);
        }
        addressMap.get(locGuid)!.push(row);
      }
    }

    logger.info(`Loaded ${addressMap.size} unique locations`);
    return addressMap;
  }

  private async processLocationFile(
    provinceCode: string,
    addressMap: Map<string, NARAddressRecord[]>,
    filters: { cutId?: number; residentialOnly?: boolean }
  ): Promise<ImportResult> {
    const locationFile = `Location_${provinceCode}.csv`;
    const filePath = path.join(NAR_DATA_DIR, locationFile);
    const stream = createReadStream(filePath);
    const parser = stream.pipe(csvParser());

    let batch: any[] = [];
    const stats = { imported: 0, skipped: 0, errors: 0, total: 0 };

    // Fetch the cut polygon once, rather than querying the database per row
    const cut = filters.cutId
      ? await prisma.cut.findUnique({ where: { id: filters.cutId } })
      : null;

    for await (const row of parser) {
      stats.total++;

      const locGuid = row.LOC_GUID;
      const addresses = addressMap.get(locGuid);

      if (!addresses || addresses.length === 0) {
        stats.skipped++;
        continue;
      }

      // Residential filter
      if (filters.residentialOnly && parseInt(row.BU_USE, 10) !== 1) {
        stats.skipped++;
        continue;
      }

      // Convert coordinates
      const coords = this.getCoordinates(row);
      if (!coords) {
        stats.errors++;
        continue;
      }

      // Cut filter (if specified)
      if (cut && !this.isPointInPolygon([coords.longitude, coords.latitude], cut.geojson)) {
        stats.skipped++;
        continue;
      }

      batch.push({ location: row, addresses, coords });

      if (batch.length >= BATCH_SIZE) {
        await this.importBatch(batch);
        stats.imported += batch.length;
        batch = [];
      }
    }

    if (batch.length > 0) {
      await this.importBatch(batch);
      stats.imported += batch.length;
    }

    return stats;
  }

  private getCoordinates(row: NARLocationRecord): { latitude: number; longitude: number } | null {
    // csv-parser emits every field as a string, so coerce before projecting
    const x = Number(row.BG_X);
    const y = Number(row.BG_Y);

    // Try BG_X/BG_Y Lambert conversion first
    if (x && y) {
      try {
        const [lng, lat] = proj4('EPSG:3347', 'WGS84', [x, y]);
        // Sanity check: the result must fall inside Canada's bounding box
        if (lat >= 41 && lat <= 84 && lng >= -141 && lng <= -52) {
          return { latitude: lat, longitude: lng };
        }
      } catch (error) {
        logger.warn('Coordinate conversion failed:', error);
      }
    }

    // Fall back to BG_LATITUDE/BG_LONGITUDE when present
    const lat = Number(row.BG_LATITUDE);
    const lng = Number(row.BG_LONGITUDE);
    if (lat && lng) {
      return { latitude: lat, longitude: lng };
    }

    return null;
  }

  private async importBatch(batch: any[]): Promise<void> {
    await prisma.$transaction(async (tx) => {
      for (const item of batch) {
        const location = await tx.location.upsert({
          where: { locGuid: item.location.LOC_GUID },
          update: {
            address: this.formatAddress(item.addresses[0]),
            latitude: item.coords.latitude,
            longitude: item.coords.longitude,
            postalCode: item.addresses[0].POSTAL_CODE,
            federalDistrict: item.location.FED_NUM,
            buildingUse: parseInt(item.location.BU_USE),
            municipality: item.location.MUNICIPALITY
          },
          create: {
            locGuid: item.location.LOC_GUID,
            address: this.formatAddress(item.addresses[0]),
            latitude: item.coords.latitude,
            longitude: item.coords.longitude,
            postalCode: item.addresses[0].POSTAL_CODE,
            federalDistrict: item.location.FED_NUM,
            buildingUse: parseInt(item.location.BU_USE),
            municipality: item.location.MUNICIPALITY,
            geocodeConfidence: 100,
            geocodeProvider: 'NAR'
          }
        });

        for (const addr of item.addresses) {
          await tx.address.upsert({
            where: { addrGuid: addr.ADDR_GUID },
            update: {},
            create: {
              addrGuid: addr.ADDR_GUID,
              locationId: location.id,
              unitNumber: addr.CIVIC_NO
            }
          });
        }
      }
    });
  }

  private formatAddress(addr: NARAddressRecord): string {
    return `${addr.CIVIC_NO} ${addr.OFFICIAL_STREET_NAME}`.trim();
  }

  private isPointInPolygon(point: [number, number], geojson: any): boolean {
    // Ray-casting test against the outer ring; assumes a simple Polygon geometry
    // (shared helper lives in spatial.ts)
    const ring: [number, number][] = geojson.coordinates[0];
    let inside = false;
    for (let i = 0, j = ring.length - 1; i < ring.length; j = i++) {
      const [xi, yi] = ring[i], [xj, yj] = ring[j];
      if (yi > point[1] !== yj > point[1] &&
          point[0] < ((xj - xi) * (point[1] - yi)) / (yj - yi) + xi) inside = !inside;
    }
    return inside;
  }
}
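The LOC_GUID join built by loadAddressFiles reduces to grouping rows in a Map. A self-contained sketch of just that step, using hypothetical rows (the GUIDs and civic numbers below are made up for illustration):

```typescript
// Standalone sketch of the LOC_GUID grouping used by loadAddressFiles
type AddressRow = { ADDR_GUID: string; LOC_GUID: string; CIVIC_NO: string };

const addresses: AddressRow[] = [
  { ADDR_GUID: 'a1', LOC_GUID: 'loc1', CIVIC_NO: '100' },
  { ADDR_GUID: 'a2', LOC_GUID: 'loc1', CIVIC_NO: '102' },
  { ADDR_GUID: 'a3', LOC_GUID: 'loc2', CIVIC_NO: '7' }
];

// Group address rows by their LOC_GUID
const addressMap = new Map<string, AddressRow[]>();
for (const row of addresses) {
  const bucket = addressMap.get(row.LOC_GUID) ?? [];
  bucket.push(row);
  addressMap.set(row.LOC_GUID, bucket);
}

console.log(addressMap.get('loc1')?.length); // 2 — one location, two addresses
console.log(addressMap.has('loc3'));         // false — location rows with no addresses get skipped
```

During import, a Location row whose LOC_GUID has no entry in this map is counted as skipped, which is why the address files must be loaded (or streamed) before the location file is processed.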

Troubleshooting

Problem: No datasets found

Symptoms:

  • GET /api/locations/nar/datasets returns empty array
  • "No datasets available" message in admin

Solutions:

  1. Verify NAR_DATA_DIR path:
echo $NAR_DATA_DIR
ls -la /data
  2. Check Docker volume mount:
# docker-compose.yml
services:
  api:
    volumes:
      - ./data:/data:ro
  3. Verify file naming convention:
# Correct:
Address_35_part_1.csv
Location_35.csv

# Incorrect:
address_35.csv  # Lowercase
Addresses_35.csv  # Plural
Address35.csv  # No underscore
  4. Check file permissions:
chmod 644 /data/Address_*.csv
chmod 644 /data/Location_*.csv
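The naming convention above corresponds to the scanner's regular expression. A quick standalone check (the pattern is copied from the service's file filter):

```typescript
// Matches Address_<code>.csv or Address_<code>_part_<n>.csv for a province code
const addressFilePattern = (provinceCode: string) =>
  new RegExp(`^Address_${provinceCode}(?:_part_\\d+)?\\.csv$`);

const re = addressFilePattern('35');
console.log(re.test('Address_35_part_1.csv')); // true
console.log(re.test('Address_35.csv'));        // true
console.log(re.test('address_35.csv'));        // false — lowercase rejected
console.log(re.test('Addresses_35.csv'));      // false — plural rejected
```

Any file the pattern rejects is silently ignored by the scanner, so a dataset that "disappears" from the admin list is usually a renamed file rather than a missing one.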

Problem: Coordinate conversion errors

Symptoms:

  • Many locations skipped during import
  • "Converted coordinates outside Canada" warnings
  • Null latitude/longitude in database

Solutions:

  1. Verify BG_X/BG_Y values:
// Approximate range for Canada in EPSG:3347
// (false easting 6,200,000 m, false northing 3,000,000 m):
// BG_X: ~3,700,000 to 9,000,000
// BG_Y: ~650,000 to 5,200,000

console.log('BG_X:', narRecord.BG_X);  // Should be 7 digits
console.log('BG_Y:', narRecord.BG_Y);  // Should be 6-7 digits
  2. Test with a round trip through a known point:
// Toronto City Hall (43.6532 N, -79.3832 W): project to EPSG:3347 and back
const [x, y] = proj4('WGS84', 'EPSG:3347', [-79.3832, 43.6532]);
const [lng, lat] = proj4('EPSG:3347', 'WGS84', [x, y]);
console.log('Expected: 43.6532, -79.3832');
console.log('Got:', lat.toFixed(4), lng.toFixed(4));
  3. Fall back to BG_LATITUDE/BG_LONGITUDE:
// If BG_X/BG_Y missing or invalid, use lat/lng directly
if (!coords && narRecord.BG_LATITUDE && narRecord.BG_LONGITUDE) {
  coords = {
    latitude: narRecord.BG_LATITUDE,
    longitude: narRecord.BG_LONGITUDE
  };
}
  4. Check the proj4 version:
npm list proj4
# Ensure version 2.8.0+

Problem: Import very slow (> 30min for 100k records)

Symptoms:

  • Import hangs on large provinces
  • Memory usage grows over time
  • Database connection timeouts

Solutions:

  1. Increase batch size:
NAR_BATCH_SIZE=1000  # Default: 500
  2. Use streaming instead of loading all addresses:
// DON'T do this (loads all into memory):
const allAddresses = await readAllAddressFiles();

// DO this (stream and process incrementally):
for await (const addressBatch of streamAddressFiles()) {
  processBatch(addressBatch);
}
  3. Optimize database indexes (mixed-case Prisma column names need quoting in Postgres):
CREATE INDEX CONCURRENTLY idx_locations_loc_guid ON "Location"("locGuid");
CREATE INDEX CONCURRENTLY idx_addresses_addr_guid ON "Address"("addrGuid");
  4. Disable geocoding during import:
// Skip geocoding service since NAR already has coordinates
geocodeConfidence: 100,
geocodeProvider: 'NAR'
// No call to geocodingService.geocode()
  5. Use worker threads for parallel processing:
import { Worker } from 'worker_threads';

const workers = [];
for (let i = 0; i < 4; i++) {
  const worker = new Worker('./nar-import-worker.js');
  workers.push(worker);
}

Problem: Duplicate LOC_GUID errors

Symptoms:

  • Unique constraint violation on locGuid
  • Import fails mid-process
  • "Duplicate key value violates unique constraint" error

Solutions:

  1. Use UPSERT instead of INSERT:
await prisma.location.upsert({
  where: { locGuid: narRecord.LOC_GUID },
  update: { /* update fields */ },
  create: { /* create fields */ }
});
  2. Check for corrupt NAR files:
# Count unique LOC_GUIDs in the location file (column 1 per the record layout)
cut -d, -f1 Location_35.csv | sort -u | wc -l

# List duplicated LOC_GUIDs (should print nothing)
cut -d, -f1 Location_35.csv | sort | uniq -d
  3. Clean up partial imports:
-- Delete locations from failed import
DELETE FROM "Location" WHERE "geocodeProvider" = 'NAR' AND "createdAt" > '2025-02-13';
  4. Implement transaction rollback on error:
try {
  await prisma.$transaction(async (tx) => {
    // Import batch
  });
} catch (error) {
  logger.error('Batch failed, rolling back:', error);
  // Transaction automatically rolled back
}

Performance Considerations

Import Speed

Benchmarks:

Province           Records     Files   Time      Records/Second
PEI (11)           15,000      1       12s       1,250
Nova Scotia (12)   85,000      1       1m 10s    1,214
Quebec (24)        850,000     6       11m 20s   1,250
Ontario (35)       1,200,000   3       14m 30s   1,379

Factors:

  • Batch size: 500 (optimal for most systems)
  • Coordinate conversion: ~0.1ms per record
  • Database write: ~0.5ms per location (depends on disk speed)
  • Total: ~0.7ms per record (conversion + write, plus parsing/filtering overhead)
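As a sanity check, the per-record cost above roughly reproduces the benchmark throughput:

```typescript
// Back-of-envelope check: per-record cost vs. observed records/second
const msPerRecord = 0.7;                     // conversion + DB write + overhead
const recordsPerSecond = 1000 / msPerRecord; // theoretical ceiling
console.log(Math.round(recordsPerSecond));   // ~1429/s, consistent with the 1,200-1,400/s benchmarks
```

The observed rates sit slightly below this ceiling, which is expected: batch commits and filter misses add latency that the per-record figure does not capture.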

Memory Usage

Peak Memory:

  • Address map (in-memory): ~200MB per 100k records
  • CSV parser buffer: ~10MB
  • Batch buffer: ~5MB (500 records)
  • Total: ~220MB per 100k records
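The address-map figure implies roughly 2 KB per map entry, which is plausible for a small record object plus Map bookkeeping:

```typescript
// Implied per-record footprint of the in-memory address map (~200MB per 100k records)
const mapBytes = 200 * 1024 * 1024;
const perRecordBytes = mapBytes / 100_000;
console.log(Math.round(perRecordBytes)); // ~2097 bytes, i.e. about 2 KB per entry
```

This is why memory scales linearly with the filtered record count: narrowing by city or postal code prefix before loading shrinks the map proportionally.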

Optimization:

  • Stream address files instead of loading all
  • Process location file in chunks
  • Clear batch after each commit
  • Limit concurrent transactions

Database Load

Transaction Rate:

  • 1 transaction per batch (500 records)
  • ~2-3 transactions/second
  • Low database CPU (~10-20%)
  • Moderate disk I/O (sequential writes)

Connection Pool:

// prisma/schema.prisma
datasource db {
  provider = "postgresql"
  url      = env("DATABASE_URL")
}

// Prisma takes the pool size from the connection string, not the schema:
// DATABASE_URL="postgresql://...?connection_limit=10"

Backend Documentation

  • NAR Import Service: api/src/modules/map/locations/nar-import.service.ts

    • File scanning
    • Streaming CSV parser
    • Coordinate conversion
    • Batch import
  • NAR Import Routes: api/src/modules/map/locations/nar-import.routes.ts

    • Dataset discovery
    • Import job creation
    • Progress tracking
  • Locations Service: api/src/modules/map/locations/locations.service.ts

    • Location CRUD
    • Geocoding integration

Frontend Documentation

  • Locations Page: admin/src/pages/LocationsPage.tsx
    • NAR Import tab
    • Dataset selection
    • Filter configuration
    • Progress monitoring

Database Documentation

  • Location Model: api/prisma/schema.prisma

    • NAR-specific fields
    • locGuid unique constraint
    • Federal district index
  • Address Model: api/prisma/schema.prisma

    • addrGuid unique constraint
    • Location foreign key

External Resources