11 KiB
Gmail to Google Sheets Export Script - Function Documentation
Overview
This Google Apps Script automatically exports OCM (Office of City Manager) contact form submissions from Gmail to Google Sheets. The script processes emails with a specific label, parses contact form data, and creates a structured spreadsheet with real-time output and checkpoint/resume functionality.
Configuration Constants
Email Processing
LABEL_NAME: Gmail label to process ("ocm_autopull")BATCH_SIZE: Number of emails to write before updating checkpoint (10)MAX_EXECUTION_TIME: Maximum script runtime before checkpoint (5 minutes)
Spreadsheet Setup
SPREADSHEET_NAME_PREFIX: Prefix for generated spreadsheet namesMAIN_SHEET_NAME: Name of the main data sheetERROR_SHEET_NAME: Name of the error logging sheetLOG_SHEET_NAME: Name of the processing log sheet
Data Validation
REQUIRED_OCM_PHRASES: Keywords that must be present in OCM emailsREPLY_INDICATORS: Subject line patterns that indicate replies/forwardsSKIP_SENDER_DOMAINS: Email domains to skip (e.g., "edmonton.ca")EXCLUDE_COUNTRIES: Countries to exclude from full address construction
Main Functions
downloadEmailsByLabel()
Primary export function that orchestrates the entire process
Purpose: Main entry point for exporting emails by label to Google Sheets with real-time output and checkpoint/resume capability.
Process:
- Checks for existing checkpoint to determine if resuming or starting fresh
- Creates or connects to spreadsheet and initializes sheets
- Scans Gmail for all threads with the specified label
- Processes each thread's root email sequentially
- Saves checkpoints periodically and on timeout
- Formats final spreadsheet and sends completion notification
Returns: Spreadsheet URL
Key Features:
- Real-time data output (writes immediately to sheets)
- Automatic checkpoint saving every 10 emails
- Time-based checkpointing (5-minute limit)
- Automatic resume scheduling on timeout
- Progress logging and error handling
- Email notification on completion
Sheet Management Functions
initializeSheets(spreadsheet, isResume)
Creates or connects to spreadsheet sheets based on operation type
Parameters:
spreadsheet: The Google Spreadsheet objectisResume: Boolean indicating if this is a resume operation
Fresh Start Mode (isResume = false):
- Creates new sheets with unique timestamped names
- Clears any existing content
- Sets up headers and formatting
- Freezes header rows
Resume Mode (isResume = true):
- Connects to existing sheets without clearing data
- Preserves all previously processed data
- Creates missing sheets as fallback
- Logs number of existing data rows
Creates Three Sheets:
- Main Data Sheet: OCM contact form data
- Error Sheet: Processing errors and exceptions
- Log Sheet: Detailed processing activity log
getUniqueSheetNames()
Generates unique sheet names with timestamps to avoid conflicts
Returns: Object containing unique names for main, error, and log sheets
writeRowToSheet(sheet, rowData)
Writes a single row of data immediately to the specified sheet
Parameters:
sheet: Target Google Sheet objectrowData: Array of values to write
Purpose: Enables real-time output by writing each processed email immediately rather than batching.
formatSheet(sheet, totalRows)
Applies formatting to improve spreadsheet readability
Formatting Applied:
- Auto-resizes all columns
- Sets specific column widths for OCM data fields
- Applies alternating row banding (light grey)
- Formats date columns with consistent date/time format
- Optimizes layout for OCM contact form structure
Data Processing Functions
getAllThreadsForLabel(label)
Retrieves ALL email threads for a Gmail label, handling pagination
Parameters:
label: Gmail Label object
Process:
- Handles Gmail's 500-thread pagination limit
- Retrieves threads in batches with progress logging
- Includes brief pauses between batches to avoid rate limits
Returns: Array of all Gmail Thread objects
parseOCMContactForm(message, thread, threadNumber)
Parses OCM contact form data from email body using regex patterns
Parameters:
message: Gmail Message objectthread: Gmail Thread objectthreadNumber: Sequential thread number for logging
Extracted Fields:
- Confirmation Number
- Submission Date
- Personal Info: First Name, Last Name, Email, Phone, Fax
- Organization details
- Address: Street, City, Province, Country, Postal Code
- Subject and Comments
- Gmail metadata: Date, From, Subject, Thread ID, Message ID
Data Cleaning:
- Removes email quote markers and formatting artifacts
- Normalizes whitespace and removes colons
- Cleans email addresses from mailto links
- Handles empty organization and fax fields
- Strips footer text from comments
Filtering:
- Skips obvious replies/forwards based on subject indicators
- Excludes emails from specified domains (e.g., edmonton.ca)
- Processes only root emails from each thread
Error Handling:
- Returns error row with available metadata if parsing fails
- Logs detailed error information for troubleshooting
constructFullAddress(street, streetCont, city, province, country, postal)
Builds formatted full address from individual components
Address Construction Rules:
- Validates and cleans each component
- Excludes specified countries (e.g., "Canada")
- Removes field prefixes that leaked through parsing
- Joins components with commas
- Handles empty or duplicate fields gracefully
Returns: Formatted address string
Logging and Error Handling
logMessage(action, threadNumber, message, details)
Logs processing activities to the log sheet and console
Parameters:
action: Type of action (START, PROCESS, SUCCESS, ERROR, etc.)threadNumber: Current thread numbermessage: Primary log messagedetails: Additional details (optional)
Log Entry Format:
- Timestamp
- Thread Number
- Action Type
- Message with Details
- Status (SUCCESS/ERROR)
logError(threadNumber, errorType, errorDetails, emailDate, emailFrom)
Logs errors to the dedicated error sheet
Error Entry Format:
- Thread Number
- Error Type
- Detailed Error Description
- Email Date and Sender
- Error Timestamp
Common Error Types:
- Parsing Error: Failed to extract form data
- Thread Processing Error: General thread handling failure
- Sheet Writing Error: Failed to write to spreadsheet
Checkpoint and Resume System
saveCheckpoint(checkpointData)
Saves current progress to Google Apps Script Properties
Checkpoint Data Includes:
- Spreadsheet ID
- Label name
- Total thread count
- Number of threads processed
- Number of emails found
- Start timestamp
getCheckpoint()
Retrieves saved checkpoint data
Returns: Checkpoint object or null if no checkpoint exists
clearCheckpoint()
Removes checkpoint data when export completes
scheduleResume()
Creates time-based trigger to automatically resume processing
Process:
- Deletes any existing resume triggers
- Creates new trigger to run
resumeExport()after specified delay - Default delay: 1 minute
resumeExport()
Triggered function that resumes export from checkpoint
Process:
- Cleans up the trigger that called it
- Calls
downloadEmailsByLabel()to continue processing
Testing and Utility Functions
testLabelCount()
Comprehensive readiness test - RUN THIS FIRST
Test Sequence:
- Label Existence: Verifies the specified Gmail label exists
- Thread Count: Gets total number of threads with the label
- Message Analysis: Analyzes sample threads for size and structure
- Parsing Test: Tests email parsing on sample messages
Output:
- Detailed test results and statistics
- Available Gmail labels if target label not found
- Thread size analysis (min/max/average messages per thread)
- Parsing success rate on sample data
testSmallSample()
Processes 3 sample emails with real-time output - RUN THIS SECOND
Purpose:
- Validates parsing logic with real data
- Tests spreadsheet creation and formatting
- Verifies real-time output functionality
- Creates test spreadsheet for review
Process:
- Creates test spreadsheet with all sheets
- Processes up to 3 labeled emails
- Applies full formatting
- Returns test spreadsheet URL
checkCheckpointStatus()
Displays current checkpoint information
Shows:
- Spreadsheet ID and URL
- Processing progress (threads/emails)
- Start time and label name
- Resume instructions
clearCheckpointAndTriggers()
Emergency function to clear stuck checkpoints and triggers
Use Cases:
- Stuck or corrupted checkpoint data
- Multiple resume triggers created
- Starting completely fresh export
getCurrentSpreadsheetUrl()
Gets URL of currently active export spreadsheet
Returns: Spreadsheet URL or null if no active export
Data Structure
Main Sheet Columns
- Thread Number: Sequential processing number
- Confirmation Number: OCM form confirmation ID
- Submission Date: Date form was submitted
- First Name: Submitter's first name
- Last Name: Submitter's last name
- Email Address: Contact email
- Phone Number: Contact phone
- Fax: Fax number (often empty)
- Organization: Organization name (often empty)
- Street Address: Primary street address
- Street Address (cont.): Address continuation
- City: City name
- Province: Province/state
- Country: Country name
- Postal Code: Postal/ZIP code
- Full Address: Constructed complete address
- Subject: Form subject line
- Comments: Form comments/message
- Gmail Date: Email received date
- Gmail From: Email sender
- Gmail Subject: Email subject line
- Thread ID: Gmail thread identifier
- Email ID: Gmail message identifier
Usage Instructions
Initial Setup
- Update
LABEL_NAMEconstant with your Gmail label - Run
testLabelCount()to verify setup - Run
testSmallSample()to test parsing - Review test results before full export
Full Export
- Run
downloadEmailsByLabel()for complete export - Monitor console logs for progress
- Export automatically resumes if interrupted
- Check email for completion notification
Troubleshooting
- Use
checkCheckpointStatus()to view progress - Use
clearCheckpointAndTriggers()to reset if stuck - Check error and log sheets for detailed information
- Verify Gmail label exists and contains expected emails
Performance Notes
- Processes ~10-20 emails per minute depending on content
- Automatically checkpoints every 5 minutes
- Resumes automatically with 1-minute delay
- Real-time output shows immediate results
- Handles large exports (1000+ emails) through checkpointing