Media Job Queue System
Overview
The Media Job Queue System provides asynchronous background processing for CPU- and GPU-intensive video operations. Built on a custom job queue with resource-aware scheduling, it handles everything from directory scanning to AI-powered video analysis while maintaining system stability through resource-category management.
Key Features:
- Resource Categories — Jobs classified by resource needs (CPU, GPU encode, GPU AI)
- Priority Scheduling — High-priority jobs processed first within same category
- Job Types — 15+ job types (compilation, encoding, digest generation, scene extraction, etc.)
- Progress Tracking — Real-time progress updates (0-100%)
- Status Management — Pending → Queued → Running → Completed/Failed lifecycle
- Retry Logic — Failed jobs can be retried with exponential backoff
- Detailed Logging — Execution logs for debugging and audit trail
- Queue Management — Pause, resume, cancel, and prioritize jobs
- VRAM Awareness — Prevents GPU memory exhaustion by tracking VRAM requirements
Access Control:
- Job viewing/management requires `SUPER_ADMIN` role
- Job creation can be triggered by admins or automated workflows
Technology Stack:
- Database Queue — PostgreSQL-backed job queue (no BullMQ for media)
- Worker Process — Node.js worker polling the queue every 5 seconds
- FFmpeg — Video encoding and compilation
- AI Integration — Future support for scene detection and auto-tagging
Architecture
flowchart TB
subgraph "Job Creation"
A1[Admin Action]
A2[Automated Trigger]
A3[Scheduled Task]
end
subgraph "Job Queue (PostgreSQL)"
Q1[Pending Jobs]
Q2[Queued Jobs]
Q3[Running Jobs]
Q4[Completed/Failed Jobs]
end
subgraph "Worker Process"
W1[Job Poller<br/>Every 5s]
W2[Resource Checker]
W3[Job Executor]
W4[Progress Updater]
end
subgraph "Processors"
P1[CPU Jobs<br/>scan, validate]
P2[GPU Encode<br/>reencode, compile]
P3[GPU AI<br/>digest, tag, scene]
end
subgraph "Results"
R1[Video Records Updated]
R2[New Files Created]
R3[Logs Written]
end
A1 --> Q1
A2 --> Q1
A3 --> Q1
Q1 --> W1
W1 --> W2
W2 -->|Check Resources| Q2
Q2 --> W3
W3 --> P1
W3 --> P2
W3 --> P3
W3 --> W4
W4 --> Q3
P1 --> R1
P2 --> R2
P3 --> R3
Q3 --> Q4
style Q1 fill:#f9f
style Q3 fill:#ff9
style Q4 fill:#9f9
Workflow:
- Job Creation — Admin clicks "Re-encode" button, API creates job record
- Queue Polling — Worker checks for pending jobs every 5 seconds
- Resource Check — Worker verifies sufficient VRAM/CPU available
- Job Execution — Worker runs appropriate processor (FFmpeg, AI script, etc.)
- Progress Updates — Worker updates job progress every ~5% completion
- Completion — Worker marks job complete and logs results
- Retry on Failure — Failed jobs can be retried with exponential backoff
Database Model
Jobs Table Schema
// api/src/modules/media/db/schema.ts
export const jobs = pgTable('jobs', {
id: uuid('id').primaryKey().defaultRandom(),
// Job Definition
type: text('type').notNull(), // JobType enum: compilation, scan, reencode, etc.
status: text('status').notNull().default('pending'), // JobStatus enum
params: jsonb('params').$type<Record<string, any>>().notNull(), // Job-specific parameters
// Progress Tracking
progress: integer('progress').default(0), // 0-100
log: text('log').default(''), // Execution log (append-only)
// Scheduling
priority: integer('priority').default(5), // 1 (highest) - 10 (lowest)
queuePosition: integer('queue_position'), // Position in queue
waitingReason: text('waiting_reason'), // Why job is waiting (e.g., "Insufficient VRAM")
// Resource Management
resourceCategory: text('resource_category').notNull(), // cpu|gpu_encode|gpu_ai
vramRequired: integer('vram_required').default(0), // MB of VRAM needed
// Timing
createdAt: timestamp('created_at').defaultNow(),
startedAt: timestamp('started_at'),
completedAt: timestamp('completed_at'),
// Retry Logic
retryCount: integer('retry_count').default(0),
maxRetries: integer('max_retries').default(3),
retryAfter: timestamp('retry_after'), // Don't retry before this time
});
Job Types Enum
| Type | Resource Category | VRAM (MB) | Description |
|---|---|---|---|
| `scan` | cpu | 0 | Scan directory for new videos |
| `public_scan` | cpu | 0 | Scan public gallery directory |
| `validate` | cpu | 0 | Validate video metadata (FFprobe) |
| `reencode_streaming` | gpu_encode | 4000 | Re-encode for web playback (H.264) |
| `compile_random` | gpu_encode | 2000 | Random video compilation |
| `compile_quad` | gpu_encode | 4000 | 4-up grid compilation |
| `compile_mega` | gpu_encode | 6000 | Large multi-video compilation |
| `compile_gif` | cpu | 0 | Create GIF from video |
| `digest_generate` | gpu_ai | 8000 | AI-powered video digest |
| `clip_generate` | gpu_ai | 6000 | Extract clips from digest |
| `highlight_generate` | gpu_ai | 8000 | Create highlight reel |
| `tag_generation` | gpu_ai | 6000 | AI auto-tagging |
| `scene_extract` | gpu_ai | 8000 | Scene detection and extraction |
| `thumbnail_generate` | cpu | 0 | Generate thumbnail from video |
| `move_to_library` | cpu | 0 | Move video from inbox to target directory |
Job Status Enum
| Status | Description | Final State |
|---|---|---|
| `pending` | Waiting to be picked up by worker | No |
| `queued` | Selected by worker, waiting for resources | No |
| `running` | Currently executing | No |
| `completed` | Finished successfully | Yes |
| `failed` | Execution failed (see log for details) | Yes |
| `cancelled` | Manually cancelled by admin | Yes |
| `paused` | Temporarily paused (can be resumed) | No |
Resource Categories
| Category | Typical VRAM | Concurrent Limit | Use Cases |
|---|---|---|---|
| `cpu` | 0 MB | 5 | Scanning, validation, simple encodes, GIF creation |
| `gpu_encode` | 2-6 GB | 2 | Video re-encoding, compilation, format conversion |
| `gpu_ai` | 6-12 GB | 1 | AI tagging, scene detection, digest generation, highlight extraction |
VRAM Management:
Worker tracks total VRAM usage across running jobs:
const runningJobs = await db.select().from(jobs).where(eq(jobs.status, 'running'));
const totalVramUsed = runningJobs.reduce((sum, job) => sum + (job.vramRequired || 0), 0);
// Only start new job if VRAM available
const TOTAL_VRAM = 16000; // 16GB GPU
if (totalVramUsed + newJob.vramRequired <= TOTAL_VRAM) {
startJob(newJob);
}
API Endpoints
All endpoints require SUPER_ADMIN role.
List Jobs
GET /api/media/jobs
Query Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| `page` | number | 1 | Page number |
| `limit` | number | 20 | Results per page |
| `status` | string | - | Filter by status (pending, running, completed, failed) |
| `type` | string | - | Filter by job type |
| `resourceCategory` | string | - | Filter by resource category |
Response:
{
"data": [
{
"id": "550e8400-e29b-41d4-a716-446655440000",
"type": "reencode_streaming",
"status": "running",
"progress": 45,
"resourceCategory": "gpu_encode",
"vramRequired": 4000,
"priority": 5,
"params": {
"videoId": "660e8400-e29b-41d4-a716-446655440001",
"targetBitrate": 2000
},
"startedAt": "2026-02-13T10:30:00Z",
"createdAt": "2026-02-13T10:25:00Z"
}
],
"pagination": {
"page": 1,
"limit": 20,
"total": 156,
"totalPages": 8
}
}
Get Job Details
GET /api/media/jobs/:id
Response:
{
"id": "550e8400-e29b-41d4-a716-446655440000",
"type": "reencode_streaming",
"status": "completed",
"progress": 100,
"log": "Starting re-encode...\nFFmpeg command: ffmpeg -i input.mp4 -c:v h264 -preset medium -crf 23 output.mp4\nProgress: 25%\nProgress: 50%\nProgress: 75%\nProgress: 100%\nCompleted successfully",
"params": {
"videoId": "660e8400-e29b-41d4-a716-446655440001",
"inputPath": "inbox/original.mp4",
"outputPath": "playback/encoded.mp4",
"targetBitrate": 2000
},
"resourceCategory": "gpu_encode",
"vramRequired": 4000,
"priority": 5,
"retryCount": 0,
"maxRetries": 3,
"createdAt": "2026-02-13T10:25:00Z",
"startedAt": "2026-02-13T10:30:00Z",
"completedAt": "2026-02-13T10:45:00Z"
}
Create Job
POST /api/media/jobs
Request Body:
{
"type": "reencode_streaming",
"params": {
"videoId": "660e8400-e29b-41d4-a716-446655440001",
"targetBitrate": 2000
},
"priority": 5,
"resourceCategory": "gpu_encode",
"vramRequired": 4000
}
Response:
{
"id": "770e8400-e29b-41d4-a716-446655440002",
"type": "reencode_streaming",
"status": "pending",
"progress": 0,
"createdAt": "2026-02-13T11:00:00Z"
}
Retry Failed Job
POST /api/media/jobs/:id/retry
Response:
{
"id": "550e8400-e29b-41d4-a716-446655440000",
"status": "pending",
"retryCount": 1,
"retryAfter": null,
"log": "Starting re-encode...\n[Previous logs...]\n--- RETRY ATTEMPT 1 ---\n"
}
Retry Logic:
- Failed jobs can be retried up to `maxRetries` times (default: 3)
- Exponential backoff: wait `2^retryCount` minutes before retry
- Retry resets status to `pending` and appends a retry marker to the log
Cancel Job
POST /api/media/jobs/:id/cancel
Response:
{
"id": "550e8400-e29b-41d4-a716-446655440000",
"status": "cancelled",
"log": "Starting re-encode...\nProgress: 25%\n--- JOB CANCELLED BY ADMIN ---"
}
Notes:
- Running jobs cannot be cancelled immediately (worker must finish current chunk)
- Pending/queued jobs cancelled instantly
Pause/Resume Job
POST /api/media/jobs/:id/pause
POST /api/media/jobs/:id/resume
Pause Response:
{
"id": "550e8400-e29b-41d4-a716-446655440000",
"status": "paused"
}
Resume Response:
{
"id": "550e8400-e29b-41d4-a716-446655440000",
"status": "pending"
}
Queue Statistics
GET /api/media/jobs/stats
Response:
{
"pending": 12,
"queued": 2,
"running": 3,
"completed": 1458,
"failed": 23,
"paused": 1,
"totalVramUsed": 12000,
"totalVramAvailable": 16000,
"averageProcessingTime": 245,
"jobsByType": {
"reencode_streaming": 45,
"scan": 8,
"compile_random": 12
}
}
Admin Workflow
Viewing Job Queue
- Navigate to Media → Jobs in admin sidebar
- Table displays all jobs with:
- Job type icon
- Status badge (color-coded)
- Progress bar
- Priority indicator
- Resource category
- Created/started/completed times
- Use filters at top:
- Status dropdown (All / Pending / Running / Completed / Failed)
- Type dropdown (job type)
- Resource dropdown (CPU / GPU Encode / GPU AI)
Creating Jobs Manually
Option 1: From Library Page
- Select video in library table
- Click "Actions" dropdown
- Select action:
- "Re-encode for Streaming"
- "Generate Thumbnail"
- "Validate Metadata"
- "Move to Directory"
- Confirm job creation
- Redirected to Jobs page showing new job
Option 2: From Jobs Page
- Click "Create Job" button
- Modal opens with form:
- Type dropdown (15+ job types)
- Video selector (search by title/filename)
- Priority slider (1-10)
- Parameters JSON editor (advanced)
- Click "Create"
- Job appears in pending queue
Monitoring Job Progress
Real-Time Updates:
- Jobs page polls API every 2 seconds for running jobs
- Progress bars update smoothly (0-100%)
- Status badges change color:
- Grey: Pending
- Blue: Queued
- Yellow: Running
- Green: Completed
- Red: Failed
Detailed Logs:
- Click job row to expand details panel
- View execution log in monospace text area
- Log updates in real-time while job running
- Example log output:
[2026-02-13 10:30:15] Starting re-encode job
[2026-02-13 10:30:16] Input: /media/local/inbox/original.mp4
[2026-02-13 10:30:16] Output: /media/local/playback/encoded.mp4
[2026-02-13 10:30:17] FFmpeg command: ffmpeg -i /media/local/inbox/original.mp4 -c:v libx264 -preset medium -crf 23 -c:a aac -b:a 128k /media/local/playback/encoded.mp4
[2026-02-13 10:30:20] Progress: 5%
[2026-02-13 10:30:25] Progress: 15%
[2026-02-13 10:30:30] Progress: 25%
...
[2026-02-13 10:45:00] Progress: 100%
[2026-02-13 10:45:01] Re-encode completed successfully
[2026-02-13 10:45:02] Output file size: 25.3 MB
Retrying Failed Jobs
- Filter for Failed jobs
- Click job row to view error log
- Identify failure reason (e.g., "FFmpeg error: codec not supported")
- Fix underlying issue (install codec, fix file path, etc.)
- Click "Retry" button
- Job resets to pending status
- Worker picks up job again
Auto-Retry:
Jobs automatically retry up to 3 times with exponential backoff:
- 1st retry: after 2 minutes
- 2nd retry: after 4 minutes
- 3rd retry: after 8 minutes
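That schedule can be expressed as a small helper, a sketch assuming the delay is computed from the 1-based retry attempt number:

```typescript
// Sketch: exponential backoff delay for the Nth retry attempt (1-based).
// Matches the documented schedule: attempt 1 -> 2 min, 2 -> 4 min, 3 -> 8 min.
function retryDelayMs(attempt: number): number {
  return Math.pow(2, attempt) * 60 * 1000;
}

// When the job becomes eligible to run again.
function retryAfter(now: Date, attempt: number): Date {
  return new Date(now.getTime() + retryDelayMs(attempt));
}
```

The worker stores the result in the job's `retryAfter` column so the polling loop can skip jobs still inside their backoff window.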
Cancelling Jobs
- Find job in pending/queued/running state
- Click "Cancel" button
- Confirm cancellation dialog
- Job marked as cancelled
- If running, worker stops after current chunk completes
Pausing/Resuming Jobs
Use Case: Temporarily stop low-priority jobs to free resources for urgent tasks
- Select low-priority pending job
- Click "Pause" button
- Job status changes to paused (greyed out)
- Worker skips paused jobs
- When ready, click "Resume"
- Job returns to pending queue
Job Type Details
Scan Jobs (scan, public_scan)
Purpose: Scan filesystem directory for new videos and create database records
Parameters:
{
"directoryType": "videos",
"skipExisting": true
}
Process:
- Read directory `/media/local/library/{directoryType}/`
- Filter for video extensions (`.mp4`, `.mov`, etc.)
- Check each file against database (by path)
- Create records for new files
- Run FFprobe on new files
- Update progress: files processed / total files
Typical Duration: 2-30 seconds (depends on file count)
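The filtering and progress steps above reduce to simple pure functions; a sketch (the extension list is an assumption — the actual worker may accept more formats):

```typescript
// Sketch: extension filter used when scanning a directory for videos.
const VIDEO_EXTENSIONS = new Set(['.mp4', '.mov', '.mkv', '.webm']);

function isVideoFile(filename: string): boolean {
  const dot = filename.lastIndexOf('.');
  if (dot < 0) return false;
  return VIDEO_EXTENSIONS.has(filename.slice(dot).toLowerCase());
}

// Progress is files processed / total files, reported as 0-100.
function scanProgress(processed: number, total: number): number {
  if (total === 0) return 100; // empty directory: nothing to do
  return Math.min(100, Math.round((processed / total) * 100));
}
```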
Validation Jobs (validate)
Purpose: Re-run FFprobe to refresh video metadata
Parameters:
{
"videoId": "660e8400-e29b-41d4-a716-446655440001"
}
Process:
- Fetch video record from database
- Build full file path
- Run FFprobe extraction
- Update database with fresh metadata
- Mark video as valid/invalid based on result
Typical Duration: 100-500ms per video
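The FFprobe step produces JSON that the job maps onto the video record. A sketch of that mapping, assuming `ffprobe -print_format json -show_format -show_streams` output; the field selection is illustrative, not the exact database schema:

```typescript
// Sketch: map ffprobe JSON output to the metadata a validate job might store.
interface VideoMetadata {
  durationSec: number;
  width: number;
  height: number;
  codec: string;
  valid: boolean;
}

const INVALID: VideoMetadata = { durationSec: 0, width: 0, height: 0, codec: '', valid: false };

function parseFfprobe(json: string): VideoMetadata {
  try {
    const data = JSON.parse(json);
    // ffprobe lists one entry per stream; pick the first video stream.
    const video = (data.streams ?? []).find((s: any) => s.codec_type === 'video');
    if (!video) return INVALID;
    return {
      durationSec: parseFloat(data.format?.duration ?? '0'),
      width: video.width ?? 0,
      height: video.height ?? 0,
      codec: video.codec_name ?? '',
      valid: true,
    };
  } catch {
    return INVALID; // unparseable output -> mark video invalid
  }
}
```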
Re-encode Jobs (reencode_streaming)
Purpose: Convert video to web-optimized format (H.264, web-friendly profile)
Parameters:
{
"videoId": "660e8400-e29b-41d4-a716-446655440001",
"targetBitrate": 2000,
"preset": "medium",
"crf": 23
}
FFmpeg Command:
ffmpeg -i /media/local/inbox/original.mp4 \
-c:v libx264 \
-preset medium \
-crf 23 \
-maxrate 2000k \
-bufsize 4000k \
-c:a aac \
-b:a 128k \
-movflags +faststart \
/media/local/playback/encoded.mp4
Process:
- Validate input file exists
- Build FFmpeg command
- Start encoding process
- Parse FFmpeg progress output
- Update job progress every ~5%
- Create new video record for encoded file
- Update original video's `reencodeJobId` reference
Typical Duration: 5-30 minutes (depends on video length and resolution)
Compilation Jobs (compile_random, compile_quad, compile_mega)
Purpose: Merge multiple videos into single compilation
Parameters (Random):
{
"count": 10,
"minDuration": 30,
"maxDuration": 120,
"orientation": "landscape",
"outputPath": "compilations/random-001.mp4"
}
Process:
- Query database for videos matching criteria (orientation, duration range)
- Randomly select `count` videos
- Build FFmpeg concat demuxer file list
- Run FFmpeg compilation
- Create new video record for compilation
- Update progress based on FFmpeg output
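The "Build FFmpeg concat demuxer file list" step above produces a plain text file that FFmpeg consumes with `-f concat -safe 0 -i list.txt`. A minimal sketch of building it (the quoting follows the concat demuxer's single-quote escaping rule):

```typescript
// Sketch: build the file list consumed by FFmpeg's concat demuxer, e.g.
//   ffmpeg -f concat -safe 0 -i list.txt -c copy output.mp4
// A single quote inside a path is escaped as '\'' (close, escaped quote, reopen).
function buildConcatList(paths: string[]): string {
  return paths
    .map((p) => `file '${p.replace(/'/g, `'\\''`)}'`)
    .join('\n');
}
```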
Quad Compilation (4-up grid):
ffmpeg -i video1.mp4 -i video2.mp4 -i video3.mp4 -i video4.mp4 \
-filter_complex "[0:v][1:v][2:v][3:v]xstack=inputs=4:layout=0_0|w0_0|0_h0|w0_h0[v]" \
-map "[v]" \
output.mp4
Typical Duration: 10-60 minutes
Digest Generation (digest_generate)
Purpose: AI-powered video digest creation (future feature)
Parameters:
{
"videoId": "660e8400-e29b-41d4-a716-446655440001",
"targetLength": 60,
"includeHighlights": true
}
Process (Planned):
- Extract frames at 1 FPS
- Run AI scene detection
- Identify highlights (action, faces, motion)
- Select best segments totaling target length
- Compile segments into digest video
GPU AI Required: 8GB VRAM
Thumbnail Generation (thumbnail_generate)
Purpose: Extract thumbnail image from video
Parameters:
{
"videoId": "660e8400-e29b-41d4-a716-446655440001",
"timestamp": 5,
"width": 640
}
FFmpeg Command:
ffmpeg -i /media/local/library/videos/sample.mp4 \
-ss 00:00:05 \
-vframes 1 \
-vf scale=640:-1 \
/media/local/thumbnails/sample.jpg
Process:
- Seek to timestamp (default: 25% into video)
- Extract single frame
- Scale to width (preserve aspect ratio)
- Save as JPEG
- Update video record with `thumbnailPath`
Typical Duration: 1-5 seconds
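The default seek point ("25% into video") and the `-ss` timestamp formatting can be sketched as follows (helper names are illustrative, not the actual implementation):

```typescript
// Sketch: choose the seek point for thumbnail extraction.
// Default (no explicit timestamp param) is 25% into the video, per the process above.
function thumbnailSeekSeconds(durationSec: number, timestamp?: number): number {
  return timestamp ?? durationSec * 0.25;
}

// Format seconds as HH:MM:SS for FFmpeg's -ss flag.
function toFfmpegTime(totalSeconds: number): string {
  const s = Math.floor(totalSeconds);
  const pad = (n: number) => String(n).padStart(2, '0');
  return `${pad(Math.floor(s / 3600))}:${pad(Math.floor((s % 3600) / 60))}:${pad(s % 60)}`;
}
```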
Code Examples
Create Re-encode Job
// api/src/modules/media/routes/jobs.routes.ts
import { eq } from 'drizzle-orm';
import { db } from '@/modules/media/db';
import { jobs, videos } from '@/modules/media/db/schema';
app.post('/api/media/jobs/reencode', async (req, reply) => {
const { videoId, targetBitrate = 2000, preset = 'medium', crf = 23 } = req.body;
// Fetch video
const [video] = await db
.select()
.from(videos)
.where(eq(videos.id, videoId))
.limit(1);
if (!video) {
return reply.code(404).send({ error: 'Video not found' });
}
// Create job
const [job] = await db
.insert(jobs)
.values({
type: 'reencode_streaming',
status: 'pending',
params: {
videoId,
inputPath: video.path,
outputPath: `playback/${video.filename}`,
targetBitrate,
preset,
crf,
},
resourceCategory: 'gpu_encode',
vramRequired: 4000,
priority: 5,
})
.returning();
reply.send(job);
});
Job Worker (Polling Loop)
// api/src/modules/media/services/job-worker.service.ts
import path from 'node:path';
import { exec } from 'node:child_process';
import { promisify } from 'node:util';
import { db } from '@/modules/media/db';
import { jobs } from '@/modules/media/db/schema';
import { eq, sql } from 'drizzle-orm';
const execAsync = promisify(exec);
export class JobWorkerService {
private polling = false;
async start() {
this.polling = true;
console.log('Job worker started');
while (this.polling) {
try {
await this.processNextJob();
} catch (error) {
console.error('Job worker error:', error);
}
// Wait 5 seconds before next poll
await new Promise((resolve) => setTimeout(resolve, 5000));
}
}
async stop() {
this.polling = false;
console.log('Job worker stopped');
}
private async processNextJob() {
// Find next pending job (highest priority first)
const [job] = await db
.select()
.from(jobs)
.where(eq(jobs.status, 'pending'))
.orderBy(jobs.priority, jobs.createdAt)
.limit(1);
if (!job) {
return; // No jobs in queue
}
// Honor retryAfter: skip jobs still inside their backoff window
// (a fuller implementation would filter retryAfter in the SQL query)
if (job.retryAfter && job.retryAfter > new Date()) {
return;
}
// Check resource availability
const canRun = await this.checkResources(job);
if (!canRun) {
// Update waiting reason
await db
.update(jobs)
.set({ waitingReason: 'Insufficient resources' })
.where(eq(jobs.id, job.id));
return;
}
// Start job
await this.executeJob(job);
}
private async checkResources(job: any): Promise<boolean> {
// Get running jobs
const runningJobs = await db
.select()
.from(jobs)
.where(eq(jobs.status, 'running'));
// Calculate total VRAM used
const totalVramUsed = runningJobs.reduce(
(sum, j) => sum + (j.vramRequired || 0),
0
);
const TOTAL_VRAM = 16000; // 16GB GPU
const available = TOTAL_VRAM - totalVramUsed;
if (job.vramRequired && job.vramRequired > available) {
return false; // Not enough VRAM
}
// Check concurrent job limits by category
const categoryCount = runningJobs.filter(
(j) => j.resourceCategory === job.resourceCategory
).length;
const limits = {
cpu: 5,
gpu_encode: 2,
gpu_ai: 1,
};
if (categoryCount >= limits[job.resourceCategory as keyof typeof limits]) {
return false; // Category limit reached
}
return true; // Resources available
}
private async executeJob(job: any) {
// Mark as running
await db
.update(jobs)
.set({
status: 'running',
startedAt: new Date(),
waitingReason: null,
})
.where(eq(jobs.id, job.id));
try {
// Execute job based on type
switch (job.type) {
case 'reencode_streaming':
await this.executeReencode(job);
break;
case 'scan':
await this.executeScan(job);
break;
case 'thumbnail_generate':
await this.executeThumbnail(job);
break;
// ... other job types
}
// Mark as completed
await db
.update(jobs)
.set({
status: 'completed',
progress: 100,
completedAt: new Date(),
})
.where(eq(jobs.id, job.id));
} catch (error: any) {
// Mark as failed
await db
.update(jobs)
.set({
status: 'failed',
log: (job.log || '') + `\n\n--- ERROR ---\n${error.message}`,
})
.where(eq(jobs.id, job.id));
// Schedule retry if under max retries
if (job.retryCount < job.maxRetries) {
const retryDelay = Math.pow(2, job.retryCount + 1) * 60 * 1000; // Exponential backoff: 2, 4, 8 minutes
await db
.update(jobs)
.set({
status: 'pending',
retryCount: job.retryCount + 1,
retryAfter: new Date(Date.now() + retryDelay),
})
.where(eq(jobs.id, job.id));
}
}
}
private async executeReencode(job: any) {
const { inputPath, outputPath, targetBitrate, preset, crf } = job.params;
const inputFull = path.join(process.env.MEDIA_LIBRARY_PATH!, inputPath);
const outputFull = path.join(process.env.MEDIA_LIBRARY_PATH!, outputPath);
const command = `ffmpeg -i "${inputFull}" -c:v libx264 -preset ${preset} -crf ${crf} -maxrate ${targetBitrate}k -bufsize ${targetBitrate * 2}k -c:a aac -b:a 128k -movflags +faststart "${outputFull}"`;
await this.appendLog(job.id, `Starting re-encode\nCommand: ${command}`);
// Execute FFmpeg (simplified - real implementation uses spawn for progress parsing)
await execAsync(command);
await this.appendLog(job.id, 'Re-encode completed successfully');
}
private async appendLog(jobId: string, message: string) {
const timestamp = new Date().toISOString();
const logEntry = `[${timestamp}] ${message}`;
await db
.update(jobs)
.set({
log: sql`${jobs.log} || E'\n' || ${logEntry}`,
})
.where(eq(jobs.id, jobId));
}
}
// Start worker
export const jobWorker = new JobWorkerService();
jobWorker.start();
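The `executeReencode` method above shells out with `execAsync` for brevity; as its comment notes, the real implementation spawns FFmpeg and parses its output to drive progress updates. One way to sketch the parsing step (the regex and rounding are assumptions):

```typescript
// Sketch: derive job progress from FFmpeg's stderr, which periodically emits
// lines containing "time=HH:MM:SS.cc". Progress = encoded time / total duration.
// Returns null when the chunk carries no progress information.
function parseFfmpegProgress(stderrChunk: string, totalDurationSec: number): number | null {
  const match = stderrChunk.match(/time=(\d+):(\d+):(\d+(?:\.\d+)?)/);
  if (!match || totalDurationSec <= 0) return null;
  const seconds =
    parseInt(match[1], 10) * 3600 + parseInt(match[2], 10) * 60 + parseFloat(match[3]);
  return Math.min(100, Math.round((seconds / totalDurationSec) * 100));
}
```

The worker would call this on each stderr chunk from the spawned process and persist the value only when it crosses a ~5% boundary, matching the update cadence described earlier.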
Frontend: Jobs Page
// admin/src/pages/media/MediaJobsPage.tsx
import { Table, Tag, Progress, Button, Space, Select, message } from 'antd';
import { useEffect, useState } from 'react';
import { mediaApi } from '@/lib/media-api';
export default function MediaJobsPage() {
const [jobs, setJobs] = useState([]);
const [loading, setLoading] = useState(false);
const [filter, setFilter] = useState({ status: undefined, type: undefined });
const [polling, setPolling] = useState(true);
const fetchJobs = async () => {
setLoading(true);
try {
const { data } = await mediaApi.get('/api/media/jobs', {
params: filter,
});
setJobs(data.data);
} catch (error) {
console.error('Failed to fetch jobs:', error);
} finally {
setLoading(false);
}
};
useEffect(() => {
fetchJobs();
}, [filter]);
// Poll for running jobs every 2 seconds
useEffect(() => {
if (!polling) return;
const interval = setInterval(() => {
const hasRunning = jobs.some((j: any) => j.status === 'running');
if (hasRunning) {
fetchJobs();
}
}, 2000);
return () => clearInterval(interval);
}, [polling, jobs]);
const handleRetry = async (id: string) => {
try {
await mediaApi.post(`/api/media/jobs/${id}/retry`);
message.success('Job queued for retry');
fetchJobs();
} catch (error) {
message.error('Retry failed');
}
};
const handleCancel = async (id: string) => {
try {
await mediaApi.post(`/api/media/jobs/${id}/cancel`);
message.success('Job cancelled');
fetchJobs();
} catch (error) {
message.error('Cancel failed');
}
};
const statusColors: Record<string, string> = {
pending: 'default',
queued: 'blue',
running: 'processing',
completed: 'success',
failed: 'error',
cancelled: 'default',
paused: 'warning',
};
const columns = [
{
title: 'Type',
dataIndex: 'type',
width: 150,
render: (type: string) => <span style={{ fontFamily: 'monospace' }}>{type}</span>,
},
{
title: 'Status',
dataIndex: 'status',
width: 100,
render: (status: string) => <Tag color={statusColors[status]}>{status.toUpperCase()}</Tag>,
},
{
title: 'Progress',
dataIndex: 'progress',
width: 150,
render: (progress: number, record: any) => (
record.status === 'running' ? (
<Progress percent={progress} size="small" status="active" />
) : record.status === 'completed' ? (
<Progress percent={100} size="small" status="success" />
) : record.status === 'failed' ? (
<Progress percent={progress} size="small" status="exception" />
) : (
<Progress percent={progress} size="small" />
)
),
},
{
title: 'Resource',
dataIndex: 'resourceCategory',
width: 120,
},
{
title: 'Priority',
dataIndex: 'priority',
width: 80,
render: (priority: number) => (
<Tag color={priority <= 3 ? 'red' : priority <= 6 ? 'orange' : 'default'}>
{priority}
</Tag>
),
},
{
title: 'Created',
dataIndex: 'createdAt',
width: 150,
render: (date: string) => new Date(date).toLocaleString(),
},
{
title: 'Actions',
width: 200,
render: (_: any, record: any) => (
<Space>
{record.status === 'failed' && (
<Button size="small" onClick={() => handleRetry(record.id)}>
Retry
</Button>
)}
{['pending', 'queued', 'running'].includes(record.status) && (
<Button size="small" danger onClick={() => handleCancel(record.id)}>
Cancel
</Button>
)}
<Button size="small" onClick={() => window.open(`/app/media/jobs/${record.id}`, '_blank')}>
View Log
</Button>
</Space>
),
},
];
return (
<div>
<Space style={{ marginBottom: 16 }}>
<Select
placeholder="Filter by status"
style={{ width: 150 }}
onChange={(value) => setFilter({ ...filter, status: value })}
allowClear
>
<Select.Option value="pending">Pending</Select.Option>
<Select.Option value="running">Running</Select.Option>
<Select.Option value="completed">Completed</Select.Option>
<Select.Option value="failed">Failed</Select.Option>
</Select>
<Select
placeholder="Filter by type"
style={{ width: 200 }}
onChange={(value) => setFilter({ ...filter, type: value })}
allowClear
>
<Select.Option value="scan">Scan</Select.Option>
<Select.Option value="reencode_streaming">Re-encode</Select.Option>
<Select.Option value="compile_random">Compilation</Select.Option>
</Select>
<Button onClick={() => setPolling(!polling)}>
{polling ? 'Stop Auto-Refresh' : 'Start Auto-Refresh'}
</Button>
</Space>
<Table
columns={columns}
dataSource={jobs}
loading={loading}
rowKey="id"
pagination={{ pageSize: 20 }}
/>
</div>
);
}
Troubleshooting
Problem: Jobs Stuck in Pending
Symptoms:
- Jobs created but never start
- Status remains "pending" for hours
- No "running" jobs visible
Solutions:
- Check worker process running:
docker compose ps media-api
# Should show "Up" status
docker compose logs media-api | grep "Job worker"
# Should show "Job worker started"
- Manually trigger worker:
# Restart media-api container
docker compose restart media-api
# Worker starts automatically on container boot
- Check worker logs for errors:
docker compose logs -f media-api | grep ERROR
# Look for database connection errors, permission issues
- Verify database connection:
# Test database accessible from container
docker compose exec media-api psql $DATABASE_URL -c "SELECT COUNT(*) FROM jobs WHERE status='pending';"
Problem: Job Fails Immediately
Symptoms:
- Job status changes from pending → running → failed within seconds
- No meaningful progress
- Error in log: "Command not found" or "Permission denied"
Solutions:
- Check job log in database:
SELECT log FROM jobs WHERE id = 'JOB_ID';
- Verify FFmpeg installed:
docker compose exec media-api which ffmpeg
# Should output: /usr/bin/ffmpeg
docker compose exec media-api ffmpeg -version
- Check file paths valid:
# Verify input file exists
docker compose exec media-api ls -la /media/local/library/inbox/original.mp4
# Check output directory writable
docker compose exec media-api touch /media/local/playback/test.txt
- Test FFmpeg command manually:
# Copy command from job log, run manually
docker compose exec media-api ffmpeg -i /media/local/inbox/test.mp4 -c:v libx264 /media/local/playback/test-output.mp4
Problem: Re-encode Job Hangs at Same Progress
Symptoms:
- Job progress reaches 25%, 50%, or 75% then stops updating
- Status remains "running" for hours
- No CPU/GPU activity visible
Solutions:
- Check FFmpeg process still running:
docker compose exec media-api ps aux | grep ffmpeg
# Should show ffmpeg process
# If not running, worker crashed
docker compose logs media-api --tail 100
- Kill hung FFmpeg process:
docker compose exec media-api pkill -9 ffmpeg
# Job will fail and can be retried
- Check disk space:
df -h /media/local/playback
# If 100% full, encoding fails
# Free space
docker compose exec media-api rm /media/local/playback/*.partial
- Increase FFmpeg timeout (if very large file):
// api/src/modules/media/services/job-worker.service.ts
const FFMPEG_TIMEOUT = 3600000; // 1 hour (from 30 minutes)
Problem: GPU Out of Memory Errors
Symptoms:
- Multiple GPU jobs running simultaneously
- Error in log: "CUDA out of memory" or "Cannot allocate memory"
- System becomes unresponsive
Solutions:
- Check total VRAM available:
nvidia-smi
# Shows GPU memory usage
# Should show < 16GB used (adjust based on your GPU)
- Reduce concurrent GPU job limit:
// api/src/modules/media/services/job-worker.service.ts
const limits = {
cpu: 5,
gpu_encode: 1, // Reduced from 2
gpu_ai: 1,
};
- Increase VRAM requirements for jobs:
// Jobs require more VRAM than specified
// Update job creation to use higher vramRequired values
{
type: 'reencode_streaming',
vramRequired: 6000, // Increased from 4000
}
- Kill running GPU jobs:
# Stop all media jobs
docker compose exec media-api pkill -9 ffmpeg
# Update stuck jobs to failed status
docker compose exec v2-postgres psql -U changemaker -d v2_changemaker \
-c "UPDATE jobs SET status='failed' WHERE status='running';"
Performance Considerations
Job Queue Throughput
Scaling Factors:
- CPU jobs: 5 concurrent = ~10-20 jobs/minute (scans, validations)
- GPU encode: 2 concurrent = ~4-8 videos/hour (depends on length)
- GPU AI: 1 concurrent = ~2-6 videos/hour (depends on complexity)
Bottlenecks:
- GPU Memory — Limits concurrent GPU jobs
- Disk I/O — Reading/writing large video files
- CPU — FFmpeg encoding uses all available cores
Optimization:
- Distribute workers across multiple machines — Each machine runs separate worker process
- Use job priority — Urgent jobs (priority 1-3) run first
- Batch similar jobs — Group scan jobs, re-encode jobs, etc. for efficiency
Database Performance
Job Queue Index:
CREATE INDEX idx_jobs_status_priority ON jobs(status, priority, created_at);
Query Performance:
- Find next pending job: ~1-5ms (with index)
- Update job status: ~2-10ms
- Fetch job logs: ~5-20ms
Optimization:
- Partition jobs table by date — Move old completed/failed jobs to archive table
- Limit log size — Truncate logs > 10KB to prevent bloat
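The log-size limit could be enforced with a helper like this (keeping the newest entries is an assumption about the retention policy):

```typescript
// Sketch: cap a job log at maxBytes, keeping the most recent lines.
function truncateLog(log: string, maxBytes = 10 * 1024): string {
  const size = Buffer.byteLength(log, 'utf8');
  if (size <= maxBytes) return log;
  const tail = log.slice(-maxBytes);
  // Drop the possibly partial first line, then mark the truncation.
  const firstNewline = tail.indexOf('\n');
  return '[log truncated]\n' + (firstNewline >= 0 ? tail.slice(firstNewline + 1) : tail);
}
```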
Monitoring & Observability
Prometheus Metrics
// api/src/utils/metrics.ts
import { Counter, Gauge } from 'prom-client';
export const mediaJobsTotal = new Counter({
name: 'media_jobs_total',
help: 'Total media jobs created',
labelNames: ['type', 'status'],
});
export const mediaJobsPending = new Gauge({
name: 'media_jobs_pending',
help: 'Number of pending media jobs',
});
export const mediaJobsRunning = new Gauge({
name: 'media_jobs_running',
help: 'Number of running media jobs',
labelNames: ['resourceCategory'],
});
export const mediaVramUsed = new Gauge({
name: 'media_vram_used_mb',
help: 'Total VRAM used by running jobs (MB)',
});
// Update metrics in worker
mediaJobsPending.set(pendingCount);
mediaJobsRunning.set({ resourceCategory: 'gpu_encode' }, gpuEncodeCount);
mediaVramUsed.set(totalVramUsed);
Grafana Dashboard Panel
Job Queue Status:
# Pending jobs count
media_jobs_pending
# Running jobs by category
sum(media_jobs_running) by (resourceCategory)
# VRAM usage percentage
(media_vram_used_mb / 16000) * 100
Alert Rules:
# configs/prometheus/alerts.yml
groups:
- name: media_jobs
rules:
- alert: MediaJobQueueBacklog
expr: media_jobs_pending > 50
for: 30m
labels:
severity: warning
annotations:
summary: "Media job queue backlog"
description: "{{ $value }} jobs pending for 30+ minutes"
- alert: MediaJobsStuckRunning
expr: sum(media_jobs_running) == 0 and media_jobs_pending > 0
for: 10m
labels:
severity: critical
annotations:
summary: "Media jobs stuck"
description: "Jobs pending but worker not processing"
Related Documentation
Backend Documentation
- Job Worker: `backend/modules/media/job-worker.md` — Worker process implementation
- Job Processors: `backend/modules/media/processors/` — Individual job type processors (reencode, scan, etc.)
- Jobs Routes: `backend/modules/media/jobs.md` — API endpoints for job management
Frontend Documentation
- Jobs Page: `frontend/pages/media/jobs.md` — Job queue monitoring UI
- Job Detail Modal: `frontend/components/media/job-detail.md` — Log viewer component
Feature Documentation
- Video Library: `features/media/video-library.md` — Triggering jobs from library actions
- Upload System: `features/media/upload.md` — Post-upload job creation
Next Steps
After mastering the job queue:
- Create Custom Jobs — Implement new job types for domain-specific processing
- Optimize Scheduling — Tune resource limits and priority settings for your workload
- Monitor Performance — Set up Grafana dashboards and alerts for job queue health
- Distributed Workers — Scale horizontally by running workers on multiple machines
Hands-On Practice:
# 1. Create re-encode job
curl -X POST http://localhost:4100/api/media/jobs \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"type": "reencode_streaming",
"params": { "videoId": "VIDEO_ID", "targetBitrate": 2000 },
"priority": 5
}'
# 2. Monitor job progress
watch -n 2 'curl -s http://localhost:4100/api/media/jobs/JOB_ID | jq ".progress"'
# 3. View job logs
curl http://localhost:4100/api/media/jobs/JOB_ID | jq -r ".log"
# 4. Check queue stats
curl http://localhost:4100/api/media/jobs/stats | jq
Last Updated: 2026-02-13 | Version: V2.0 | Maintainer: Changemaker Lite Team