# Media Job Queue System

## Overview

The Media Job Queue System provides asynchronous background processing for CPU- and GPU-intensive video operations. Built on a custom job queue with resource-aware scheduling, it handles everything from directory scanning to AI-powered video analysis while maintaining system stability through resource category management.

**Key Features:**

- **Resource Categories** — Jobs classified by resource needs (CPU, GPU encode, GPU AI)
- **Priority Scheduling** — High-priority jobs processed first within the same category
- **Job Types** — 15+ job types (compilation, encoding, digest generation, scene extraction, etc.)
- **Progress Tracking** — Real-time progress updates (0-100%)
- **Status Management** — Pending → Queued → Running → Completed/Failed lifecycle
- **Retry Logic** — Failed jobs can be retried with exponential backoff
- **Detailed Logging** — Execution logs for debugging and audit trail
- **Queue Management** — Pause, resume, cancel, and prioritize jobs
- **VRAM Awareness** — Prevents GPU memory exhaustion by tracking VRAM requirements

**Access Control:**

- Job viewing/management requires the `SUPER_ADMIN` role
- Job creation can be triggered by admins or automated workflows

**Technology Stack:**

- **Database Queue** — PostgreSQL-backed job queue (no BullMQ for media)
- **Worker Process** — Node.js worker polling the queue every 5 seconds
- **FFmpeg** — Video encoding and compilation
- **AI Integration** — Future support for scene detection and auto-tagging

---

## Architecture

```mermaid
flowchart TB
    subgraph "Job Creation"
        A1[Admin Action]
        A2[Automated Trigger]
        A3[Scheduled Task]
    end

    subgraph "Job Queue (PostgreSQL)"
        Q1[Pending Jobs]
        Q2[Queued Jobs]
        Q3[Running Jobs]
        Q4[Completed/Failed Jobs]
    end

    subgraph "Worker Process"
        W1[Job Poller<br/>Every 5s]
        W2[Resource Checker]
        W3[Job Executor]
        W4[Progress Updater]
    end

    subgraph "Processors"
        P1[CPU Jobs<br/>scan, validate]
        P2[GPU Encode<br/>reencode, compile]
        P3[GPU AI<br/>digest, tag, scene]
    end

    subgraph "Results"
        R1[Video Records Updated]
        R2[New Files Created]
        R3[Logs Written]
    end

    A1 --> Q1
    A2 --> Q1
    A3 --> Q1

    Q1 --> W1
    W1 --> W2
    W2 -->|Check Resources| Q2
    Q2 --> W3

    W3 --> P1
    W3 --> P2
    W3 --> P3

    W3 --> W4
    W4 --> Q3

    P1 --> R1
    P2 --> R2
    P3 --> R3

    Q3 --> Q4

    style Q1 fill:#f9f
    style Q3 fill:#ff9
    style Q4 fill:#9f9
```

**Workflow:**

1. **Job Creation** — Admin clicks the "Re-encode" button, API creates a job record
2. **Queue Polling** — Worker checks for pending jobs every 5 seconds
3. **Resource Check** — Worker verifies sufficient VRAM/CPU is available
4. **Job Execution** — Worker runs the appropriate processor (FFmpeg, AI script, etc.)
5. **Progress Updates** — Worker updates job progress roughly every 5% of completion
6. **Completion** — Worker marks the job complete and logs results
7. **Retry on Failure** — Failed jobs can be retried with exponential backoff

---

## Database Model

### Jobs Table Schema

```typescript
// api/src/modules/media/db/schema.ts
export const jobs = pgTable('jobs', {
  id: uuid('id').primaryKey().defaultRandom(),

  // Job Definition
  type: text('type').notNull(), // JobType enum: compilation, scan, reencode, etc.
  status: text('status').notNull().default('pending'), // JobStatus enum
  params: jsonb('params').$type<Record<string, any>>().notNull(), // Job-specific parameters

  // Progress Tracking
  progress: integer('progress').default(0), // 0-100
  log: text('log').default(''), // Execution log (append-only)

  // Scheduling
  priority: integer('priority').default(5), // 1 (highest) - 10 (lowest)
  queuePosition: integer('queue_position'), // Position in queue
  waitingReason: text('waiting_reason'), // Why job is waiting (e.g., "Insufficient VRAM")

  // Resource Management
  resourceCategory: text('resource_category').notNull(), // cpu|gpu_encode|gpu_ai
  vramRequired: integer('vram_required').default(0), // MB of VRAM needed

  // Timing
  createdAt: timestamp('created_at').defaultNow(),
  startedAt: timestamp('started_at'),
  completedAt: timestamp('completed_at'),

  // Retry Logic
  retryCount: integer('retry_count').default(0),
  maxRetries: integer('max_retries').default(3),
  retryAfter: timestamp('retry_after'), // Don't retry before this time
});
```

### Job Types Enum

| Type | Resource Category | VRAM (MB) | Description |
|------|------------------|-----------|-------------|
| `scan` | cpu | 0 | Scan directory for new videos |
| `public_scan` | cpu | 0 | Scan public gallery directory |
| `validate` | cpu | 0 | Validate video metadata (FFprobe) |
| `reencode_streaming` | gpu_encode | 4000 | Re-encode for web playback (H.264) |
| `compile_random` | gpu_encode | 2000 | Random video compilation |
| `compile_quad` | gpu_encode | 4000 | 4-up grid compilation |
| `compile_mega` | gpu_encode | 6000 | Large multi-video compilation |
| `compile_gif` | cpu | 0 | Create GIF from video |
| `digest_generate` | gpu_ai | 8000 | AI-powered video digest |
| `clip_generate` | gpu_ai | 6000 | Extract clips from digest |
| `highlight_generate` | gpu_ai | 8000 | Create highlight reel |
| `tag_generation` | gpu_ai | 6000 | AI auto-tagging |
| `scene_extract` | gpu_ai | 8000 | Scene detection and extraction |
| `thumbnail_generate` | cpu | 0 | Generate thumbnail from video |
| `move_to_library` | cpu | 0 | Move video from inbox to target directory |

### Job Status Enum

| Status | Description | Final State |
|--------|-------------|-------------|
| `pending` | Waiting to be picked up by worker | No |
| `queued` | Selected by worker, waiting for resources | No |
| `running` | Currently executing | No |
| `completed` | Finished successfully | Yes |
| `failed` | Execution failed (see log for details) | Yes |
| `cancelled` | Manually cancelled by admin | Yes |
| `paused` | Temporarily paused (can be resumed) | No |

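The lifecycle implied by the table can be expressed as a small transition map. This is an illustrative sketch, not code from the actual codebase; the allowed transitions are inferred from the status descriptions above and from the retry/pause behavior documented later.

```typescript
// Sketch: valid job status transitions inferred from the status table.
type JobStatus =
  | 'pending' | 'queued' | 'running'
  | 'completed' | 'failed' | 'cancelled' | 'paused';

const transitions: Record<JobStatus, JobStatus[]> = {
  pending: ['queued', 'running', 'paused', 'cancelled'],
  queued: ['running', 'paused', 'cancelled'],
  running: ['completed', 'failed', 'cancelled'],
  completed: [],           // final state
  failed: ['pending'],     // a retry resets the job to pending
  cancelled: [],           // final state
  paused: ['pending', 'cancelled'],
};

function canTransition(from: JobStatus, to: JobStatus): boolean {
  return transitions[from].includes(to);
}
```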
### Resource Categories

| Category | Typical VRAM | Concurrent Limit | Use Cases |
|----------|-------------|------------------|-----------|
| `cpu` | 0 MB | 5 | Scanning, validation, simple encodes, GIF creation |
| `gpu_encode` | 2-6 GB | 2 | Video re-encoding, compilation, format conversion |
| `gpu_ai` | 6-12 GB | 1 | AI tagging, scene detection, digest generation, highlight extraction |

**VRAM Management:**

The worker tracks total VRAM usage across running jobs:

```typescript
const runningJobs = await db.select().from(jobs).where(eq(jobs.status, 'running'));
const totalVramUsed = runningJobs.reduce((sum, job) => sum + (job.vramRequired || 0), 0);

// Only start new job if VRAM available
const TOTAL_VRAM = 16000; // 16GB GPU
if (totalVramUsed + newJob.vramRequired <= TOTAL_VRAM) {
  startJob(newJob);
}
```

---

## API Endpoints

All endpoints require the `SUPER_ADMIN` role.

### List Jobs

```http
GET /api/media/jobs
```

**Query Parameters:**

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `page` | number | 1 | Page number |
| `limit` | number | 20 | Results per page |
| `status` | string | - | Filter by status (pending, running, completed, failed) |
| `type` | string | - | Filter by job type |
| `resourceCategory` | string | - | Filter by resource category |

**Response:**

```json
{
  "data": [
    {
      "id": "550e8400-e29b-41d4-a716-446655440000",
      "type": "reencode_streaming",
      "status": "running",
      "progress": 45,
      "resourceCategory": "gpu_encode",
      "vramRequired": 4000,
      "priority": 5,
      "params": {
        "videoId": "660e8400-e29b-41d4-a716-446655440001",
        "targetBitrate": 2000
      },
      "startedAt": "2026-02-13T10:30:00Z",
      "createdAt": "2026-02-13T10:25:00Z"
    }
  ],
  "pagination": {
    "page": 1,
    "limit": 20,
    "total": 156,
    "totalPages": 8
  }
}
```

---

### Get Job Details

```http
GET /api/media/jobs/:id
```

**Response:**

```json
{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "type": "reencode_streaming",
  "status": "completed",
  "progress": 100,
  "log": "Starting re-encode...\nFFmpeg command: ffmpeg -i input.mp4 -c:v h264 -preset medium -crf 23 output.mp4\nProgress: 25%\nProgress: 50%\nProgress: 75%\nProgress: 100%\nCompleted successfully",
  "params": {
    "videoId": "660e8400-e29b-41d4-a716-446655440001",
    "inputPath": "inbox/original.mp4",
    "outputPath": "playback/encoded.mp4",
    "targetBitrate": 2000
  },
  "resourceCategory": "gpu_encode",
  "vramRequired": 4000,
  "priority": 5,
  "retryCount": 0,
  "maxRetries": 3,
  "createdAt": "2026-02-13T10:25:00Z",
  "startedAt": "2026-02-13T10:30:00Z",
  "completedAt": "2026-02-13T10:45:00Z"
}
```

---

### Create Job

```http
POST /api/media/jobs
```

**Request Body:**

```json
{
  "type": "reencode_streaming",
  "params": {
    "videoId": "660e8400-e29b-41d4-a716-446655440001",
    "targetBitrate": 2000
  },
  "priority": 5,
  "resourceCategory": "gpu_encode",
  "vramRequired": 4000
}
```

**Response:**

```json
{
  "id": "770e8400-e29b-41d4-a716-446655440002",
  "type": "reencode_streaming",
  "status": "pending",
  "progress": 0,
  "createdAt": "2026-02-13T11:00:00Z"
}
```

---

### Retry Failed Job

```http
POST /api/media/jobs/:id/retry
```

**Response:**

```json
{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "pending",
  "retryCount": 1,
  "retryAfter": null,
  "log": "Starting re-encode...\n[Previous logs...]\n--- RETRY ATTEMPT 1 ---\n"
}
```

**Retry Logic:**

- Failed jobs can be retried up to `maxRetries` times (default: 3)
- Exponential backoff: wait `2^retryCount` minutes before the retry
- A retry resets the status to `pending` and appends a retry marker to the log
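The backoff schedule above can be computed directly from the retry attempt number. A minimal sketch, with helper names of our own choosing (not from the actual codebase), matching the documented 2/4/8-minute schedule:

```typescript
// Sketch: exponential backoff delay for retry attempt n (1-based).
// Attempt 1 -> 2 min, attempt 2 -> 4 min, attempt 3 -> 8 min.
function retryDelayMs(attempt: number): number {
  return Math.pow(2, attempt) * 60 * 1000;
}

// Timestamp before which the worker must not pick the job up again
// (what the `retryAfter` column stores).
function retryAfter(now: Date, attempt: number): Date {
  return new Date(now.getTime() + retryDelayMs(attempt));
}
```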

---

### Cancel Job

```http
POST /api/media/jobs/:id/cancel
```

**Response:**

```json
{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "cancelled",
  "log": "Starting re-encode...\nProgress: 25%\n--- JOB CANCELLED BY ADMIN ---"
}
```

**Notes:**

- Running jobs cannot be cancelled immediately (the worker must finish its current chunk)
- Pending/queued jobs are cancelled instantly

---

### Pause/Resume Job

```http
POST /api/media/jobs/:id/pause
POST /api/media/jobs/:id/resume
```

**Pause Response:**

```json
{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "paused"
}
```

**Resume Response:**

```json
{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "pending"
}
```

---

### Queue Statistics

```http
GET /api/media/jobs/stats
```

**Response:**

```json
{
  "pending": 12,
  "queued": 2,
  "running": 3,
  "completed": 1458,
  "failed": 23,
  "paused": 1,
  "totalVramUsed": 12000,
  "totalVramAvailable": 16000,
  "averageProcessingTime": 245,
  "jobsByType": {
    "reencode_streaming": 45,
    "scan": 8,
    "compile_random": 12
  }
}
```

---

## Admin Workflow

### Viewing Job Queue

1. Navigate to **Media → Jobs** in the admin sidebar
2. The table displays all jobs with:
   - Job type icon
   - Status badge (color-coded)
   - Progress bar
   - Priority indicator
   - Resource category
   - Created/started/completed times
3. Use the filters at the top:
   - **Status** dropdown (All / Pending / Running / Completed / Failed)
   - **Type** dropdown (job type)
   - **Resource** dropdown (CPU / GPU Encode / GPU AI)

### Creating Jobs Manually

**Option 1: From Library Page**

1. Select a video in the library table
2. Click the **"Actions"** dropdown
3. Select an action:
   - "Re-encode for Streaming"
   - "Generate Thumbnail"
   - "Validate Metadata"
   - "Move to Directory"
4. Confirm job creation
5. You are redirected to the Jobs page showing the new job

**Option 2: From Jobs Page**

1. Click the **"Create Job"** button
2. A modal opens with a form:
   - **Type** dropdown (15+ job types)
   - **Video** selector (search by title/filename)
   - **Priority** slider (1-10)
   - **Parameters** JSON editor (advanced)
3. Click **"Create"**
4. The job appears in the pending queue

### Monitoring Job Progress

**Real-Time Updates:**

1. The Jobs page polls the API every 2 seconds for running jobs
2. Progress bars update smoothly (0-100%)
3. Status badges change color:
   - Grey: Pending
   - Blue: Queued
   - Yellow: Running
   - Green: Completed
   - Red: Failed

**Detailed Logs:**

1. Click a job row to expand the details panel
2. View the execution log in a monospace text area
3. The log updates in real time while the job is running
4. Example log output:

```
[2026-02-13 10:30:15] Starting re-encode job
[2026-02-13 10:30:16] Input: /media/local/inbox/original.mp4
[2026-02-13 10:30:16] Output: /media/local/playback/encoded.mp4
[2026-02-13 10:30:17] FFmpeg command: ffmpeg -i /media/local/inbox/original.mp4 -c:v libx264 -preset medium -crf 23 -c:a aac -b:a 128k /media/local/playback/encoded.mp4
[2026-02-13 10:30:20] Progress: 5%
[2026-02-13 10:30:25] Progress: 15%
[2026-02-13 10:30:30] Progress: 25%
...
[2026-02-13 10:45:00] Progress: 100%
[2026-02-13 10:45:01] Re-encode completed successfully
[2026-02-13 10:45:02] Output file size: 25.3 MB
```

### Retrying Failed Jobs

1. Filter for **Failed** jobs
2. Click the job row to view the error log
3. Identify the failure reason (e.g., "FFmpeg error: codec not supported")
4. Fix the underlying issue (install codec, fix file path, etc.)
5. Click the **"Retry"** button
6. The job resets to pending status
7. The worker picks up the job again

**Auto-Retry:**

Jobs automatically retry up to 3 times with exponential backoff:

- 1st retry: after 2 minutes
- 2nd retry: after 4 minutes
- 3rd retry: after 8 minutes

### Cancelling Jobs

1. Find a job in the pending/queued/running state
2. Click the **"Cancel"** button
3. Confirm the cancellation dialog
4. The job is marked as cancelled
5. If running, the worker stops after the current chunk completes

### Pausing/Resuming Jobs

**Use Case:** Temporarily stop low-priority jobs to free resources for urgent tasks

1. Select a low-priority pending job
2. Click the **"Pause"** button
3. The job status changes to paused (greyed out)
4. The worker skips paused jobs
5. When ready, click **"Resume"**
6. The job returns to the pending queue

---

## Job Type Details

### Scan Jobs (`scan`, `public_scan`)

**Purpose:** Scan a filesystem directory for new videos and create database records

**Parameters:**

```json
{
  "directoryType": "videos",
  "skipExisting": true
}
```

**Process:**

1. Read the directory `/media/local/library/{directoryType}/`
2. Filter for video extensions (`.mp4`, `.mov`, etc.)
3. Check each file against the database (by path)
4. Create records for new files
5. Run FFprobe on new files
6. Update progress: files processed / total files

**Typical Duration:** 2-30 seconds (depends on file count)
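Step 6 of the process maps file counts onto the job's 0-100 progress scale. A minimal sketch of that calculation, assuming the empty-directory case counts as done (the helper name is ours):

```typescript
// Sketch: scan progress as an integer percentage (0-100).
function scanProgress(processed: number, total: number): number {
  if (total === 0) return 100; // empty directory: nothing left to do
  return Math.min(100, Math.floor((processed / total) * 100));
}
```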

---

### Validation Jobs (`validate`)

**Purpose:** Re-run FFprobe to refresh video metadata

**Parameters:**

```json
{
  "videoId": "660e8400-e29b-41d4-a716-446655440001"
}
```

**Process:**

1. Fetch the video record from the database
2. Build the full file path
3. Run FFprobe extraction
4. Update the database with fresh metadata
5. Mark the video as valid/invalid based on the result

**Typical Duration:** 100-500 ms per video

---

### Re-encode Jobs (`reencode_streaming`)

**Purpose:** Convert video to a web-optimized format (H.264, web-friendly profile)

**Parameters:**

```json
{
  "videoId": "660e8400-e29b-41d4-a716-446655440001",
  "targetBitrate": 2000,
  "preset": "medium",
  "crf": 23
}
```

**FFmpeg Command:**

```bash
ffmpeg -i /media/local/inbox/original.mp4 \
  -c:v libx264 \
  -preset medium \
  -crf 23 \
  -maxrate 2000k \
  -bufsize 4000k \
  -c:a aac \
  -b:a 128k \
  -movflags +faststart \
  /media/local/playback/encoded.mp4
```

**Process:**

1. Validate the input file exists
2. Build the FFmpeg command
3. Start the encoding process
4. Parse FFmpeg progress output
5. Update job progress every ~5%
6. Create a new video record for the encoded file
7. Update the original video's `reencodeJobId` reference

**Typical Duration:** 5-30 minutes (depends on video length and resolution)
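Step 4 (parsing FFmpeg progress) is commonly done by scanning the `time=` field FFmpeg prints to stderr and comparing it with the known input duration. A hedged sketch of that approach; the helper names are ours, not from the actual worker:

```typescript
// Sketch: derive a 0-100 progress value from an FFmpeg stderr chunk.
// FFmpeg periodically prints lines like: "frame=... time=00:12:34.56 bitrate=..."
function parseFfmpegTime(chunk: string): number | null {
  const m = chunk.match(/time=(\d+):(\d+):(\d+(?:\.\d+)?)/);
  if (!m) return null;
  const [, h, min, s] = m;
  return Number(h) * 3600 + Number(min) * 60 + Number(s); // seconds encoded so far
}

function encodeProgress(chunk: string, durationSeconds: number): number | null {
  const t = parseFfmpegTime(chunk);
  if (t === null || durationSeconds <= 0) return null;
  return Math.min(100, Math.round((t / durationSeconds) * 100));
}
```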

---

### Compilation Jobs (`compile_random`, `compile_quad`, `compile_mega`)

**Purpose:** Merge multiple videos into a single compilation

**Parameters (Random):**

```json
{
  "count": 10,
  "minDuration": 30,
  "maxDuration": 120,
  "orientation": "landscape",
  "outputPath": "compilations/random-001.mp4"
}
```

**Process:**

1. Query the database for videos matching the criteria (orientation, duration range)
2. Randomly select `count` videos
3. Build an FFmpeg concat demuxer file list
4. Run the FFmpeg compilation
5. Create a new video record for the compilation
6. Update progress based on FFmpeg output

**Quad Compilation (4-up grid):**

```bash
ffmpeg -i video1.mp4 -i video2.mp4 -i video3.mp4 -i video4.mp4 \
  -filter_complex "[0:v][1:v][2:v][3:v]xstack=inputs=4:layout=0_0|w0_0|0_h0|w0_h0[v]" \
  -map "[v]" \
  output.mp4
```

**Typical Duration:** 10-60 minutes
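Step 3 of the process builds a concat demuxer list file: one `file '<path>'` line per input, with single quotes inside paths escaped per the demuxer's quoting rules. A sketch (the helper is ours):

```typescript
// Sketch: build the contents of an FFmpeg concat demuxer list file.
// Each line is: file '<path>'. A single quote inside a path is written
// as '\'' (close quote, escaped quote, reopen quote).
function buildConcatList(paths: string[]): string {
  return paths
    .map((p) => `file '${p.replace(/'/g, `'\\''`)}'`)
    .join('\n');
}

// Used as: ffmpeg -f concat -safe 0 -i list.txt -c copy output.mp4
```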

---

### Digest Generation (`digest_generate`)

**Purpose:** AI-powered video digest creation (future feature)

**Parameters:**

```json
{
  "videoId": "660e8400-e29b-41d4-a716-446655440001",
  "targetLength": 60,
  "includeHighlights": true
}
```

**Process (Planned):**

1. Extract frames at 1 FPS
2. Run AI scene detection
3. Identify highlights (action, faces, motion)
4. Select the best segments totaling the target length
5. Compile the segments into a digest video

**GPU AI Required:** 8 GB VRAM

---

### Thumbnail Generation (`thumbnail_generate`)

**Purpose:** Extract a thumbnail image from a video

**Parameters:**

```json
{
  "videoId": "660e8400-e29b-41d4-a716-446655440001",
  "timestamp": 5,
  "width": 640
}
```

**FFmpeg Command:**

```bash
ffmpeg -i /media/local/library/videos/sample.mp4 \
  -ss 00:00:05 \
  -vframes 1 \
  -vf scale=640:-1 \
  /media/local/thumbnails/sample.jpg
```

**Process:**

1. Seek to the timestamp (default: 25% into the video)
2. Extract a single frame
3. Scale to the target width (preserving aspect ratio)
4. Save as JPEG
5. Update the video record with `thumbnailPath`

**Typical Duration:** 1-5 seconds
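The default seek point (25% into the video) and the `HH:MM:SS` value passed to `-ss` can be derived from the video's duration. A minimal sketch with hypothetical helper names:

```typescript
// Sketch: pick the thumbnail seek point and format it for FFmpeg's -ss flag.
function thumbnailSeekSeconds(durationSeconds: number, timestamp?: number): number {
  // An explicit timestamp parameter wins; otherwise default to 25% in.
  return timestamp ?? Math.floor(durationSeconds * 0.25);
}

function toFfmpegTimestamp(seconds: number): string {
  const h = Math.floor(seconds / 3600);
  const m = Math.floor((seconds % 3600) / 60);
  const s = Math.floor(seconds % 60);
  const pad = (n: number) => String(n).padStart(2, '0');
  return `${pad(h)}:${pad(m)}:${pad(s)}`; // e.g. "00:00:05"
}
```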

---

## Code Examples

### Create Re-encode Job

```typescript
// api/src/modules/media/routes/jobs.routes.ts
import { eq } from 'drizzle-orm';
import { db } from '@/modules/media/db';
import { jobs, videos } from '@/modules/media/db/schema';

app.post('/api/media/jobs/reencode', async (req, reply) => {
  const { videoId, targetBitrate = 2000, preset = 'medium', crf = 23 } = req.body;

  // Fetch video
  const [video] = await db
    .select()
    .from(videos)
    .where(eq(videos.id, videoId))
    .limit(1);

  if (!video) {
    return reply.code(404).send({ error: 'Video not found' });
  }

  // Create job
  const [job] = await db
    .insert(jobs)
    .values({
      type: 'reencode_streaming',
      status: 'pending',
      params: {
        videoId,
        inputPath: video.path,
        outputPath: `playback/${video.filename}`,
        targetBitrate,
        preset,
        crf,
      },
      resourceCategory: 'gpu_encode',
      vramRequired: 4000,
      priority: 5,
    })
    .returning();

  reply.send(job);
});
```

---

### Job Worker (Polling Loop)

```typescript
// api/src/modules/media/services/job-worker.service.ts
import path from 'node:path';
import { exec } from 'node:child_process';
import { promisify } from 'node:util';
import { eq, sql } from 'drizzle-orm';
import { db } from '@/modules/media/db';
import { jobs } from '@/modules/media/db/schema';

const execAsync = promisify(exec);

export class JobWorkerService {
  private polling = false;

  async start() {
    this.polling = true;
    console.log('Job worker started');

    while (this.polling) {
      try {
        await this.processNextJob();
      } catch (error) {
        console.error('Job worker error:', error);
      }

      // Wait 5 seconds before next poll
      await new Promise((resolve) => setTimeout(resolve, 5000));
    }
  }

  async stop() {
    this.polling = false;
    console.log('Job worker stopped');
  }

  private async processNextJob() {
    // Find next pending job (highest priority first)
    const [job] = await db
      .select()
      .from(jobs)
      .where(eq(jobs.status, 'pending'))
      .orderBy(jobs.priority, jobs.createdAt)
      .limit(1);

    if (!job) {
      return; // No jobs in queue
    }

    // Check resource availability
    const canRun = await this.checkResources(job);
    if (!canRun) {
      // Update waiting reason
      await db
        .update(jobs)
        .set({ waitingReason: 'Insufficient resources' })
        .where(eq(jobs.id, job.id));
      return;
    }

    // Start job
    await this.executeJob(job);
  }

  private async checkResources(job: any): Promise<boolean> {
    // Get running jobs
    const runningJobs = await db
      .select()
      .from(jobs)
      .where(eq(jobs.status, 'running'));

    // Calculate total VRAM used
    const totalVramUsed = runningJobs.reduce(
      (sum, j) => sum + (j.vramRequired || 0),
      0
    );

    const TOTAL_VRAM = 16000; // 16GB GPU
    const available = TOTAL_VRAM - totalVramUsed;

    if (job.vramRequired && job.vramRequired > available) {
      return false; // Not enough VRAM
    }

    // Check concurrent job limits by category
    const categoryCount = runningJobs.filter(
      (j) => j.resourceCategory === job.resourceCategory
    ).length;

    const limits = {
      cpu: 5,
      gpu_encode: 2,
      gpu_ai: 1,
    };

    if (categoryCount >= limits[job.resourceCategory as keyof typeof limits]) {
      return false; // Category limit reached
    }

    return true; // Resources available
  }

  private async executeJob(job: any) {
    // Mark as running
    await db
      .update(jobs)
      .set({
        status: 'running',
        startedAt: new Date(),
        waitingReason: null,
      })
      .where(eq(jobs.id, job.id));

    try {
      // Execute job based on type
      switch (job.type) {
        case 'reencode_streaming':
          await this.executeReencode(job);
          break;
        case 'scan':
          await this.executeScan(job);
          break;
        case 'thumbnail_generate':
          await this.executeThumbnail(job);
          break;
        // ... other job types
      }

      // Mark as completed
      await db
        .update(jobs)
        .set({
          status: 'completed',
          progress: 100,
          completedAt: new Date(),
        })
        .where(eq(jobs.id, job.id));
    } catch (error: any) {
      // Mark as failed
      await db
        .update(jobs)
        .set({
          status: 'failed',
          log: (job.log || '') + `\n\n--- ERROR ---\n${error.message}`,
        })
        .where(eq(jobs.id, job.id));

      // Schedule retry if under max retries
      if (job.retryCount < job.maxRetries) {
        // Exponential backoff: 2, 4, 8 minutes for attempts 1-3
        const retryDelay = Math.pow(2, job.retryCount + 1) * 60 * 1000;
        await db
          .update(jobs)
          .set({
            status: 'pending',
            retryCount: job.retryCount + 1,
            retryAfter: new Date(Date.now() + retryDelay),
          })
          .where(eq(jobs.id, job.id));
      }
    }
  }

  private async executeReencode(job: any) {
    const { inputPath, outputPath, targetBitrate, preset, crf } = job.params;

    const inputFull = path.join(process.env.MEDIA_LIBRARY_PATH!, inputPath);
    const outputFull = path.join(process.env.MEDIA_LIBRARY_PATH!, outputPath);

    const command = `ffmpeg -i "${inputFull}" -c:v libx264 -preset ${preset} -crf ${crf} -maxrate ${targetBitrate}k -bufsize ${targetBitrate * 2}k -c:a aac -b:a 128k -movflags +faststart "${outputFull}"`;

    await this.appendLog(job.id, `Starting re-encode\nCommand: ${command}`);

    // Execute FFmpeg (simplified - real implementation uses spawn for progress parsing)
    await execAsync(command);

    await this.appendLog(job.id, 'Re-encode completed successfully');
  }

  private async appendLog(jobId: string, message: string) {
    const timestamp = new Date().toISOString();
    const logEntry = `[${timestamp}] ${message}`;

    await db
      .update(jobs)
      .set({
        log: sql`${jobs.log} || E'\n' || ${logEntry}`,
      })
      .where(eq(jobs.id, jobId));
  }
}

// Start worker
export const jobWorker = new JobWorkerService();
jobWorker.start();
```

---

### Frontend: Jobs Page

```typescript
// admin/src/pages/media/MediaJobsPage.tsx
import { Table, Tag, Progress, Button, Space, Select, message } from 'antd';
import { useEffect, useState } from 'react';
import { mediaApi } from '@/lib/media-api';

export default function MediaJobsPage() {
  const [jobs, setJobs] = useState([]);
  const [loading, setLoading] = useState(false);
  const [filter, setFilter] = useState({ status: undefined, type: undefined });
  const [polling, setPolling] = useState(true);

  const fetchJobs = async () => {
    setLoading(true);
    try {
      const { data } = await mediaApi.get('/api/media/jobs', {
        params: filter,
      });
      setJobs(data.data);
    } catch (error) {
      console.error('Failed to fetch jobs:', error);
    } finally {
      setLoading(false);
    }
  };

  useEffect(() => {
    fetchJobs();
  }, [filter]);

  // Poll for running jobs every 2 seconds
  useEffect(() => {
    if (!polling) return;

    const interval = setInterval(() => {
      const hasRunning = jobs.some((j: any) => j.status === 'running');
      if (hasRunning) {
        fetchJobs();
      }
    }, 2000);

    return () => clearInterval(interval);
  }, [polling, jobs]);

  const handleRetry = async (id: string) => {
    try {
      await mediaApi.post(`/api/media/jobs/${id}/retry`);
      message.success('Job queued for retry');
      fetchJobs();
    } catch (error) {
      message.error('Retry failed');
    }
  };

  const handleCancel = async (id: string) => {
    try {
      await mediaApi.post(`/api/media/jobs/${id}/cancel`);
      message.success('Job cancelled');
      fetchJobs();
    } catch (error) {
      message.error('Cancel failed');
    }
  };

  const statusColors: Record<string, string> = {
    pending: 'default',
    queued: 'blue',
    running: 'processing',
    completed: 'success',
    failed: 'error',
    cancelled: 'default',
    paused: 'warning',
  };

  const columns = [
    {
      title: 'Type',
      dataIndex: 'type',
      width: 150,
      render: (type: string) => <span style={{ fontFamily: 'monospace' }}>{type}</span>,
    },
    {
      title: 'Status',
      dataIndex: 'status',
      width: 100,
      render: (status: string) => <Tag color={statusColors[status]}>{status.toUpperCase()}</Tag>,
    },
    {
      title: 'Progress',
      dataIndex: 'progress',
      width: 150,
      render: (progress: number, record: any) => (
        record.status === 'running' ? (
          <Progress percent={progress} size="small" status="active" />
        ) : record.status === 'completed' ? (
          <Progress percent={100} size="small" status="success" />
        ) : record.status === 'failed' ? (
          <Progress percent={progress} size="small" status="exception" />
        ) : (
          <Progress percent={progress} size="small" />
        )
      ),
    },
    {
      title: 'Resource',
      dataIndex: 'resourceCategory',
      width: 120,
    },
    {
      title: 'Priority',
      dataIndex: 'priority',
      width: 80,
      render: (priority: number) => (
        <Tag color={priority <= 3 ? 'red' : priority <= 6 ? 'orange' : 'default'}>
          {priority}
        </Tag>
      ),
    },
    {
      title: 'Created',
      dataIndex: 'createdAt',
      width: 150,
      render: (date: string) => new Date(date).toLocaleString(),
    },
    {
      title: 'Actions',
      width: 200,
      render: (_: any, record: any) => (
        <Space>
          {record.status === 'failed' && (
            <Button size="small" onClick={() => handleRetry(record.id)}>
              Retry
            </Button>
          )}
          {['pending', 'queued', 'running'].includes(record.status) && (
            <Button size="small" danger onClick={() => handleCancel(record.id)}>
              Cancel
            </Button>
          )}
          <Button size="small" onClick={() => window.open(`/app/media/jobs/${record.id}`, '_blank')}>
            View Log
          </Button>
        </Space>
      ),
    },
  ];

  return (
    <div>
      <Space style={{ marginBottom: 16 }}>
        <Select
          placeholder="Filter by status"
          style={{ width: 150 }}
          onChange={(value) => setFilter({ ...filter, status: value })}
          allowClear
        >
          <Select.Option value="pending">Pending</Select.Option>
          <Select.Option value="running">Running</Select.Option>
          <Select.Option value="completed">Completed</Select.Option>
          <Select.Option value="failed">Failed</Select.Option>
        </Select>

        <Select
          placeholder="Filter by type"
          style={{ width: 200 }}
          onChange={(value) => setFilter({ ...filter, type: value })}
          allowClear
        >
          <Select.Option value="scan">Scan</Select.Option>
          <Select.Option value="reencode_streaming">Re-encode</Select.Option>
          <Select.Option value="compile_random">Compilation</Select.Option>
        </Select>

        <Button onClick={() => setPolling(!polling)}>
          {polling ? 'Stop Auto-Refresh' : 'Start Auto-Refresh'}
        </Button>
      </Space>

      <Table
        columns={columns}
        dataSource={jobs}
        loading={loading}
        rowKey="id"
        pagination={{ pageSize: 20 }}
      />
    </div>
  );
}
```

---

## Troubleshooting

### Problem: Jobs Stuck in Pending

**Symptoms:**

- Jobs are created but never start
- Status remains "pending" for hours
- No "running" jobs visible

**Solutions:**

1. **Check that the worker process is running:**

   ```bash
   docker compose ps media-api
   # Should show "Up" status

   docker compose logs media-api | grep "Job worker"
   # Should show "Job worker started"
   ```

2. **Manually restart the worker:**

   ```bash
   # Restart the media-api container
   docker compose restart media-api

   # The worker starts automatically on container boot
   ```

3. **Check worker logs for errors:**

   ```bash
   docker compose logs -f media-api | grep ERROR
   # Look for database connection errors, permission issues
   ```

4. **Verify the database connection:**

   ```bash
   # Test that the database is accessible from the container
   docker compose exec media-api psql $DATABASE_URL -c "SELECT COUNT(*) FROM jobs WHERE status='pending';"
   ```
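
As a quick programmatic check, the same "stuck in pending" condition can be detected in code. This is a minimal sketch (the `Job` shape here is illustrative, not the actual schema) that flags pending jobs older than a threshold:

```typescript
// Sketch: flag pending jobs that have waited longer than a threshold.
// The Job interface below is a simplified stand-in for the real jobs table row.
interface Job {
  id: string;
  status: 'pending' | 'queued' | 'running' | 'completed' | 'failed';
  createdAt: Date;
}

function findStalePending(jobs: Job[], maxWaitMs: number, now: Date = new Date()): Job[] {
  return jobs.filter(
    (j) => j.status === 'pending' && now.getTime() - j.createdAt.getTime() > maxWaitMs
  );
}

// Example: anything pending for more than 30 minutes is suspicious.
const stale = findStalePending(
  [
    { id: 'a', status: 'pending', createdAt: new Date(Date.now() - 60 * 60 * 1000) },
    { id: 'b', status: 'pending', createdAt: new Date() },
  ],
  30 * 60 * 1000
);
console.log(stale.map((j) => j.id)); // → [ 'a' ]
```

A check like this could run on a timer and feed an alert when stale pending jobs accumulate.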

---

### Problem: Job Fails Immediately

**Symptoms:**

- Job status changes from pending → running → failed within seconds
- No meaningful progress
- Error in log: "Command not found" or "Permission denied"

**Solutions:**

1. **Check the job log in the database:**

   ```sql
   SELECT log FROM jobs WHERE id = 'JOB_ID';
   ```

2. **Verify FFmpeg is installed:**

   ```bash
   docker compose exec media-api which ffmpeg
   # Should output: /usr/bin/ffmpeg

   docker compose exec media-api ffmpeg -version
   ```

3. **Check that file paths are valid:**

   ```bash
   # Verify the input file exists
   docker compose exec media-api ls -la /media/local/library/inbox/original.mp4

   # Check the output directory is writable
   docker compose exec media-api touch /media/local/playback/test.txt
   ```

4. **Test the FFmpeg command manually:**

   ```bash
   # Copy the command from the job log and run it manually
   docker compose exec media-api ffmpeg -i /media/local/inbox/test.mp4 -c:v libx264 /media/local/playback/test-output.mp4
   ```
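
Once the root cause is fixed, failed jobs can be retried. The system's retry logic uses exponential backoff; a minimal sketch of the delay calculation (the base delay and cap shown are assumptions, not the shipped configuration):

```typescript
// Sketch: exponential backoff delay for job retries.
// baseMs and capMs are illustrative defaults, not the actual config values.
function retryDelayMs(attempt: number, baseMs = 30_000, capMs = 30 * 60_000): number {
  // attempt 1 → 30s, attempt 2 → 60s, attempt 3 → 120s, … capped at 30 minutes
  return Math.min(baseMs * 2 ** (attempt - 1), capMs);
}

console.log([1, 2, 3, 10].map((a) => retryDelayMs(a)));
// → [ 30000, 60000, 120000, 1800000 ]
```

The doubling spaces retries out so a broken input or misconfigured container is not hammered, while the cap keeps the worst-case wait bounded.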

---

### Problem: Re-encode Job Hangs at the Same Progress

**Symptoms:**

- Job progress reaches 25%, 50%, or 75%, then stops updating
- Status remains "running" for hours
- No CPU/GPU activity visible

**Solutions:**

1. **Check whether the FFmpeg process is still running:**

   ```bash
   docker compose exec media-api ps aux | grep ffmpeg
   # Should show an ffmpeg process

   # If it is not running, the worker crashed
   docker compose logs media-api --tail 100
   ```

2. **Kill the hung FFmpeg process:**

   ```bash
   docker compose exec media-api pkill -9 ffmpeg

   # The job will fail and can be retried
   ```

3. **Check disk space:**

   ```bash
   df -h /media/local/playback
   # If 100% full, encoding fails

   # Free space
   docker compose exec media-api rm /media/local/playback/*.partial
   ```

4. **Increase the FFmpeg timeout (for very large files):**

   ```typescript
   // api/src/modules/media/services/job-worker.service.ts
   const FFMPEG_TIMEOUT = 3600000; // 1 hour (up from 30 minutes)
   ```
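
The timeout above matters because a hung encode otherwise blocks the worker indefinitely. One way to enforce it is to race the encode against a timer; this is a generic sketch of that pattern (names are illustrative, not the worker's actual implementation):

```typescript
// Sketch: race a long-running task against a timeout so a hung encode
// fails fast instead of blocking the worker slot forever.
function withTimeout<T>(task: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms}ms`)), ms);
  });
  // Whichever settles first wins; always clear the timer afterwards.
  return Promise.race([task, timeout]).finally(() => clearTimeout(timer));
}

// Example: a task that never resolves is rejected after 50ms.
withTimeout(new Promise<void>(() => {}), 50, 'ffmpeg encode')
  .catch((err) => console.log(err.message)); // → ffmpeg encode timed out after 50ms
```

On timeout the caller would kill the FFmpeg child process and mark the job failed, making it eligible for retry.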

---

### Problem: GPU Out-of-Memory Errors

**Symptoms:**

- Multiple GPU jobs running simultaneously
- Error in log: "CUDA out of memory" or "Cannot allocate memory"
- System becomes unresponsive

**Solutions:**

1. **Check total VRAM available:**

   ```bash
   nvidia-smi
   # Shows GPU memory usage

   # Should show < 16GB used (adjust based on your GPU)
   ```

2. **Reduce the concurrent GPU job limit:**

   ```typescript
   // api/src/modules/media/services/job-worker.service.ts
   const limits = {
     cpu: 5,
     gpu_encode: 1, // Reduced from 2
     gpu_ai: 1,
   };
   ```

3. **Increase VRAM requirements for jobs:**

   ```typescript
   // Jobs may require more VRAM than specified
   // Update job creation to use higher vramRequired values
   {
     type: 'reencode_streaming',
     vramRequired: 6000, // Increased from 4000
   }
   ```

4. **Kill running GPU jobs:**

   ```bash
   # Stop all media jobs
   docker compose exec media-api pkill -9 ffmpeg

   # Update stuck jobs to failed status
   docker compose exec v2-postgres psql -U changemaker -d v2_changemaker \
     -c "UPDATE jobs SET status='failed' WHERE status='running';"
   ```
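
The `vramRequired` values feed the queue's VRAM-aware admission check: a GPU job only starts if its declared requirement fits in the remaining budget. A minimal sketch of that check (the 16 GB total and job shape are illustrative):

```typescript
// Sketch: VRAM admission control. A GPU job is admitted only when its
// declared requirement fits within the remaining budget.
interface RunningJob {
  vramRequired: number; // MB, as declared at job creation
}

function canStartGpuJob(
  running: RunningJob[],
  candidateVramMb: number,
  totalVramMb = 16_000 // illustrative budget; adjust for your GPU
): boolean {
  const used = running.reduce((sum, j) => sum + j.vramRequired, 0);
  return used + candidateVramMb <= totalVramMb;
}

const running = [{ vramRequired: 6000 }, { vramRequired: 4000 }];
console.log(canStartGpuJob(running, 4000)); // → true  (14000 <= 16000)
console.log(canStartGpuJob(running, 8000)); // → false (18000 > 16000)
```

This is why raising `vramRequired` helps: overstating a job's footprint makes the scheduler more conservative, trading throughput for stability.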

---

## Performance Considerations

### Job Queue Throughput

**Scaling Factors:**

- CPU jobs: 5 concurrent = ~10-20 jobs/minute (scans, validations)
- GPU encode: 2 concurrent = ~4-8 videos/hour (depends on length)
- GPU AI: 1 concurrent = ~2-6 videos/hour (depends on complexity)

**Bottlenecks:**

1. **GPU Memory** — Limits concurrent GPU jobs
2. **Disk I/O** — Reading/writing large video files
3. **CPU** — FFmpeg encoding uses all available cores

**Optimization:**

- **Distribute workers across multiple machines** — Each machine runs a separate worker process
- **Use job priority** — Urgent jobs (priority 1-3) run first
- **Batch similar jobs** — Group scan jobs, re-encode jobs, etc. for efficiency
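
Priority and per-category limits interact when the worker picks its next job: it considers only categories with free slots, then takes the lowest priority number, breaking ties FIFO. A sketch of that selection logic (the shapes and default limits are illustrative):

```typescript
// Sketch: select the next runnable job, respecting per-category concurrency
// limits; lower priority number runs first, ties broken by creation order.
type Category = 'cpu' | 'gpu_encode' | 'gpu_ai';

interface PendingJob {
  id: string;
  category: Category;
  priority: number; // 1 = most urgent
  createdAt: number; // epoch ms
}

function nextJob(
  pending: PendingJob[],
  runningByCategory: Record<Category, number>,
  limits: Record<Category, number> = { cpu: 5, gpu_encode: 2, gpu_ai: 1 }
): PendingJob | undefined {
  return pending
    .filter((j) => runningByCategory[j.category] < limits[j.category])
    .sort((a, b) => a.priority - b.priority || a.createdAt - b.createdAt)[0];
}

const picked = nextJob(
  [
    { id: 'encode-1', category: 'gpu_encode', priority: 5, createdAt: 1 },
    { id: 'scan-1', category: 'cpu', priority: 1, createdAt: 2 },
  ],
  { cpu: 0, gpu_encode: 2, gpu_ai: 0 } // gpu_encode already at its limit of 2
);
console.log(picked?.id); // → scan-1
```

Note that a saturated category never blocks the others: even if all GPU-encode slots are busy, CPU jobs keep flowing.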

---

### Database Performance

**Job Queue Index:**

```sql
CREATE INDEX idx_jobs_status_priority ON jobs(status, priority, created_at);
```

**Query Performance:**

- Find next pending job: ~1-5ms (with index)
- Update job status: ~2-10ms
- Fetch job logs: ~5-20ms

**Optimization:**

- **Partition jobs table by date** — Move old completed/failed jobs to an archive table
- **Limit log size** — Truncate logs > 10KB to prevent bloat
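
Log truncation is simple to implement before writing the `log` column. A sketch that keeps the tail of the log, since the most recent lines are usually the useful ones for debugging (the 10 KB budget matches the guideline above; the helper name is hypothetical):

```typescript
// Sketch: cap stored job logs at a byte budget, keeping the tail.
// Note: slicing by raw bytes may split a multi-byte character at the cut point.
function truncateLog(log: string, maxBytes = 10 * 1024): string {
  const bytes = Buffer.byteLength(log, 'utf8');
  if (bytes <= maxBytes) return log;
  const tail = Buffer.from(log, 'utf8').subarray(bytes - maxBytes).toString('utf8');
  return `[log truncated to last ${maxBytes} bytes]\n${tail}`;
}

console.log(truncateLog('short log')); // → short log
console.log(truncateLog('x'.repeat(20_000)).length < 20_000); // → true
```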

---

## Monitoring & Observability

### Prometheus Metrics

```typescript
// api/src/utils/metrics.ts
import { Counter, Gauge } from 'prom-client';

export const mediaJobsTotal = new Counter({
  name: 'media_jobs_total',
  help: 'Total media jobs created',
  labelNames: ['type', 'status'],
});

export const mediaJobsPending = new Gauge({
  name: 'media_jobs_pending',
  help: 'Number of pending media jobs',
});

export const mediaJobsRunning = new Gauge({
  name: 'media_jobs_running',
  help: 'Number of running media jobs',
  labelNames: ['resourceCategory'],
});

export const mediaVramUsed = new Gauge({
  name: 'media_vram_used_mb',
  help: 'Total VRAM used by running jobs (MB)',
});

// Update metrics in the worker
mediaJobsPending.set(pendingCount);
mediaJobsRunning.set({ resourceCategory: 'gpu_encode' }, gpuEncodeCount);
mediaVramUsed.set(totalVramUsed);
```

### Grafana Dashboard Panels

**Job Queue Status:**

```promql
# Pending jobs count
media_jobs_pending

# Running jobs by category
sum(media_jobs_running) by (resourceCategory)

# VRAM usage percentage
(media_vram_used_mb / 16000) * 100
```

**Alert Rules:**

```yaml
# configs/prometheus/alerts.yml
groups:
  - name: media_jobs
    rules:
      - alert: MediaJobQueueBacklog
        expr: media_jobs_pending > 50
        for: 30m
        labels:
          severity: warning
        annotations:
          summary: "Media job queue backlog"
          description: "{{ $value }} jobs pending for 30+ minutes"

      - alert: MediaJobsStuckRunning
        expr: sum(media_jobs_running) == 0 and media_jobs_pending > 0
        for: 10m
        labels:
          severity: critical
        annotations:
          summary: "Media jobs stuck"
          description: "Jobs pending but worker not processing"
```

---

## Related Documentation

### Backend Documentation

- **Job Worker:** `backend/modules/media/job-worker.md` — Worker process implementation
- **Job Processors:** `backend/modules/media/processors/` — Individual job type processors (reencode, scan, etc.)
- **Jobs Routes:** `backend/modules/media/jobs.md` — API endpoints for job management

### Frontend Documentation

- **Jobs Page:** `frontend/pages/media/jobs.md` — Job queue monitoring UI
- **Job Detail Modal:** `frontend/components/media/job-detail.md` — Log viewer component

### Feature Documentation

- **Video Library:** `features/media/video-library.md` — Triggering jobs from library actions
- **Upload System:** `features/media/upload.md` — Post-upload job creation

---

## Next Steps

After mastering the job queue:

1. **Create Custom Jobs** — Implement new job types for domain-specific processing
2. **Optimize Scheduling** — Tune resource limits and priority settings for your workload
3. **Monitor Performance** — Set up Grafana dashboards and alerts for job queue health
4. **Distributed Workers** — Scale horizontally by running workers on multiple machines
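
For step 1, a new job type boils down to registering a processor function keyed by the job's `type`. The registry API below is hypothetical (the real wiring lives in `api/src/modules/media/services/job-worker.service.ts`), but it illustrates the shape:

```typescript
// Sketch: a processor registry for custom job types. The registry API and
// the 'generate_thumbnails' job type are hypothetical examples.
type Processor = (
  params: Record<string, unknown>,
  log: (msg: string) => void
) => Promise<void>;

const processors = new Map<string, Processor>();

function registerProcessor(type: string, fn: Processor): void {
  processors.set(type, fn);
}

// Register a hypothetical custom job type.
registerProcessor('generate_thumbnails', async (params, log) => {
  log(`generating thumbnails for video ${String(params.videoId)}`);
  // ...invoke ffmpeg here, report progress, etc...
});

// The worker would look up the processor by the job's type and invoke it.
const lines: string[] = [];
processors.get('generate_thumbnails')!({ videoId: 'abc' }, (m) => lines.push(m));
console.log(lines[0]); // → generating thumbnails for video abc
```

Keeping processors behind a single registry means the scheduling, retry, and logging machinery never needs to change when a new job type is added.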

**Hands-On Practice:**

```bash
# 1. Create a re-encode job
curl -X POST http://localhost:4100/api/media/jobs \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "type": "reencode_streaming",
    "params": { "videoId": "VIDEO_ID", "targetBitrate": 2000 },
    "priority": 5
  }'

# 2. Monitor job progress
watch -n 2 'curl -s http://localhost:4100/api/media/jobs/JOB_ID | jq ".progress"'

# 3. View job logs
curl http://localhost:4100/api/media/jobs/JOB_ID | jq -r ".log"

# 4. Check queue stats
curl http://localhost:4100/api/media/jobs/stats | jq
```

---

**Last Updated:** 2026-02-13

**Version:** V2.0

**Maintainer:** Changemaker Lite Team