Initial commit: Live Captions web application
Real-time speech-to-text using OpenAI Whisper (faster-whisper). Features browser audio capture, WebSocket streaming, and customizable display settings.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
commit
c7becf330c
33
.env.example
Normal file
@ -0,0 +1,33 @@
# Server settings
HOST=0.0.0.0
PORT=5000
DEBUG=false

# Whisper settings
WHISPER_MODEL=base
# Device: cpu or cuda (for NVIDIA GPU)
WHISPER_DEVICE=cpu
# Compute type:
# CPU: int8 (fastest), float32
# GPU: float16 (recommended), int8_float16, float32
WHISPER_COMPUTE_TYPE=int8

# Audio settings
AUDIO_CHUNK_DURATION=3
AUDIO_SAMPLE_RATE=16000

# Database
DATABASE_PATH=data/settings.db

# =============================================================================
# GPU Configuration (optional)
# =============================================================================
# To enable NVIDIA GPU support:
# 1. Install NVIDIA Container Toolkit (see CLAUDE.md for instructions)
# 2. Set WHISPER_DEVICE=cuda
# 3. Set WHISPER_COMPUTE_TYPE=float16 (recommended for GPU)
# 4. Run with: docker compose -f docker-compose.yml -f docker-compose.gpu.yml up --build
#
# Example GPU settings:
# WHISPER_DEVICE=cuda
# WHISPER_COMPUTE_TYPE=float16
28
.gitignore
vendored
Normal file
@ -0,0 +1,28 @@
# Environment
.env

# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
venv/
ENV/

# Data
data/
recordings/

# IDE
.vscode/
.idea/
*.swp
*.swo

# OS
.DS_Store
Thumbs.db

# Whisper models cache (if running locally)
.cache/
154
CLAUDE.md
Normal file
@ -0,0 +1,154 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

Live Captions is a Dockerized web application that provides real-time speech-to-text captions using OpenAI's Whisper model (via faster-whisper). It captures microphone audio in the browser, streams it to a Flask backend for transcription, and displays captions with customizable styling.

## Commands

### Development
```bash
# Build and run (primary development command)
docker compose up --build

# Run in background
docker compose up -d --build

# View logs
docker compose logs -f

# Stop
docker compose down

# Stop and remove named volumes (cached models).
# The ./data and ./recordings bind mounts are not deleted; remove them manually to reset the database.
docker compose down -v
```

### First-time setup
```bash
cp .env.example .env
docker compose up --build
```

## Architecture

```
Browser                          Docker Container
┌─────────────────────┐          ┌─────────────────────────────┐
│  MediaRecorder API  │          │  Flask + Flask-SocketIO     │
│  (1.5s audio chunks)│ ──────►  │  (app.py)                   │
│                     │ WebSocket│         │                   │
│  Caption Display    │ ◄──────  │  faster-whisper transcriber │
│  (word-by-word)     │          │  (transcriber.py)           │
│                     │          │         │                   │
│  Settings Panel     │ ──────►  │  SQLite settings persistence│
│                     │  REST API│  (database.py)              │
└─────────────────────┘          └─────────────────────────────┘
```

### Data Flow
1. Browser captures mic audio using MediaRecorder and sends base64-encoded WebM chunks every 1.5s via WebSocket
2. Backend converts WebM→WAV using pydub/ffmpeg and transcribes with faster-whisper
3. Transcribed text is sent back via the WebSocket `transcription` event
4. Frontend animates words appearing one by one for a streaming effect
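
Steps 1 and 2 depend on the shape of the `audio_data` payload. The sketch below assumes only what `app.py` shows: a dict with an `audio` value (base64 text or raw bytes) and a `format` key that defaults to `webm`. The helper names `build_audio_payload` and `decode_audio_payload` are hypothetical and do not exist in the repository.

```python
import base64


def build_audio_payload(chunk: bytes, audio_format: str = "webm") -> dict:
    """Wrap raw audio bytes in the envelope the browser client is described as sending."""
    return {"audio": base64.b64encode(chunk).decode("ascii"), "format": audio_format}


def decode_audio_payload(data: dict) -> tuple[bytes, str]:
    """Recover raw bytes and format, accepting either base64 text or raw bytes."""
    audio = data.get("audio") or b""
    if isinstance(audio, str):
        audio = base64.b64decode(audio)
    return audio, data.get("format", "webm")


# Round trip: the bytes that come out are the bytes that went in.
payload = build_audio_payload(b"\x1aE\xdf\xa3 fake webm bytes")
raw, fmt = decode_audio_payload(payload)
```

Base64 adds roughly a third to the payload size; it is used here only because the documented payload is base64 text.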

### Key Files
- **app.py**: Flask server with SocketIO WebSocket handlers and REST API for settings and recordings
- **transcriber.py**: Whisper model loading and audio transcription (singleton model instance)
- **database.py**: SQLite CRUD for user display preferences
- **recordings.py**: Saves, lists, reads, and deletes transcript recordings stored as Markdown files
- **static/js/app.js**: Audio capture, WebSocket client, word animation queue
- **static/js/settings.js**: Settings panel UI and persistence

## Configuration

Environment variables in `.env`:
- `WHISPER_MODEL`: Model size (tiny/base/small/medium/large); larger models are more accurate but slower
- `WHISPER_DEVICE`: cpu or cuda
- `WHISPER_COMPUTE_TYPE`: int8/float32 on CPU; float16/int8_float16/float32 on GPU
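
As an illustration of how these three variables fit together, here is a minimal sketch that reads them with the defaults from `.env.example` and rejects device/compute-type pairs that `.env.example` does not list. The function `load_whisper_config` and the `COMPUTE_TYPES` table are assumptions made for this example; the repository may not validate these values at all, and faster-whisper itself may accept combinations beyond the ones listed here.

```python
import os

# Compute types listed per device in the comments of .env.example.
COMPUTE_TYPES = {
    "cpu": {"int8", "float32"},
    "cuda": {"float16", "int8_float16", "float32"},
}


def load_whisper_config(env=None) -> dict:
    """Read the Whisper variables, falling back to the .env.example defaults."""
    env = os.environ if env is None else env
    config = {
        "model": env.get("WHISPER_MODEL", "base"),
        "device": env.get("WHISPER_DEVICE", "cpu"),
        "compute_type": env.get("WHISPER_COMPUTE_TYPE", "int8"),
    }
    allowed = COMPUTE_TYPES.get(config["device"])
    if allowed is None:
        raise ValueError(f"Unknown WHISPER_DEVICE: {config['device']!r}")
    if config["compute_type"] not in allowed:
        raise ValueError(
            f"{config['compute_type']!r} is not listed for device {config['device']!r}"
        )
    return config


gpu_config = load_whisper_config(
    {"WHISPER_DEVICE": "cuda", "WHISPER_COMPUTE_TYPE": "float16"}
)
```

Failing early on a mismatched pair (for example `float16` on `cpu`) gives a clearer message than an error raised during model loading.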

User display settings stored in SQLite (`data/settings.db`):
- Font family, size, weight, color, text alignment
- Background color, opacity, border radius, padding
- Max words (controls caption buffer length)
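
To make the "max words" setting concrete, the following sketch keeps only the most recent words of a running transcript. It assumes the buffer drops the oldest words first; the actual frontend logic lives in `static/js/app.js` and may differ. `CaptionBuffer` is a hypothetical name, and the default of 30 mirrors `DEFAULT_SETTINGS['max_words']` in `database.py`.

```python
from collections import deque


class CaptionBuffer:
    """Keep only the most recent `max_words` words of the transcript."""

    def __init__(self, max_words: int = 30):
        # A bounded deque discards the oldest entries automatically.
        self.words = deque(maxlen=max_words)

    def add_text(self, text: str) -> str:
        """Append the words of a new transcription and return the visible caption."""
        self.words.extend(text.split())
        return " ".join(self.words)


buffer = CaptionBuffer(max_words=5)
buffer.add_text("the quick brown fox")
caption = buffer.add_text("jumps over the lazy dog")
```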

## API Endpoints

| Endpoint | Method | Purpose |
|----------|--------|---------|
| `/` | GET | Main UI |
| `/api/health` | GET | Health check |
| `/api/settings` | GET/PUT | Read/update user settings |
| `/api/settings/reset` | POST | Reset to defaults |
| `/api/recordings` | GET | List saved recordings |
| `/api/recordings/<filename>` | GET/DELETE | Read/delete a saved recording |
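
A settings update is a `PUT` with a JSON body. The sketch below builds such a request with the standard library but does not send it, so it runs without a server. The base URL `http://localhost:5000` assumes the default `PORT`, and `build_settings_update` is a hypothetical helper, not part of the repository.

```python
import json
import urllib.request


def build_settings_update(base_url: str, changes: dict) -> urllib.request.Request:
    """Prepare (but do not send) a PUT /api/settings request with a JSON body."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/settings",
        data=json.dumps(changes).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = build_settings_update(
    "http://localhost:5000", {"font_size": 40, "max_words": 20}
)
# To send it against a running container:
#     with urllib.request.urlopen(request) as response:
#         settings = json.load(response)
```

The `Content-Type` header matters because the handler parses the body with `request.get_json()`. Keys that are not in `DEFAULT_SETTINGS` are ignored by `database.update_settings`.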
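
## WebSocket Events

| Event | Direction | Payload |
|-------|-----------|---------|
| `audio_data` | client→server | `{audio: base64, format: 'webm'}` |
| `transcription` | server→client | `{text: string}` |
| `settings_updated` | server→client | settings object |
| `error` | server→client | `{message: string}` |
| `save_recording` | client→server | `{startTime, endTime, transcript, wordCount}` |
| `recording_saved` | server→client | `{filename: string}` |
| `recording_error` | server→client | `{message: string}` |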
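
The payload shapes in the table above can be checked mechanically. This is a minimal sketch with hypothetical names (`EVENT_KEYS`, `missing_keys`) that covers only the `audio_data` and `transcription` events. It is stricter than the server, which treats `format` as optional and defaults it to `webm`.

```python
# Documented payload keys per event, taken from the table above.
EVENT_KEYS = {
    "audio_data": {"audio", "format"},
    "transcription": {"text"},
}


def missing_keys(event: str, payload: dict) -> set:
    """Return the documented keys that are absent from a payload."""
    if event not in EVENT_KEYS:
        raise KeyError(f"No documented payload shape for event {event!r}")
    return EVENT_KEYS[event] - set(payload)


incomplete = missing_keys("audio_data", {"audio": "AAAA"})
complete = missing_keys("transcription", {"text": "hello"})
```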

## Volumes

- `./data:/app/data` - SQLite database persistence
- `./recordings:/app/recordings` - Saved transcript recordings
- `whisper-models` - Cached Whisper model files (~140MB for base)

## NVIDIA GPU Support

GPU acceleration significantly improves transcription speed. Follow these steps to enable it.

### Prerequisites

1. NVIDIA GPU with CUDA support
2. NVIDIA driver installed (`nvidia-smi` should work)
3. Docker installed

### Install NVIDIA Container Toolkit

```bash
# Add NVIDIA package repository
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

# Install the toolkit
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit

# Configure Docker to use NVIDIA runtime
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

# Verify installation
docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi
```

### Configure for GPU

1. Update `.env`:
   ```env
   WHISPER_DEVICE=cuda
   WHISPER_COMPUTE_TYPE=float16
   ```

2. Run with GPU support:
   ```bash
   docker compose -f docker-compose.yml -f docker-compose.gpu.yml up --build
   ```

### GPU Compute Types

| Type | Speed | Memory | Notes |
|------|-------|--------|-------|
| `float16` | Fast | Medium | Recommended for most GPUs |
| `int8_float16` | Faster | Lower | Good balance |
| `float32` | Slower | Higher | Maximum precision |

### Troubleshooting

- **"could not select device driver"**: NVIDIA Container Toolkit not installed or Docker not restarted
- **CUDA out of memory**: Try a smaller model (`WHISPER_MODEL=small` or `tiny`)
- **Verify GPU access**: `docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi`
45
Dockerfile
Normal file
@ -0,0 +1,45 @@
FROM python:3.11-slim

# Set environment variables
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1

# Install system dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
    ffmpeg \
    libsndfile1 \
    && rm -rf /var/lib/apt/lists/*

# Create app directory
WORKDIR /app

# Create non-root user
RUN useradd -m -u 1000 appuser

# Copy requirements first for better caching
COPY requirements.txt .

# Install Python dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY . .

# Create data and recordings directories
RUN mkdir -p /app/data /app/recordings && chown -R appuser:appuser /app

# Create directory for Whisper models cache
RUN mkdir -p /home/appuser/.cache/huggingface && chown -R appuser:appuser /home/appuser

# Switch to non-root user
USER appuser

# Expose port
EXPOSE 5000

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=60s --retries=3 \
    CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:5000/api/health')" || exit 1

# Run the application
CMD ["python", "app.py"]
54
Dockerfile.gpu
Normal file
@ -0,0 +1,54 @@
# GPU-enabled Dockerfile for NVIDIA CUDA support
FROM nvidia/cuda:12.1.0-cudnn8-runtime-ubuntu22.04

# Set environment variables
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
ENV DEBIAN_FRONTEND=noninteractive

# Install Python and system dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
    python3.11 \
    python3.11-venv \
    python3-pip \
    ffmpeg \
    libsndfile1 \
    && rm -rf /var/lib/apt/lists/*

# Set Python 3.11 as default
RUN update-alternatives --install /usr/bin/python python /usr/bin/python3.11 1 \
    && update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.11 1

# Create app directory
WORKDIR /app

# Create non-root user
RUN useradd -m -u 1000 appuser

# Copy requirements first for better caching
COPY requirements.txt .

# Install Python dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY . .

# Create data and recordings directories
RUN mkdir -p /app/data /app/recordings && chown -R appuser:appuser /app

# Create directory for Whisper models cache
RUN mkdir -p /home/appuser/.cache/huggingface && chown -R appuser:appuser /home/appuser

# Switch to non-root user
USER appuser

# Expose port
EXPOSE 5000

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=60s --retries=3 \
    CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:5000/api/health')" || exit 1

# Run the application
CMD ["python", "app.py"]
3
README.MD
Normal file
@ -0,0 +1,3 @@
Live Captions

Live Captions displays real-time captions on screen in a small, customizable browser window, with everything running locally.
235
app.py
Normal file
@ -0,0 +1,235 @@
"""
Live Captions - Flask Application

A web-based live captioning application using Whisper for speech recognition.
"""

import base64
import os
import logging
from datetime import datetime

from flask import Flask, render_template, jsonify, request
from flask_socketio import SocketIO, emit
from dotenv import load_dotenv

import database
import transcriber
import recordings

# Load environment variables
load_dotenv()

# Configure logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger(__name__)

# Initialize Flask app
app = Flask(__name__)
app.config['SECRET_KEY'] = os.environ.get('SECRET_KEY', 'live-captions-secret')

# Initialize SocketIO with gevent
socketio = SocketIO(
    app,
    cors_allowed_origins="*",
    async_mode='gevent'
)


# =============================================================================
# Routes
# =============================================================================

@app.route('/')
def index():
    """Serve the main page."""
    return render_template('index.html')


@app.route('/api/health')
def health():
    """Health check endpoint."""
    return jsonify({'status': 'healthy'})


@app.route('/api/settings', methods=['GET'])
def get_settings():
    """Get current user settings."""
    settings = database.get_settings()
    return jsonify(settings)


@app.route('/api/settings', methods=['PUT'])
def update_settings():
    """Update user settings."""
    data = request.get_json()
    if not data:
        return jsonify({'error': 'No data provided'}), 400

    settings = database.update_settings(data)

    # Broadcast settings update to all clients
    socketio.emit('settings_updated', settings)

    return jsonify(settings)


@app.route('/api/settings/reset', methods=['POST'])
def reset_settings():
    """Reset settings to defaults."""
    settings = database.reset_settings()

    # Broadcast settings update to all clients
    socketio.emit('settings_updated', settings)

    return jsonify(settings)


@app.route('/api/recordings', methods=['GET'])
def list_recordings():
    """List all saved recordings."""
    return jsonify(recordings.list_recordings())


@app.route('/api/recordings/<filename>', methods=['GET'])
def get_recording(filename):
    """Get a specific recording's content."""
    recording = recordings.get_recording(filename)
    if recording:
        return jsonify(recording)
    return jsonify({'error': 'Recording not found'}), 404


@app.route('/api/recordings/<filename>', methods=['DELETE'])
def delete_recording(filename):
    """Delete a specific recording."""
    if recordings.delete_recording(filename):
        return jsonify({'success': True})
    return jsonify({'error': 'Failed to delete recording'}), 400


# =============================================================================
# WebSocket Events
# =============================================================================

@socketio.on('connect')
def handle_connect():
    """Handle client connection."""
    logger.info(f"Client connected: {request.sid}")
    # Send current settings to the newly connected client
    settings = database.get_settings()
    emit('settings_updated', settings)


@socketio.on('disconnect')
def handle_disconnect():
    """Handle client disconnection."""
    logger.info(f"Client disconnected: {request.sid}")


@socketio.on('audio_data')
def handle_audio_data(data):
    """
    Handle incoming audio data from client.

    Args:
        data: Dictionary containing 'audio' (base64 or bytes) and 'format'
    """
    try:
        audio_bytes = data.get('audio')
        audio_format = data.get('format', 'webm')

        if not audio_bytes:
            return

        # Handle base64 encoded audio
        if isinstance(audio_bytes, str):
            audio_bytes = base64.b64decode(audio_bytes)

        # Transcribe audio
        text = transcriber.transcribe_audio(audio_bytes, format=audio_format)

        if text:
            logger.info(f"Transcription: {text}")
            emit('transcription', {'text': text})

    except Exception as e:
        logger.error(f"Error processing audio: {e}")
        emit('error', {'message': 'Failed to process audio'})


@socketio.on('save_recording')
def handle_save_recording(data):
    """Handle saving a recording session."""
    client_id = request.sid

    try:
        # Parse timestamps from client.
        # Fallbacks are timezone-aware so they can be subtracted from a parsed
        # 'Z'-suffixed client timestamp without raising TypeError.
        start_time_str = data.get('startTime')
        end_time_str = data.get('endTime')

        if start_time_str:
            start_time = datetime.fromisoformat(start_time_str.replace('Z', '+00:00'))
        else:
            start_time = datetime.now().astimezone()

        if end_time_str:
            end_time = datetime.fromisoformat(end_time_str.replace('Z', '+00:00'))
        else:
            end_time = datetime.now().astimezone()

        transcript = data.get('transcript', '')
        word_count = data.get('wordCount', 0)

        # Save the recording
        filename = recordings.save_recording(
            start_time=start_time,
            end_time=end_time,
            transcript=transcript,
            word_count=word_count,
            client_id=client_id
        )

        if filename:
            logger.info(f"Recording saved: {filename}")
            emit('recording_saved', {'filename': filename})
        else:
            emit('recording_error', {'message': 'Failed to save recording'})

    except Exception as e:
        logger.error(f"Error saving recording: {e}")
        emit('recording_error', {'message': str(e)})


# =============================================================================
# Startup
# =============================================================================

def initialize():
    """Initialize application components."""
    logger.info("Initializing Live Captions...")

    # Initialize database
    database.init_db()
    logger.info("Database initialized")

    # Preload Whisper model
    logger.info("Preloading Whisper model (this may take a moment)...")
    if transcriber.preload_model():
        logger.info("Whisper model ready")
    else:
        logger.warning("Failed to preload Whisper model")


if __name__ == '__main__':
    initialize()

    host = os.environ.get('HOST', '0.0.0.0')
    port = int(os.environ.get('PORT', 5000))
    debug = os.environ.get('DEBUG', 'false').lower() == 'true'

    logger.info(f"Starting Live Captions on {host}:{port}")
    socketio.run(app, host=host, port=port, debug=debug)
168
database.py
Normal file
@ -0,0 +1,168 @@
"""
SQLite database module for user settings persistence.
"""

import sqlite3
import os
from datetime import datetime

# Default settings
DEFAULT_SETTINGS = {
    'font_family': 'Arial, sans-serif',
    'font_size': 32,
    'font_weight': 'normal',
    'text_color': '#ffffff',
    'background_color': '#1a1a2e',
    'background_opacity': 0.9,
    'max_words': 30,
    'text_align': 'center',
    'padding': 20,
    'border_radius': 10,
}


def get_db_path():
    """Get database path from environment or use default."""
    return os.environ.get('DATABASE_PATH', 'data/settings.db')


def get_connection():
    """Create a database connection."""
    db_path = get_db_path()

    # Ensure directory exists (skip when the path has no directory component,
    # since os.makedirs('') raises FileNotFoundError)
    db_dir = os.path.dirname(db_path)
    if db_dir:
        os.makedirs(db_dir, exist_ok=True)

    conn = sqlite3.connect(db_path)
    conn.row_factory = sqlite3.Row
    return conn


def init_db():
    """Initialize the database with the settings table."""
    conn = get_connection()
    cursor = conn.cursor()

    # Check if table exists
    cursor.execute("SELECT name FROM sqlite_master WHERE type='table' AND name='user_settings'")
    table_exists = cursor.fetchone() is not None

    if table_exists:
        # Check if we need to migrate from max_lines to max_words
        cursor.execute("PRAGMA table_info(user_settings)")
        columns = [col[1] for col in cursor.fetchall()]

        if 'max_lines' in columns and 'max_words' not in columns:
            # Add max_words column
            cursor.execute('ALTER TABLE user_settings ADD COLUMN max_words INTEGER DEFAULT 30')
            conn.commit()

        # Remove old columns that are no longer needed (fade_delay, max_lines)
        # SQLite doesn't support DROP COLUMN easily, so we just ignore old columns
    else:
        # Create settings table
        cursor.execute('''
            CREATE TABLE IF NOT EXISTS user_settings (
                id INTEGER PRIMARY KEY DEFAULT 1,
                font_family TEXT DEFAULT 'Arial, sans-serif',
                font_size INTEGER DEFAULT 32,
                font_weight TEXT DEFAULT 'normal',
                text_color TEXT DEFAULT '#ffffff',
                background_color TEXT DEFAULT '#1a1a2e',
                background_opacity REAL DEFAULT 0.9,
                max_words INTEGER DEFAULT 30,
                text_align TEXT DEFAULT 'center',
                padding INTEGER DEFAULT 20,
                border_radius INTEGER DEFAULT 10,
                created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
                updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
            )
        ''')

    # Insert default settings if table is empty
    cursor.execute('SELECT COUNT(*) FROM user_settings')
    if cursor.fetchone()[0] == 0:
        columns = ', '.join(DEFAULT_SETTINGS.keys())
        placeholders = ', '.join(['?' for _ in DEFAULT_SETTINGS])
        cursor.execute(
            f'INSERT INTO user_settings ({columns}) VALUES ({placeholders})',
            list(DEFAULT_SETTINGS.values())
        )

    conn.commit()
    conn.close()


def get_settings():
    """Fetch current user settings."""
    conn = get_connection()
    cursor = conn.cursor()

    cursor.execute('SELECT * FROM user_settings WHERE id = 1')
    row = cursor.fetchone()
    conn.close()

    if row:
        # Convert to dict and exclude id, timestamps, and legacy columns
        settings = dict(row)
        for key in ['id', 'created_at', 'updated_at', 'max_lines', 'fade_delay']:
            settings.pop(key, None)

        # Ensure max_words exists (for migration)
        if 'max_words' not in settings:
            settings['max_words'] = DEFAULT_SETTINGS['max_words']

        return settings

    return DEFAULT_SETTINGS.copy()


def update_settings(settings_dict):
    """Update user settings with provided values."""
    if not settings_dict:
        return get_settings()

    conn = get_connection()
    cursor = conn.cursor()

    # Build UPDATE query with only valid columns.
    # Column names come from the DEFAULT_SETTINGS whitelist, never from user input.
    valid_columns = set(DEFAULT_SETTINGS.keys())
    updates = []
    values = []

    for key, value in settings_dict.items():
        if key in valid_columns:
            updates.append(f'{key} = ?')
            values.append(value)

    if updates:
        updates.append('updated_at = ?')
        values.append(datetime.now().isoformat())

        query = f'UPDATE user_settings SET {", ".join(updates)} WHERE id = 1'
        cursor.execute(query, values)
        conn.commit()

    conn.close()
    return get_settings()


def reset_settings():
    """Reset all settings to defaults."""
    conn = get_connection()
    cursor = conn.cursor()

    # Delete existing and insert defaults
    cursor.execute('DELETE FROM user_settings')

    columns = ', '.join(DEFAULT_SETTINGS.keys())
    placeholders = ', '.join(['?' for _ in DEFAULT_SETTINGS])
    cursor.execute(
        f'INSERT INTO user_settings ({columns}) VALUES ({placeholders})',
        list(DEFAULT_SETTINGS.values())
    )

    conn.commit()
    conn.close()

    return DEFAULT_SETTINGS.copy()
21
docker-compose.gpu.yml
Normal file
@ -0,0 +1,21 @@
# GPU override for docker-compose
# Usage: docker compose -f docker-compose.yml -f docker-compose.gpu.yml up --build
#
# Prerequisites:
# 1. NVIDIA GPU with driver installed
# 2. NVIDIA Container Toolkit installed
# 3. Set WHISPER_DEVICE=cuda in .env
# 4. Set WHISPER_COMPUTE_TYPE=float16 in .env (recommended)

services:
  live-captions:
    build:
      context: .
      dockerfile: Dockerfile.gpu
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
29
docker-compose.yml
Normal file
@ -0,0 +1,29 @@
services:
  live-captions:
    build: .
    container_name: live-captions
    ports:
      - "${PORT:-5000}:5000"
    volumes:
      # Persist SQLite database
      - ./data:/app/data
      # Persist Whisper models
      - whisper-models:/home/appuser/.cache/huggingface
      # Persist recordings
      - ./recordings:/app/recordings
    env_file:
      - .env
    environment:
      - HOST=0.0.0.0
      - PORT=5000
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "python", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:5000/api/health')"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 60s

volumes:
  whisper-models:
    name: live-captions-whisper-models
208
recordings.py
Normal file
@ -0,0 +1,208 @@
"""
Recording session management and file saving.
"""

import os
import logging
from datetime import datetime
from typing import Optional

logger = logging.getLogger(__name__)

# Default recordings directory
RECORDINGS_DIR = os.environ.get('RECORDINGS_PATH', '/app/recordings')


def ensure_recordings_dir():
    """Ensure the recordings directory exists."""
    os.makedirs(RECORDINGS_DIR, exist_ok=True)
    return RECORDINGS_DIR


def generate_filename(start_time: datetime) -> str:
    """
    Generate a filename from the session start time.
    Format: YYYY-MM-DD_HH-MM-SS_captions.md
    """
    return start_time.strftime('%Y-%m-%d_%H-%M-%S_captions.md')


def calculate_duration(start_time: datetime, end_time: datetime) -> str:
    """Calculate and format duration as HH:MM:SS."""
    delta = end_time - start_time
    total_seconds = int(delta.total_seconds())
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def get_whisper_model_name() -> str:
    """Get the configured Whisper model name."""
    return os.environ.get('WHISPER_MODEL', 'base')


def save_recording(
    start_time: datetime,
    end_time: datetime,
    transcript: str,
    word_count: int,
    client_id: str
) -> Optional[str]:
    """
    Save a recording session to a markdown file.

    Args:
        start_time: Session start datetime
        end_time: Session end datetime
        transcript: Full transcript text
        word_count: Number of words in transcript
        client_id: WebSocket client session ID

    Returns:
        Filename if successful, None if failed
    """
    try:
        ensure_recordings_dir()

        filename = generate_filename(start_time)
        filepath = os.path.join(RECORDINGS_DIR, filename)

        duration = calculate_duration(start_time, end_time)
        model_name = get_whisper_model_name()

        # Build markdown content with frontmatter
        content = f"""---
session_start: {start_time.isoformat()}
session_end: {end_time.isoformat()}
duration: {duration}
whisper_model: {model_name}
word_count: {word_count}
---

# Live Captions Recording

**Session Start:** {start_time.strftime('%Y-%m-%d %H:%M:%S')}
**Session End:** {end_time.strftime('%Y-%m-%d %H:%M:%S')}
**Duration:** {duration}
**Model:** {model_name}
**Words:** {word_count}

---

## Transcript

{transcript}
"""

        with open(filepath, 'w', encoding='utf-8') as f:
            f.write(content)

        logger.info(f"Recording saved: {filename} ({word_count} words)")
        return filename

    except Exception as e:
        logger.error(f"Failed to save recording: {e}")
        return None


def list_recordings() -> list:
    """
    List all recording files, sorted by date descending.

    Returns:
        List of recording metadata dicts
    """
    ensure_recordings_dir()
    recordings = []

    try:
        for filename in os.listdir(RECORDINGS_DIR):
            if filename.endswith('_captions.md'):
                filepath = os.path.join(RECORDINGS_DIR, filename)
                stat = os.stat(filepath)

                # Parse date from filename (YYYY-MM-DD_HH-MM-SS_captions.md)
                try:
                    date_str = filename.replace('_captions.md', '')
                    date_parts = date_str.split('_')
                    display_date = f"{date_parts[0]} {date_parts[1].replace('-', ':')}"
                except (IndexError, ValueError):
                    display_date = filename

                recordings.append({
                    'filename': filename,
                    'date': display_date,
                    'size': stat.st_size,
                    'created': datetime.fromtimestamp(stat.st_mtime).isoformat()
                })

        # Sort by filename descending (newest first)
        recordings.sort(key=lambda x: x['filename'], reverse=True)

    except Exception as e:
        logger.error(f"Failed to list recordings: {e}")

    return recordings


def get_recording(filename: str) -> Optional[dict]:
    """
    Get a specific recording's content.

    Args:
        filename: The recording filename

    Returns:
        Dict with filename and content, or None if not found
    """
    ensure_recordings_dir()

    # Sanitize filename to prevent path traversal
    safe_filename = os.path.basename(filename)
    if not safe_filename.endswith('_captions.md'):
        return None

    filepath = os.path.join(RECORDINGS_DIR, safe_filename)

    try:
        if os.path.exists(filepath):
            with open(filepath, 'r', encoding='utf-8') as f:
                content = f.read()
            return {
                'filename': safe_filename,
                'content': content
            }
    except Exception as e:
        logger.error(f"Failed to read recording {safe_filename}: {e}")

    return None


def delete_recording(filename: str) -> bool:
    """
    Delete a specific recording.

    Args:
        filename: The recording filename

    Returns:
        True if deleted, False otherwise
    """
    ensure_recordings_dir()

    # Sanitize filename to prevent path traversal
    safe_filename = os.path.basename(filename)
    if not safe_filename.endswith('_captions.md'):
        return False

    filepath = os.path.join(RECORDINGS_DIR, safe_filename)

    try:
        if os.path.exists(filepath):
            os.remove(filepath)
            logger.info(f"Recording deleted: {safe_filename}")
            return True
    except Exception as e:
        logger.error(f"Failed to delete recording {safe_filename}: {e}")

    return False
9
requirements.txt
Normal file
@ -0,0 +1,9 @@
flask>=3.0.0
flask-socketio>=5.3.0
faster-whisper>=1.0.0
pydub>=0.25.1
python-dotenv>=1.0.0
python-engineio>=4.8.0
python-socketio>=5.10.0
gevent>=24.2.1
gevent-websocket>=0.10.1
567
static/css/style.css
Normal file
@ -0,0 +1,567 @@
|
||||
/**
|
||||
* Live Captions - Stylesheet
|
||||
*/
|
||||
|
||||
/* Reset and Base */
|
||||
* {
|
||||
margin: 0;
|
||||
padding: 0;
|
||||
box-sizing: border-box;
|
||||
}
|
||||
|
||||
:root {
|
||||
--bg-primary: #0d0d1a;
|
||||
--bg-secondary: #1a1a2e;
|
||||
--bg-tertiary: #252542;
|
||||
--text-primary: #ffffff;
|
||||
--text-secondary: #a0a0b0;
|
||||
--accent: #4a9eff;
|
||||
--accent-hover: #6ab0ff;
|
||||
--danger: #ff4a6a;
|
||||
--danger-hover: #ff6a85;
|
||||
--success: #4aff8a;
|
||||
--warning: #ffa64a;
|
||||
--border-radius: 8px;
|
||||
--transition: 0.2s ease;
|
||||
}
|
||||
|
||||
html, body {
|
||||
height: 100%;
|
||||
font-family: 'Segoe UI', system-ui, -apple-system, sans-serif;
|
||||
background-color: var(--bg-primary);
|
||||
color: var(--text-primary);
|
||||
}
|
||||
|
||||
#app {
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
height: 100vh;
|
||||
padding: 20px;
|
||||
}
|
||||
|
||||
/* Caption Container */
|
||||
#caption-container {
|
||||
flex: 1;
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
justify-content: center;
|
||||
background-color: rgba(26, 26, 46, 0.9);
|
||||
border-radius: 10px;
|
||||
padding: 20px;
|
||||
margin-bottom: 20px;
|
||||
overflow: hidden;
|
||||
font-size: 32px;
|
||||
font-family: Arial, sans-serif;
|
||||
text-align: center;
|
||||
}
|
||||
|
||||
#captions {
|
||||
line-height: 1.4;
|
||||
word-wrap: break-word;
|
||||
overflow-wrap: break-word;
|
||||
}
|
||||
|
||||
/* Controls Bar */
|
||||
#controls {
|
||||
display: flex;
|
||||
gap: 10px;
|
||||
justify-content: center;
|
||||
align-items: center;
|
||||
padding: 15px;
|
||||
background-color: var(--bg-secondary);
|
||||
border-radius: var(--border-radius);
|
||||
}
|
||||
|
||||
/* Buttons */
|
||||
.btn {
|
||||
display: inline-flex;
|
||||
align-items: center;
|
||||
gap: 8px;
|
||||
padding: 12px 24px;
|
||||
border: none;
|
||||
border-radius: var(--border-radius);
|
||||
font-size: 16px;
|
||||
font-weight: 500;
|
||||
cursor: pointer;
|
||||
transition: background-color var(--transition), transform var(--transition);
|
||||
}
|
||||
|
||||
.btn:hover:not(:disabled) {
|
||||
transform: translateY(-1px);
|
||||
}
|
||||
|
||||
.btn:disabled {
|
||||
opacity: 0.5;
|
||||
cursor: not-allowed;
|
||||
}
|
||||
|
||||
.btn-primary {
|
||||
background-color: var(--accent);
|
||||
color: white;
|
||||
}
|
||||
|
||||
.btn-primary:hover:not(:disabled) {
|
||||
background-color: var(--accent-hover);
|
||||
}
|
||||
|
||||
.btn-danger {
|
||||
background-color: var(--danger);
|
||||
color: white;
|
||||
}
|
||||
|
||||
.btn-danger:hover:not(:disabled) {
|
||||
background-color: var(--danger-hover);
|
||||
}
|
||||
|
||||
.btn-secondary {
|
||||
background-color: var(--bg-tertiary);
|
||||
color: var(--text-primary);
|
||||
}
|
||||
|
||||
.btn-secondary:hover:not(:disabled) {
|
||||
background-color: #323258;
|
||||
}
|
||||
|
||||
.btn-success {
|
||||
background-color: var(--success);
|
||||
color: #000;
|
||||
}
|
||||
|
||||
.btn-success:hover:not(:disabled) {
|
||||
background-color: #5aff9a;
|
||||
}
|
||||
|
||||
/* Toggle Switch */
|
||||
.toggle-switch {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: 10px;
|
||||
cursor: pointer;
|
||||
user-select: none;
|
||||
padding: 8px 12px;
|
||||
background-color: var(--bg-tertiary);
|
||||
border-radius: var(--border-radius);
|
||||
}
|
||||
|
||||
.toggle-switch input {
|
||||
display: none;
|
||||
}
|
||||
|
||||
.toggle-slider {
|
||||
position: relative;
|
||||
width: 44px;
|
||||
height: 24px;
|
||||
background-color: var(--text-secondary);
|
||||
border-radius: 12px;
|
||||
transition: background-color var(--transition);
|
||||
}
|
||||
|
||||
.toggle-slider::before {
|
||||
content: '';
|
||||
position: absolute;
|
||||
top: 3px;
|
||||
left: 3px;
|
||||
width: 18px;
|
||||
height: 18px;
|
||||
background-color: white;
|
||||
border-radius: 50%;
|
||||
transition: transform var(--transition);
|
||||
}
|
||||
|
||||
.toggle-switch input:checked + .toggle-slider {
|
||||
background-color: var(--success);
|
||||
}
|
||||
|
||||
.toggle-switch input:checked + .toggle-slider::before {
|
||||
transform: translateX(20px);
|
||||
}
|
||||
|
||||
.toggle-label {
|
||||
font-size: 14px;
|
||||
font-weight: 500;
|
||||
color: var(--text-primary);
|
||||
}
|
||||
|
||||
.btn-icon {
|
||||
width: 48px;
|
||||
height: 48px;
|
||||
padding: 0;
|
||||
display: flex;
|
||||
align-items: center;
|
||||
justify-content: center;
|
||||
background-color: var(--bg-tertiary);
|
||||
color: var(--text-primary);
|
||||
font-size: 20px;
|
||||
}
|
||||
|
||||
.btn-icon:hover:not(:disabled) {
|
||||
background-color: #323258;
|
||||
}
|
||||
|
||||
.icon {
|
||||
font-size: 14px;
|
||||
}
|
||||
|
||||
/* Status Indicator */
|
||||
#status {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: 8px;
|
||||
justify-content: center;
|
||||
padding: 10px;
|
||||
font-size: 14px;
|
||||
color: var(--text-secondary);
|
||||
}
|
||||
|
||||
.dot {
|
||||
width: 10px;
|
||||
height: 10px;
|
||||
border-radius: 50%;
|
||||
background-color: var(--text-secondary);
|
||||
}
|
||||
|
||||
.dot.connected {
|
||||
background-color: var(--success);
|
||||
}
|
||||
|
||||
.dot.recording {
|
||||
background-color: var(--danger);
|
||||
animation: pulse 1s infinite;
|
||||
}
|
||||
|
||||
.dot.disconnected {
|
||||
background-color: var(--text-secondary);
|
||||
}
|
||||
|
||||
.dot.error {
|
||||
background-color: var(--warning);
|
||||
}
|
||||
|
||||
@keyframes pulse {
|
||||
0%, 100% { opacity: 1; }
|
||||
50% { opacity: 0.5; }
|
||||
}
|
||||
|
||||
/* Settings Panel */
|
||||
.panel {
|
||||
position: fixed;
|
||||
top: 0;
|
||||
right: 0;
|
||||
width: 350px;
|
||||
height: 100vh;
|
||||
background-color: var(--bg-secondary);
|
||||
box-shadow: -5px 0 20px rgba(0, 0, 0, 0.3);
|
||||
z-index: 1000;
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
transition: transform var(--transition);
|
||||
}
|
||||
|
||||
.panel.hidden {
|
||||
transform: translateX(100%);
|
||||
}
|
||||
|
||||
.panel-header {
|
||||
display: flex;
|
||||
justify-content: space-between;
|
||||
align-items: center;
|
||||
padding: 20px;
|
||||
border-bottom: 1px solid var(--bg-tertiary);
|
||||
}
|
||||
|
||||
.panel-header h2 {
|
||||
font-size: 20px;
|
||||
font-weight: 600;
|
||||
}
|
||||
|
||||
.btn-close {
|
||||
width: 36px;
|
||||
height: 36px;
|
||||
border: none;
|
||||
background: var(--bg-tertiary);
|
||||
color: var(--text-primary);
|
||||
font-size: 24px;
|
||||
border-radius: 50%;
|
||||
cursor: pointer;
|
||||
transition: background-color var(--transition);
|
||||
}
|
||||
|
||||
.btn-close:hover {
|
||||
background-color: var(--danger);
|
||||
}
|
||||
|
||||
.panel-content {
|
||||
flex: 1;
|
||||
overflow-y: auto;
|
||||
padding: 20px;
|
||||
}
|
||||
|
||||
/* Settings Groups */
|
||||
.setting-group {
|
||||
margin-bottom: 25px;
|
||||
}
|
||||
|
||||
.setting-group h3 {
|
||||
font-size: 14px;
|
||||
font-weight: 600;
|
||||
color: var(--accent);
|
||||
text-transform: uppercase;
|
||||
letter-spacing: 1px;
|
||||
margin-bottom: 15px;
|
||||
}
|
||||
|
||||
.setting-group label {
|
||||
display: block;
|
||||
font-size: 14px;
|
||||
color: var(--text-secondary);
|
||||
margin-bottom: 5px;
|
||||
margin-top: 12px;
|
||||
}
|
||||
|
||||
.setting-group label:first-of-type {
|
||||
margin-top: 0;
|
||||
}
|
||||
|
||||
/* Form Controls */
|
||||
select,
|
||||
input[type="text"] {
|
||||
width: 100%;
|
||||
padding: 10px 12px;
|
||||
background-color: var(--bg-tertiary);
|
||||
border: 1px solid transparent;
|
||||
border-radius: var(--border-radius);
|
||||
color: var(--text-primary);
|
||||
font-size: 14px;
|
||||
transition: border-color var(--transition);
|
||||
}
|
||||
|
||||
select:focus,
|
||||
input[type="text"]:focus {
|
||||
outline: none;
|
||||
border-color: var(--accent);
|
||||
}
|
||||
|
||||
input[type="range"] {
|
||||
width: 100%;
|
||||
height: 6px;
|
||||
background: var(--bg-tertiary);
|
||||
border-radius: 3px;
|
||||
appearance: none;
|
||||
cursor: pointer;
|
||||
}
|
||||
|
||||
input[type="range"]::-webkit-slider-thumb {
|
||||
appearance: none;
|
||||
width: 18px;
|
||||
height: 18px;
|
||||
background: var(--accent);
|
||||
border-radius: 50%;
|
||||
cursor: pointer;
|
||||
transition: background-color var(--transition);
|
||||
}
|
||||
|
||||
input[type="range"]::-webkit-slider-thumb:hover {
|
||||
background: var(--accent-hover);
|
||||
}
|
||||
|
||||
input[type="range"]::-moz-range-thumb {
|
||||
width: 18px;
|
||||
height: 18px;
|
||||
background: var(--accent);
|
||||
border-radius: 50%;
|
||||
border: none;
|
||||
cursor: pointer;
|
||||
}
|
||||
|
||||
input[type="color"] {
|
||||
width: 100%;
|
||||
height: 40px;
|
||||
padding: 2px;
|
||||
background-color: var(--bg-tertiary);
|
||||
border: 1px solid transparent;
|
||||
border-radius: var(--border-radius);
|
||||
cursor: pointer;
|
||||
}
|
||||
|
||||
input[type="color"]::-webkit-color-swatch-wrapper {
|
||||
padding: 0;
|
||||
}
|
||||
|
||||
input[type="color"]::-webkit-color-swatch {
|
||||
border: none;
|
||||
border-radius: calc(var(--border-radius) - 3px);
|
||||
}
|
||||
|
||||
/* Setting Actions */
|
||||
.setting-actions {
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
gap: 10px;
|
||||
margin-top: 20px;
|
||||
padding-top: 20px;
|
||||
border-top: 1px solid var(--bg-tertiary);
|
||||
}
|
||||
|
||||
.setting-actions .btn {
|
||||
width: 100%;
|
||||
justify-content: center;
|
||||
}
|
||||
|
||||
/* Overlay */
|
||||
#overlay {
|
||||
position: fixed;
|
||||
top: 0;
|
||||
left: 0;
|
||||
width: 100%;
|
||||
height: 100%;
|
||||
background-color: rgba(0, 0, 0, 0.5);
|
||||
z-index: 999;
|
||||
transition: opacity var(--transition);
|
||||
}
|
||||
|
||||
#overlay.hidden {
|
||||
opacity: 0;
|
||||
pointer-events: none;
|
||||
}
|
||||
|
||||
/* Scrollbar Styling */
|
||||
::-webkit-scrollbar {
|
||||
width: 8px;
|
||||
}
|
||||
|
||||
::-webkit-scrollbar-track {
|
||||
background: var(--bg-tertiary);
|
||||
border-radius: 4px;
|
||||
}
|
||||
|
||||
::-webkit-scrollbar-thumb {
|
||||
background: var(--text-secondary);
|
||||
border-radius: 4px;
|
||||
}
|
||||
|
||||
::-webkit-scrollbar-thumb:hover {
|
||||
background: var(--accent);
|
||||
}
|
||||
|
||||
/* Recordings Panel */
|
||||
.recordings-list {
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
gap: 8px;
|
||||
}
|
||||
|
||||
.recordings-empty {
|
||||
color: var(--text-secondary);
|
||||
text-align: center;
|
||||
padding: 40px 20px;
|
||||
}
|
||||
|
||||
.recording-item {
|
||||
display: flex;
|
||||
justify-content: space-between;
|
||||
align-items: center;
|
||||
padding: 12px 15px;
|
||||
background-color: var(--bg-tertiary);
|
||||
border-radius: var(--border-radius);
|
||||
cursor: pointer;
|
||||
transition: background-color var(--transition);
|
||||
}
|
||||
|
||||
.recording-item:hover {
|
||||
background-color: #323258;
|
||||
}
|
||||
|
||||
.recording-info {
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
gap: 4px;
|
||||
}
|
||||
|
||||
.recording-date {
|
||||
font-size: 14px;
|
||||
font-weight: 500;
|
||||
color: var(--text-primary);
|
||||
}
|
||||
|
||||
.recording-meta {
|
||||
font-size: 12px;
|
||||
color: var(--text-secondary);
|
||||
}
|
||||
|
||||
.recording-arrow {
|
||||
color: var(--text-secondary);
|
||||
font-size: 18px;
|
||||
}
|
||||
|
||||
/* Recording Viewer */
|
||||
.recording-viewer {
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
height: 100%;
|
||||
}
|
||||
|
||||
.recording-viewer.hidden {
|
||||
display: none;
|
||||
}
|
||||
|
||||
.viewer-header {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: 12px;
|
||||
margin-bottom: 15px;
|
||||
padding-bottom: 15px;
|
||||
border-bottom: 1px solid var(--bg-tertiary);
|
||||
}
|
||||
|
||||
.viewer-filename {
|
||||
font-size: 12px;
|
||||
color: var(--text-secondary);
|
||||
overflow: hidden;
|
||||
text-overflow: ellipsis;
|
||||
white-space: nowrap;
|
||||
}
|
||||
|
||||
.viewer-content {
|
||||
flex: 1;
|
||||
overflow-y: auto;
|
||||
padding: 15px;
|
||||
background-color: var(--bg-tertiary);
|
||||
border-radius: var(--border-radius);
|
||||
font-size: 14px;
|
||||
line-height: 1.6;
|
||||
white-space: pre-wrap;
|
||||
word-wrap: break-word;
|
||||
}
|
||||
|
||||
.viewer-actions {
|
||||
display: flex;
|
||||
justify-content: flex-end;
|
||||
margin-top: 15px;
|
||||
padding-top: 15px;
|
||||
border-top: 1px solid var(--bg-tertiary);
|
||||
}
|
||||
|
||||
.btn-small {
|
||||
padding: 8px 16px;
|
||||
font-size: 13px;
|
||||
}
|
||||
|
||||
/* Responsive */
|
||||
@media (max-width: 600px) {
|
||||
#app {
|
||||
padding: 10px;
|
||||
}
|
||||
|
||||
.panel {
|
||||
width: 100%;
|
||||
}
|
||||
|
||||
#controls {
|
||||
flex-wrap: wrap;
|
||||
}
|
||||
|
||||
.btn {
|
||||
padding: 10px 16px;
|
||||
font-size: 14px;
|
||||
}
|
||||
}
|
||||
355
static/js/app.js
Normal file
@ -0,0 +1,355 @@
|
||||
/**
|
||||
* Live Captions - Main Application
|
||||
* Handles audio capture and WebSocket communication
|
||||
*/
|
||||
|
||||
const App = {
|
||||
// WebSocket connection
|
||||
socket: null,
|
||||
|
||||
// Audio recording
|
||||
mediaRecorder: null,
|
||||
audioStream: null,
|
||||
audioChunks: [],
|
||||
isRecording: false,
|
||||
recordingInterval: null,
|
||||
|
||||
// Continuous caption stream
|
||||
wordBuffer: [],
|
||||
pendingWords: [],
|
||||
wordAnimationTimer: null,
|
||||
|
||||
// Auto-save recording state
|
||||
sessionStartTime: null,
|
||||
sessionTranscript: [],
|
||||
|
||||
// DOM elements
|
||||
elements: {},
|
||||
|
||||
/**
|
||||
* Initialize the application
|
||||
*/
|
||||
init() {
|
||||
this.cacheElements();
|
||||
this.bindEvents();
|
||||
this.connectSocket();
|
||||
|
||||
// Initialize settings module
|
||||
Settings.init();
|
||||
},
|
||||
|
||||
/**
|
||||
* Cache DOM element references
|
||||
*/
|
||||
cacheElements() {
|
||||
this.elements = {
|
||||
btnStart: document.getElementById('btn-start'),
|
||||
btnStop: document.getElementById('btn-stop'),
|
||||
btnClear: document.getElementById('btn-clear'),
|
||||
autoSaveToggle: document.getElementById('auto-save-toggle'),
|
||||
captions: document.getElementById('captions'),
|
||||
statusDot: document.getElementById('status-dot'),
|
||||
statusText: document.getElementById('status-text'),
|
||||
};
|
||||
},
|
||||
|
||||
/**
|
||||
* Bind event listeners
|
||||
*/
|
||||
bindEvents() {
|
||||
this.elements.btnStart.addEventListener('click', () => this.startRecording());
|
||||
this.elements.btnStop.addEventListener('click', () => this.stopRecording());
|
||||
this.elements.btnClear.addEventListener('click', () => this.clearCaptions());
|
||||
|
||||
// Load auto-save preference from localStorage
|
||||
const savedPref = localStorage.getItem('autoSaveEnabled');
|
||||
if (savedPref === 'true') {
|
||||
this.elements.autoSaveToggle.checked = true;
|
||||
}
|
||||
|
||||
// Save preference when toggled
|
||||
this.elements.autoSaveToggle.addEventListener('change', (e) => {
|
||||
localStorage.setItem('autoSaveEnabled', e.target.checked);
|
||||
});
|
||||
},
|
||||
|
||||
/**
|
||||
* Connect to WebSocket server
|
||||
*/
|
||||
connectSocket() {
|
||||
this.socket = io();
|
||||
|
||||
this.socket.on('connect', () => {
|
||||
console.log('Connected to server');
|
||||
this.setStatus('connected', 'Connected');
|
||||
});
|
||||
|
||||
this.socket.on('disconnect', () => {
|
||||
console.log('Disconnected from server');
|
||||
this.setStatus('disconnected', 'Disconnected');
|
||||
});
|
||||
|
||||
this.socket.on('transcription', (data) => {
|
||||
this.addWords(data.text);
|
||||
});
|
||||
|
||||
this.socket.on('settings_updated', (settings) => {
|
||||
Settings.applySettings(settings);
|
||||
});
|
||||
|
||||
this.socket.on('error', (data) => {
|
||||
console.error('Server error:', data.message);
|
||||
});
|
||||
|
||||
this.socket.on('recording_saved', (data) => {
|
||||
console.log('Recording saved:', data.filename);
|
||||
});
|
||||
|
||||
this.socket.on('recording_error', (data) => {
|
||||
console.error('Recording error:', data.message);
|
||||
});
|
||||
},
|
||||
|
||||
/**
|
||||
* Update status indicator
|
||||
*/
|
||||
setStatus(state, text) {
|
||||
this.elements.statusDot.className = `dot ${state}`;
|
||||
this.elements.statusText.textContent = text;
|
||||
},
|
||||
|
||||
/**
|
||||
* Start audio recording
|
||||
*/
|
||||
async startRecording() {
|
||||
try {
|
||||
this.audioStream = await navigator.mediaDevices.getUserMedia({
|
||||
audio: {
|
||||
echoCancellation: true,
|
||||
noiseSuppression: true,
|
||||
sampleRate: 16000,
|
||||
}
|
||||
});
|
||||
|
||||
this.isRecording = true;
|
||||
this.elements.btnStart.disabled = true;
|
||||
this.elements.btnStop.disabled = false;
|
||||
this.setStatus('recording', 'Recording...');
|
||||
|
||||
// Reset session transcript for auto-save
|
||||
this.sessionStartTime = new Date();
|
||||
this.sessionTranscript = [];
|
||||
|
||||
// Start the recording cycle
|
||||
this.startRecordingCycle();
|
||||
|
||||
} catch (error) {
|
||||
console.error('Error starting recording:', error);
|
||||
this.setStatus('error', 'Microphone access denied');
|
||||
}
|
||||
},
|
||||
|
||||
/**
|
||||
* Start a recording cycle - record for a duration, then send and restart
|
||||
*/
|
||||
startRecordingCycle() {
|
||||
if (!this.isRecording || !this.audioStream) return;
|
||||
|
||||
// Determine best supported MIME type
|
||||
let mimeType = 'audio/webm';
|
||||
if (MediaRecorder.isTypeSupported('audio/webm;codecs=opus')) {
|
||||
mimeType = 'audio/webm;codecs=opus';
|
||||
}
|
||||
|
||||
this.audioChunks = [];
|
||||
this.mediaRecorder = new MediaRecorder(this.audioStream, { mimeType });
|
||||
|
||||
this.mediaRecorder.ondataavailable = (event) => {
|
||||
if (event.data.size > 0) {
|
||||
this.audioChunks.push(event.data);
|
||||
}
|
||||
};
|
||||
|
||||
this.mediaRecorder.onstop = () => {
|
||||
// Create a complete blob from all chunks
|
||||
if (this.audioChunks.length > 0) {
|
||||
const blob = new Blob(this.audioChunks, { type: 'audio/webm' });
|
||||
this.sendAudioBlob(blob);
|
||||
}
|
||||
|
||||
// Start next cycle if still recording
|
||||
if (this.isRecording) {
|
||||
this.startRecordingCycle();
|
||||
}
|
||||
};
|
||||
|
||||
// Start recording
|
||||
this.mediaRecorder.start();
|
||||
|
||||
// Stop after a fixed window so each cycle yields a complete blob
|
||||
// Hard-coded to 1.5 seconds (not read from settings) for more responsive streaming
|
||||
const chunkDuration = 1500;
|
||||
this.recordingInterval = setTimeout(() => {
|
||||
if (this.mediaRecorder && this.mediaRecorder.state === 'recording') {
|
||||
this.mediaRecorder.stop();
|
||||
}
|
||||
}, chunkDuration);
|
||||
},
|
||||
|
||||
/**
|
||||
* Stop audio recording
|
||||
*/
|
||||
stopRecording() {
|
||||
this.isRecording = false;
|
||||
|
||||
// Cancel the pending chunk timer (recordingInterval holds a setTimeout id)
|
||||
if (this.recordingInterval) {
|
||||
clearTimeout(this.recordingInterval);
|
||||
this.recordingInterval = null;
|
||||
}
|
||||
|
||||
// Stop the media recorder
|
||||
if (this.mediaRecorder && this.mediaRecorder.state === 'recording') {
|
||||
this.mediaRecorder.stop();
|
||||
}
|
||||
|
||||
// Stop all tracks
|
||||
if (this.audioStream) {
|
||||
this.audioStream.getTracks().forEach(track => track.stop());
|
||||
this.audioStream = null;
|
||||
}
|
||||
|
||||
this.elements.btnStart.disabled = false;
|
||||
this.elements.btnStop.disabled = true;
|
||||
this.setStatus('connected', 'Connected');
|
||||
|
||||
// Auto-save if enabled and we have content (transcriptions still in flight arrive after this point and are not included)
|
||||
if (this.elements.autoSaveToggle.checked && this.sessionTranscript.length > 0) {
|
||||
this.saveRecording();
|
||||
}
|
||||
},
|
||||
|
||||
/**
|
||||
* Send complete audio blob to server
|
||||
*/
|
||||
sendAudioBlob(blob) {
|
||||
const reader = new FileReader();
|
||||
reader.onloadend = () => {
|
||||
// Get base64 data without the data URL prefix
|
||||
const base64 = reader.result.split(',')[1];
|
||||
|
||||
this.socket.emit('audio_data', {
|
||||
audio: base64,
|
||||
format: 'webm'
|
||||
});
|
||||
};
|
||||
reader.readAsDataURL(blob);
|
||||
},
|
||||
|
||||
/**
|
||||
* Add words to the continuous caption stream
|
||||
*/
|
||||
addWords(text) {
|
||||
if (typeof text !== 'string' || !text.trim()) return;
|
||||
|
||||
// Split incoming text into words
|
||||
const newWords = text.trim().split(/\s+/);
|
||||
|
||||
// Add to pending queue for animated display
|
||||
this.pendingWords.push(...newWords);
|
||||
|
||||
// Accumulate to session transcript for auto-save
|
||||
if (this.isRecording) {
|
||||
this.sessionTranscript.push(...newWords);
|
||||
}
|
||||
|
||||
// Start animation if not already running
|
||||
if (!this.wordAnimationTimer) {
|
||||
this.animateNextWord();
|
||||
}
|
||||
},
|
||||
|
||||
/**
|
||||
* Animate words appearing one by one
|
||||
*/
|
||||
animateNextWord() {
|
||||
if (this.pendingWords.length === 0) {
|
||||
this.wordAnimationTimer = null;
|
||||
return;
|
||||
}
|
||||
|
||||
// Get next word from queue
|
||||
const word = this.pendingWords.shift();
|
||||
this.wordBuffer.push(word);
|
||||
|
||||
// Get max words from settings
|
||||
const maxWords = Settings.current.max_words || 30;
|
||||
|
||||
// Trim buffer to max words
|
||||
while (this.wordBuffer.length > maxWords) {
|
||||
this.wordBuffer.shift();
|
||||
}
|
||||
|
||||
// Update display
|
||||
this.updateCaptionDisplay();
|
||||
|
||||
// Calculate delay based on pending words
|
||||
// Faster if more words pending, slower if caught up
|
||||
const baseDelay = 80; // ms per word
|
||||
const minDelay = 30;
|
||||
const delay = this.pendingWords.length > 10 ? minDelay : baseDelay;
|
||||
|
||||
// Schedule next word
|
||||
this.wordAnimationTimer = setTimeout(() => {
|
||||
this.animateNextWord();
|
||||
}, delay);
|
||||
},
|
||||
|
||||
/**
|
||||
* Update the caption display with current word buffer
|
||||
*/
|
||||
updateCaptionDisplay() {
|
||||
const text = this.wordBuffer.join(' ');
|
||||
this.elements.captions.textContent = text;
|
||||
},
|
||||
|
||||
/**
|
||||
* Clear all captions
|
||||
*/
|
||||
clearCaptions() {
|
||||
// Clear animation timer
|
||||
if (this.wordAnimationTimer) {
|
||||
clearTimeout(this.wordAnimationTimer);
|
||||
this.wordAnimationTimer = null;
|
||||
}
|
||||
this.wordBuffer = [];
|
||||
this.pendingWords = [];
|
||||
this.elements.captions.textContent = '';
|
||||
},
|
||||
|
||||
/**
|
||||
* Save the current recording session
|
||||
*/
|
||||
saveRecording() {
|
||||
if (!this.sessionStartTime) return;
|
||||
|
||||
const endTime = new Date();
|
||||
const transcript = this.sessionTranscript.join(' ');
|
||||
|
||||
this.socket.emit('save_recording', {
|
||||
startTime: this.sessionStartTime.toISOString(),
|
||||
endTime: endTime.toISOString(),
|
||||
transcript: transcript,
|
||||
wordCount: this.sessionTranscript.length
|
||||
});
|
||||
|
||||
// Reset session state
|
||||
this.sessionStartTime = null;
|
||||
this.sessionTranscript = [];
|
||||
}
|
||||
};
|
||||
|
||||
// Initialize when DOM is ready
|
||||
document.addEventListener('DOMContentLoaded', () => {
|
||||
App.init();
|
||||
});
|
||||
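The pacing logic in `App.animateNextWord()` above can be read in isolation as two small pure functions. This is only a sketch for illustration: `nextDelay` and `stepCaption` are hypothetical names that do not exist in the commit, the constants are copied from `app.js`, and `maxWords` is assumed to be at least 1 (the application falls back to 30 when the setting is missing).

```javascript
// Hypothetical helpers mirroring App.animateNextWord(); not part of the commit.
const BASE_DELAY_MS = 80;      // normal pace per word
const MIN_DELAY_MS = 30;       // faster pace when a backlog builds up
const BACKLOG_THRESHOLD = 10;  // pending words above this count as a backlog

// Pick the delay before the next word is shown.
function nextDelay(pendingCount) {
    return pendingCount > BACKLOG_THRESHOLD ? MIN_DELAY_MS : BASE_DELAY_MS;
}

// Move one word from the pending queue into the visible buffer,
// keeping only the newest maxWords words. Returns null when idle.
function stepCaption(wordBuffer, pendingWords, maxWords) {
    if (pendingWords.length === 0) return null;
    const [word, ...rest] = pendingWords;
    const buffer = [...wordBuffer, word].slice(-maxWords);
    return { wordBuffer: buffer, pendingWords: rest, delay: nextDelay(rest.length) };
}

const step = stepCaption(['hello', 'there'], ['general', 'kenobi'], 2);
console.log(step.wordBuffer.join(' ')); // "there general"
console.log(step.delay);                // 80
```

Keeping the step free of timers and DOM access makes the trimming and pacing rules easy to check separately from the `setTimeout` loop.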
204
static/js/recordings.js
Normal file
@ -0,0 +1,204 @@
|
||||
/**
|
||||
* Live Captions - Recordings Panel
|
||||
* Handles viewing and managing saved recordings
|
||||
*/
|
||||
|
||||
const Recordings = {
|
||||
// Current state
|
||||
recordings: [],
|
||||
currentRecording: null,
|
||||
|
||||
// DOM elements
|
||||
elements: {},
|
||||
|
||||
/**
|
||||
* Initialize the recordings panel
|
||||
*/
|
||||
init() {
|
||||
this.cacheElements();
|
||||
this.bindEvents();
|
||||
},
|
||||
|
||||
/**
|
||||
* Cache DOM element references
|
||||
*/
|
||||
cacheElements() {
|
||||
this.elements = {
|
||||
btnRecordings: document.getElementById('btn-recordings'),
|
||||
btnClose: document.getElementById('btn-close-recordings'),
|
||||
btnBackToList: document.getElementById('btn-back-to-list'),
|
||||
btnDelete: document.getElementById('btn-delete-recording'),
|
||||
panel: document.getElementById('recordings-panel'),
|
||||
overlay: document.getElementById('overlay'),
|
||||
recordingsList: document.getElementById('recordings-list'),
|
||||
recordingViewer: document.getElementById('recording-viewer'),
|
||||
viewerFilename: document.getElementById('viewer-filename'),
|
||||
viewerContent: document.getElementById('viewer-content'),
|
||||
};
|
||||
},
|
||||
|
||||
/**
|
||||
* Bind event listeners
|
||||
*/
|
||||
bindEvents() {
|
||||
this.elements.btnRecordings.addEventListener('click', () => this.openPanel());
|
||||
this.elements.btnClose.addEventListener('click', () => this.closePanel());
|
||||
this.elements.btnBackToList.addEventListener('click', () => this.showList());
|
||||
this.elements.btnDelete.addEventListener('click', () => this.deleteCurrentRecording());
|
||||
|
||||
// Close on overlay click (but check if it's not settings panel)
|
||||
this.elements.overlay.addEventListener('click', () => {
|
||||
if (!this.elements.panel.classList.contains('hidden')) {
|
||||
this.closePanel();
|
||||
}
|
||||
});
|
||||
},
|
||||
|
||||
/**
|
||||
* Open the recordings panel
|
||||
*/
|
||||
openPanel() {
|
||||
this.elements.panel.classList.remove('hidden');
|
||||
this.elements.overlay.classList.remove('hidden');
|
||||
this.showList();
|
||||
this.loadRecordings();
|
||||
},
|
||||
|
||||
/**
|
||||
* Close the recordings panel
|
||||
*/
|
||||
closePanel() {
|
||||
this.elements.panel.classList.add('hidden');
|
||||
this.elements.overlay.classList.add('hidden');
|
||||
this.currentRecording = null;
|
||||
},
|
||||
|
||||
/**
|
||||
* Show the recordings list view
|
||||
*/
|
||||
showList() {
|
||||
this.elements.recordingsList.classList.remove('hidden');
|
||||
this.elements.recordingViewer.classList.add('hidden');
|
||||
},
|
||||
|
||||
/**
|
||||
* Show the recording viewer
|
||||
*/
|
||||
showViewer() {
|
||||
this.elements.recordingsList.classList.add('hidden');
|
||||
this.elements.recordingViewer.classList.remove('hidden');
|
||||
},
|
||||
|
||||
/**
|
||||
* Load recordings from the API
|
||||
*/
|
||||
async loadRecordings() {
|
||||
this.elements.recordingsList.innerHTML = '<p class="recordings-empty">Loading recordings...</p>';
|
||||
|
||||
try {
|
||||
const response = await fetch('/api/recordings');
|
||||
if (!response.ok) throw new Error('Failed to load recordings');
|
||||
|
||||
this.recordings = await response.json();
|
||||
this.renderRecordingsList();
|
||||
} catch (error) {
|
||||
console.error('Error loading recordings:', error);
|
||||
this.elements.recordingsList.innerHTML =
|
||||
'<p class="recordings-empty">Failed to load recordings</p>';
|
||||
}
|
||||
},
|
||||
|
||||
/**
|
||||
* Render the recordings list
|
||||
*/
|
||||
renderRecordingsList() {
|
||||
if (this.recordings.length === 0) {
|
||||
this.elements.recordingsList.innerHTML =
|
||||
'<p class="recordings-empty">No recordings yet.<br>Enable auto-save and record some captions!</p>';
|
||||
return;
|
||||
}
|
||||
|
||||
        // Build items with DOM APIs so filenames and dates are never parsed as HTML
        this.elements.recordingsList.innerHTML = '';

        this.recordings.forEach(recording => {
            const item = document.createElement('div');
            item.className = 'recording-item';
            item.dataset.filename = recording.filename;

            const info = document.createElement('div');
            info.className = 'recording-info';

            const date = document.createElement('span');
            date.className = 'recording-date';
            date.textContent = recording.date;

            const meta = document.createElement('span');
            meta.className = 'recording-meta';
            meta.textContent = this.formatFileSize(recording.size);

            const arrow = document.createElement('span');
            arrow.className = 'recording-arrow';
            arrow.textContent = '›';

            info.append(date, meta);
            item.append(info, arrow);

            // Open the viewer when an item is clicked
            item.addEventListener('click', () => this.viewRecording(recording.filename));
            this.elements.recordingsList.appendChild(item);
        });
|
||||
},
|
||||
|
||||
/**
|
||||
* Format file size in human-readable format
|
||||
*/
|
||||
formatFileSize(bytes) {
|
||||
if (bytes < 1024) return bytes + ' B';
|
||||
if (bytes < 1024 * 1024) return (bytes / 1024).toFixed(1) + ' KB';
|
||||
return (bytes / (1024 * 1024)).toFixed(1) + ' MB';
|
||||
},
|
||||
|
||||
/**
|
||||
* View a specific recording
|
||||
*/
|
||||
async viewRecording(filename) {
|
||||
try {
|
||||
const response = await fetch(`/api/recordings/${encodeURIComponent(filename)}`);
|
||||
if (!response.ok) throw new Error('Failed to load recording');
|
||||
|
||||
const data = await response.json();
|
||||
this.currentRecording = filename;
|
||||
this.elements.viewerFilename.textContent = filename;
|
||||
this.elements.viewerContent.textContent = data.content;
|
||||
this.showViewer();
|
||||
} catch (error) {
|
||||
console.error('Error loading recording:', error);
|
||||
alert('Failed to load recording');
|
||||
}
|
||||
},
|
||||
|
||||
/**
|
||||
* Delete the currently viewed recording
|
||||
*/
|
||||
async deleteCurrentRecording() {
|
||||
if (!this.currentRecording) return;
|
||||
|
||||
if (!confirm('Are you sure you want to delete this recording?')) {
|
||||
return;
|
||||
}
|
||||
|
||||
try {
|
||||
const response = await fetch(`/api/recordings/${encodeURIComponent(this.currentRecording)}`, {
|
||||
method: 'DELETE'
|
||||
});
|
||||
|
||||
if (!response.ok) throw new Error('Failed to delete recording');
|
||||
|
||||
// Remove from local list
|
||||
this.recordings = this.recordings.filter(r => r.filename !== this.currentRecording);
|
||||
this.currentRecording = null;
|
||||
|
||||
// Go back to list
|
||||
this.showList();
|
||||
this.renderRecordingsList();
|
||||
} catch (error) {
|
||||
console.error('Error deleting recording:', error);
|
||||
alert('Failed to delete recording');
|
||||
}
|
||||
}
|
||||
};
|
||||
|
||||
// Initialize when DOM is ready
|
||||
document.addEventListener('DOMContentLoaded', () => {
|
||||
Recordings.init();
|
||||
});
|
||||
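As a quick check of the thresholds in `Recordings.formatFileSize()` above, the function can be copied out and run on its own. The standalone copy below is for illustration only; the sample byte counts are arbitrary values, not data from the application.

```javascript
// Standalone copy of Recordings.formatFileSize() for illustration.
function formatFileSize(bytes) {
    if (bytes < 1024) return bytes + ' B';
    if (bytes < 1024 * 1024) return (bytes / 1024).toFixed(1) + ' KB';
    return (bytes / (1024 * 1024)).toFixed(1) + ' MB';
}

console.log(formatFileSize(512));     // "512 B"
console.log(formatFileSize(1536));    // "1.5 KB"
console.log(formatFileSize(5242880)); // "5.0 MB"
```

Caption transcripts are plain Markdown text, so most recordings would likely fall in the B or KB range.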
259
static/js/settings.js
Normal file
@ -0,0 +1,259 @@
|
||||
/**
 * Settings Panel Module
 * Handles user settings UI and persistence
 */

const Settings = {
    // Current settings state
    current: {},

    // DOM elements
    elements: {},

    /**
     * Initialize the settings module
     */
    init() {
        this.cacheElements();
        this.bindEvents();
    },

    /**
     * Cache DOM element references
     */
    cacheElements() {
        this.elements = {
            panel: document.getElementById('settings-panel'),
            overlay: document.getElementById('overlay'),
            btnSettings: document.getElementById('btn-settings'),
            btnClose: document.getElementById('btn-close-settings'),
            btnSave: document.getElementById('btn-save-settings'),
            btnReset: document.getElementById('btn-reset-settings'),

            // Text settings
            fontFamily: document.getElementById('font-family'),
            fontSize: document.getElementById('font-size'),
            fontSizeValue: document.getElementById('font-size-value'),
            fontWeight: document.getElementById('font-weight'),
            textColor: document.getElementById('text-color'),
            textAlign: document.getElementById('text-align'),

            // Background settings
            backgroundColor: document.getElementById('background-color'),
            backgroundOpacity: document.getElementById('background-opacity'),
            opacityValue: document.getElementById('opacity-value'),
            borderRadius: document.getElementById('border-radius'),
            radiusValue: document.getElementById('radius-value'),
            padding: document.getElementById('padding'),
            paddingValue: document.getElementById('padding-value'),

            // Behavior settings
            maxWords: document.getElementById('max-words'),
            maxWordsValue: document.getElementById('max-words-value'),

            // Caption display
            captionContainer: document.getElementById('caption-container'),
        };
    },

    /**
     * Bind event listeners
     */
    bindEvents() {
        // Panel open/close
        this.elements.btnSettings.addEventListener('click', () => this.openPanel());
        this.elements.btnClose.addEventListener('click', () => this.closePanel());
        this.elements.overlay.addEventListener('click', () => this.closePanel());

        // Save/Reset
        this.elements.btnSave.addEventListener('click', () => this.saveSettings());
        this.elements.btnReset.addEventListener('click', () => this.resetSettings());

        // Live preview on input change
        const inputs = [
            'fontFamily', 'fontSize', 'fontWeight', 'textColor', 'textAlign',
            'backgroundColor', 'backgroundOpacity', 'borderRadius', 'padding',
            'maxWords'
        ];

        inputs.forEach(name => {
            const element = this.elements[name];
            if (element) {
                element.addEventListener('input', () => this.updatePreview());
            }
        });

        // Update value displays for range inputs
        this.elements.fontSize.addEventListener('input', (e) => {
            this.elements.fontSizeValue.textContent = e.target.value;
        });
        this.elements.backgroundOpacity.addEventListener('input', (e) => {
            this.elements.opacityValue.textContent = e.target.value;
        });
        this.elements.borderRadius.addEventListener('input', (e) => {
            this.elements.radiusValue.textContent = e.target.value;
        });
        this.elements.padding.addEventListener('input', (e) => {
            this.elements.paddingValue.textContent = e.target.value;
        });
        this.elements.maxWords.addEventListener('input', (e) => {
            this.elements.maxWordsValue.textContent = e.target.value;
        });
    },

    /**
     * Open settings panel
     */
    openPanel() {
        this.elements.panel.classList.remove('hidden');
        this.elements.overlay.classList.remove('hidden');
    },

    /**
     * Close settings panel
     */
    closePanel() {
        this.elements.panel.classList.add('hidden');
        this.elements.overlay.classList.add('hidden');
    },

    /**
     * Apply settings to the UI
     */
    applySettings(settings) {
        this.current = settings;

        // Update form values
        this.elements.fontFamily.value = settings.font_family;
        this.elements.fontSize.value = settings.font_size;
        this.elements.fontSizeValue.textContent = settings.font_size;
        this.elements.fontWeight.value = settings.font_weight;
        this.elements.textColor.value = settings.text_color;
        this.elements.textAlign.value = settings.text_align;

        this.elements.backgroundColor.value = settings.background_color;
        this.elements.backgroundOpacity.value = Math.round(settings.background_opacity * 100);
        this.elements.opacityValue.textContent = Math.round(settings.background_opacity * 100);
        this.elements.borderRadius.value = settings.border_radius;
        this.elements.radiusValue.textContent = settings.border_radius;
        this.elements.padding.value = settings.padding;
        this.elements.paddingValue.textContent = settings.padding;

        this.elements.maxWords.value = settings.max_words || 30;
        this.elements.maxWordsValue.textContent = settings.max_words || 30;

        // Apply to caption container
        this.updatePreview();
    },

    /**
     * Update live preview of caption styling
     */
    updatePreview() {
        const container = this.elements.captionContainer;
        const opacity = this.elements.backgroundOpacity.value / 100;

        // Parse the #rrggbb background color into RGB components and apply opacity
        const bgColor = this.elements.backgroundColor.value;
        const r = parseInt(bgColor.slice(1, 3), 16);
        const g = parseInt(bgColor.slice(3, 5), 16);
        const b = parseInt(bgColor.slice(5, 7), 16);

        container.style.fontFamily = this.elements.fontFamily.value;
        container.style.fontSize = `${this.elements.fontSize.value}px`;
        container.style.fontWeight = this.elements.fontWeight.value;
        container.style.color = this.elements.textColor.value;
        container.style.textAlign = this.elements.textAlign.value;
        container.style.backgroundColor = `rgba(${r}, ${g}, ${b}, ${opacity})`;
        container.style.borderRadius = `${this.elements.borderRadius.value}px`;
        container.style.padding = `${this.elements.padding.value}px`;

        // Store max words for caption management
        this.current.max_words = parseInt(this.elements.maxWords.value, 10);
    },

    /**
     * Get current form values as settings object
     */
    getFormValues() {
        return {
            font_family: this.elements.fontFamily.value,
            font_size: parseInt(this.elements.fontSize.value, 10),
            font_weight: this.elements.fontWeight.value,
            text_color: this.elements.textColor.value,
            text_align: this.elements.textAlign.value,
            background_color: this.elements.backgroundColor.value,
            background_opacity: this.elements.backgroundOpacity.value / 100,
            border_radius: parseInt(this.elements.borderRadius.value, 10),
            padding: parseInt(this.elements.padding.value, 10),
            max_words: parseInt(this.elements.maxWords.value, 10),
        };
    },

    /**
     * Save settings to server
     */
    async saveSettings() {
        const settings = this.getFormValues();

        try {
            const response = await fetch('/api/settings', {
                method: 'PUT',
                headers: {
                    'Content-Type': 'application/json',
                },
                body: JSON.stringify(settings),
            });

            if (response.ok) {
                this.current = await response.json();
                this.closePanel();
                console.log('Settings saved');
            } else {
                console.error('Failed to save settings');
            }
        } catch (error) {
            console.error('Error saving settings:', error);
        }
    },

    /**
     * Reset settings to defaults
     */
    async resetSettings() {
        if (!confirm('Reset all settings to defaults?')) {
            return;
        }

        try {
            const response = await fetch('/api/settings/reset', {
                method: 'POST',
            });

            if (response.ok) {
                const settings = await response.json();
                this.applySettings(settings);
                console.log('Settings reset to defaults');
            } else {
                console.error('Failed to reset settings');
            }
        } catch (error) {
            console.error('Error resetting settings:', error);
        }
    },

    /**
     * Fetch settings from server
     */
    async fetchSettings() {
        try {
            const response = await fetch('/api/settings');
            if (response.ok) {
                const settings = await response.json();
                this.applySettings(settings);
            }
        } catch (error) {
            console.error('Error fetching settings:', error);
        }
    }
};
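The live preview in `updatePreview()` splits the `#rrggbb` value of the color input into red, green, and blue components and combines them with the opacity slider (0 to 100) into a CSS `rgba()` string. The same conversion can be sketched in Python, for example if the backend ever needed to reproduce the caption style. The function name `hex_to_rgba` is hypothetical and is not part of this commit.

```python
def hex_to_rgba(hex_color: str, opacity_percent: int) -> str:
    """Convert '#rrggbb' plus a 0-100 opacity into a CSS rgba() string."""
    if len(hex_color) != 7 or not hex_color.startswith("#"):
        raise ValueError(f"expected '#rrggbb', got {hex_color!r}")
    # Each channel is two hexadecimal digits, as in the JavaScript slices
    r = int(hex_color[1:3], 16)
    g = int(hex_color[3:5], 16)
    b = int(hex_color[5:7], 16)
    # Clamp the slider value, then scale it to the 0-1 alpha range
    alpha = max(0, min(100, opacity_percent)) / 100
    return f"rgba({r}, {g}, {b}, {alpha})"
```

Unlike the browser code, this sketch rejects malformed input instead of producing `NaN` components.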
159
templates/index.html
Normal file
@ -0,0 +1,159 @@
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Live Captions</title>
    <link rel="stylesheet" href="/static/css/style.css">
</head>
<body>
    <div id="app">
        <!-- Caption Display Area -->
        <div id="caption-container">
            <div id="captions"></div>
        </div>

        <!-- Controls Bar -->
        <div id="controls">
            <button id="btn-start" class="btn btn-primary">
                <span class="icon">►</span> Start
            </button>
            <button id="btn-stop" class="btn btn-danger" disabled>
                <span class="icon">■</span> Stop
            </button>
            <button id="btn-clear" class="btn btn-secondary">
                Clear
            </button>
            <label class="toggle-switch" title="Auto-save recordings">
                <input type="checkbox" id="auto-save-toggle">
                <span class="toggle-slider"></span>
                <span class="toggle-label">Auto-save</span>
            </label>
            <button id="btn-recordings" class="btn btn-icon" title="Recordings">
                📋
            </button>
            <button id="btn-settings" class="btn btn-icon" title="Settings">
                ⚙
            </button>
        </div>

        <!-- Status Indicator -->
        <div id="status">
            <span id="status-dot" class="dot"></span>
            <span id="status-text">Ready</span>
        </div>

        <!-- Settings Panel -->
        <div id="settings-panel" class="panel hidden">
            <div class="panel-header">
                <h2>Settings</h2>
                <button id="btn-close-settings" class="btn-close">×</button>
            </div>
            <div class="panel-content">
                <!-- Font Settings -->
                <div class="setting-group">
                    <h3>Text</h3>

                    <label for="font-family">Font Family</label>
                    <select id="font-family">
                        <option value="Arial, sans-serif">Arial</option>
                        <option value="'Helvetica Neue', Helvetica, sans-serif">Helvetica</option>
                        <option value="'Segoe UI', sans-serif">Segoe UI</option>
                        <option value="'Roboto', sans-serif">Roboto</option>
                        <option value="'Open Sans', sans-serif">Open Sans</option>
                        <option value="Georgia, serif">Georgia</option>
                        <option value="'Times New Roman', serif">Times New Roman</option>
                        <option value="'Courier New', monospace">Courier New</option>
                        <option value="monospace">Monospace</option>
                    </select>

                    <label for="font-size">Font Size: <span id="font-size-value">32</span>px</label>
                    <input type="range" id="font-size" min="16" max="72" value="32">

                    <label for="font-weight">Font Weight</label>
                    <select id="font-weight">
                        <option value="normal">Normal</option>
                        <option value="bold">Bold</option>
                        <option value="lighter">Light</option>
                    </select>

                    <label for="text-color">Text Color</label>
                    <input type="color" id="text-color" value="#ffffff">

                    <label for="text-align">Text Alignment</label>
                    <select id="text-align">
                        <option value="left">Left</option>
                        <option value="center">Center</option>
                        <option value="right">Right</option>
                    </select>
                </div>

                <!-- Background Settings -->
                <div class="setting-group">
                    <h3>Background</h3>

                    <label for="background-color">Background Color</label>
                    <input type="color" id="background-color" value="#1a1a2e">

                    <label for="background-opacity">Opacity: <span id="opacity-value">90</span>%</label>
                    <input type="range" id="background-opacity" min="0" max="100" value="90">

                    <label for="border-radius">Corner Radius: <span id="radius-value">10</span>px</label>
                    <input type="range" id="border-radius" min="0" max="30" value="10">

                    <label for="padding">Padding: <span id="padding-value">20</span>px</label>
                    <input type="range" id="padding" min="5" max="50" value="20">
                </div>

                <!-- Caption Behavior -->
                <div class="setting-group">
                    <h3>Behavior</h3>

                    <label for="max-words">Max Words: <span id="max-words-value">30</span></label>
                    <input type="range" id="max-words" min="1" max="100" value="30">
                </div>

                <!-- Actions -->
                <div class="setting-actions">
                    <button id="btn-save-settings" class="btn btn-primary">Save Settings</button>
                    <button id="btn-reset-settings" class="btn btn-secondary">Reset to Defaults</button>
                </div>
            </div>
        </div>

        <!-- Recordings Panel -->
        <div id="recordings-panel" class="panel hidden">
            <div class="panel-header">
                <h2>Recordings</h2>
                <button id="btn-close-recordings" class="btn-close">×</button>
            </div>
            <div class="panel-content">
                <!-- Recordings List -->
                <div id="recordings-list" class="recordings-list">
                    <p class="recordings-empty">Loading recordings...</p>
                </div>

                <!-- Recording Viewer -->
                <div id="recording-viewer" class="recording-viewer hidden">
                    <div class="viewer-header">
                        <button id="btn-back-to-list" class="btn btn-secondary btn-small">← Back</button>
                        <span id="viewer-filename" class="viewer-filename"></span>
                    </div>
                    <div id="viewer-content" class="viewer-content"></div>
                    <div class="viewer-actions">
                        <button id="btn-delete-recording" class="btn btn-danger btn-small">Delete</button>
                    </div>
                </div>
            </div>
        </div>

        <!-- Overlay for panels -->
        <div id="overlay" class="hidden"></div>
    </div>

    <script src="https://cdnjs.cloudflare.com/ajax/libs/socket.io/4.7.2/socket.io.min.js"></script>
    <script src="/static/js/settings.js"></script>
    <script src="/static/js/recordings.js"></script>
    <script src="/static/js/app.js"></script>
</body>
</html>
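The range inputs in this template bound each numeric setting in the browser (font size 16 to 72, opacity 0 to 100, corner radius 0 to 30, padding 5 to 50, max words 1 to 100), but a client can still send arbitrary values to `PUT /api/settings`. The following is a hedged sketch of how a server-side handler might clamp a payload to the same ranges. `LIMITS` and `clamp_settings` are hypothetical names; the validation the real endpoint performs, if any, is not shown in this section.

```python
# Ranges mirror the <input type="range"> attributes in templates/index.html.
# background_opacity is sent as a 0-1 fraction by getFormValues().
LIMITS = {
    "font_size": (16, 72),
    "background_opacity": (0.0, 1.0),
    "border_radius": (0, 30),
    "padding": (5, 50),
    "max_words": (1, 100),
}


def clamp_settings(payload: dict) -> dict:
    """Return a copy of payload with known numeric fields clamped to LIMITS."""
    cleaned = dict(payload)
    for key, (low, high) in LIMITS.items():
        if key in cleaned:
            cleaned[key] = max(low, min(high, cleaned[key]))
    return cleaned
```

Fields without an entry in `LIMITS`, such as `text_color`, pass through unchanged in this sketch and would need their own checks.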
102
transcriber.py
Normal file
@ -0,0 +1,102 @@
"""
Whisper transcription module using faster-whisper.
"""

import os
import io
import tempfile
import logging
from faster_whisper import WhisperModel
from pydub import AudioSegment

logger = logging.getLogger(__name__)

# Global model instance (loaded once)
_model = None


def get_model():
    """Get or initialize the Whisper model."""
    global _model

    if _model is None:
        model_size = os.environ.get('WHISPER_MODEL', 'base')
        device = os.environ.get('WHISPER_DEVICE', 'cpu')
        compute_type = os.environ.get('WHISPER_COMPUTE_TYPE', 'int8')

        logger.info(f"Loading Whisper model: {model_size} on {device} ({compute_type})")

        _model = WhisperModel(
            model_size,
            device=device,
            compute_type=compute_type
        )

        logger.info("Whisper model loaded successfully")

    return _model


def transcribe_audio(audio_bytes, format='webm'):
    """
    Transcribe audio bytes to text.

    Args:
        audio_bytes: Raw audio data
        format: Audio format (default: webm)

    Returns:
        Transcribed text string
    """
    if not audio_bytes:
        return ""

    try:
        # Decode the incoming audio container (e.g. WebM) with pydub
        audio = AudioSegment.from_file(
            io.BytesIO(audio_bytes),
            format=format
        )

        # Resample to 16 kHz mono, the input format Whisper models expect
        audio = audio.set_frame_rate(16000).set_channels(1)

        # Reserve a temporary WAV file whose path is passed to faster-whisper
        with tempfile.NamedTemporaryFile(suffix='.wav', delete=False) as tmp:
            tmp_path = tmp.name

        try:
            # Export inside the try block so the file is removed even if export fails
            audio.export(tmp_path, format='wav')
            model = get_model()
            segments, info = model.transcribe(
                tmp_path,
                beam_size=5,
                vad_filter=True,
                vad_parameters=dict(
                    min_silence_duration_ms=500
                )
            )

            # Combine all segments into text
            text = ' '.join(segment.text.strip() for segment in segments)
            return text.strip()

        finally:
            # Clean up temp file
            if os.path.exists(tmp_path):
                os.unlink(tmp_path)

    except Exception as e:
        logger.error(f"Transcription error: {e}")
        return ""


def preload_model():
    """Preload the model during startup."""
    try:
        get_model()
        return True
    except Exception as e:
        logger.error(f"Failed to preload model: {e}")
        return False
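`transcribe_audio()` writes the converted audio to a temporary WAV file and deletes it in a `finally` block. The same cleanup guarantee can be packaged as a context manager. The following is a minimal standard-library sketch; `temp_wav_path` is a hypothetical helper that does not exist in this commit.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def temp_wav_path():
    """Yield the path of an empty temporary .wav file and remove it on exit."""
    fd, path = tempfile.mkstemp(suffix=".wav")
    os.close(fd)  # Close our handle so other code can open the path freely
    try:
        yield path
    finally:
        # Runs on success and when the with-block raises
        if os.path.exists(path):
            os.unlink(path)
```

Inside the `with` block a caller would export the audio to the yielded path and pass that path to the model; the file is removed whether or not transcription succeeds.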