API & FAQ
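The REST API runs on localhost:8000, with a WebSocket for real-time events. Endpoints return JSON, except the media endpoints (/api/frame, /api/clip-thumbnail, /api/clip-stream/{id}), which return image or video data.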
API Endpoints
POST /api/scan - Start a pipeline scan on selected MXF files
    Request body: { "files": ["path/to/file.mxf"], "export_dir": "/output" }
GET /api/scan/status - Poll current pipeline progress (stage, frame, percentage)
POST /api/scan/abort - Abort a running pipeline scan; partial results are preserved
GET /api/catalog - Retrieve the full clip catalog for the last scan
POST /api/export - Export approved clips and metadata to the output directory
    Request body: { "clip_ids": ["clip-001", "clip-002"] }
GET /api/status - System status: loaded models, VRAM usage, IRC connection, pipeline state
GET /api/health - Health check; returns 200 if the API is running
GET /api/models - List loaded AI models with VRAM usage and status
GET /api/reports - List all generated scan reports with summaries
GET /api/reports/{file} - Fetch a specific report by filename
GET /api/agents - List all agents with current status and last message
GET /api/files/list-dir - Browse directories for MXF file selection
GET /api/frame - Get a specific frame as JPEG (query: mxf_path, frame)
GET /api/clip-thumbnail - Get a clip thumbnail (query: mxf_path, frame, width)
GET /api/clip-stream/{id} - Stream clip video for playback
WS /ws - WebSocket for real-time pipeline events, agent messages, and progress updates
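As a minimal sketch of how a client might address these endpoints, the Python below builds (without sending) a POST /api/scan request and a GET /api/frame URL using only the standard library. The helper names build_scan_request and frame_url are hypothetical, and the Content-Type header is an assumption; only the paths, the request body fields, and the query parameter names come from the list above.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # host and port as documented above


def build_scan_request(files, export_dir):
    """Build, but do not send, a POST /api/scan request (hypothetical helper)."""
    body = json.dumps({"files": files, "export_dir": export_dir}).encode("utf-8")
    return Request(
        f"{BASE_URL}/api/scan",
        data=body,
        # Assumption: the server expects a JSON content type.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def frame_url(mxf_path, frame):
    """Build the GET /api/frame URL from its two documented query parameters."""
    query = urlencode({"mxf_path": mxf_path, "frame": frame})
    return f"{BASE_URL}/api/frame?{query}"


scan_request = build_scan_request(["path/to/file.mxf"], "/output")
print(scan_request.get_method(), scan_request.full_url)
print(frame_url("path/to/file.mxf", 1200))
```

With the server running, passing scan_request to urllib.request.urlopen would submit the scan. The shape of the response is not documented here, so the sketch stops short of parsing it.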
WebSocket Events
Connect to ws://localhost:8000/ws for real-time updates. Events are batched and sent as JSON arrays.
pipeline_progress - Stage name, frame number, total frames, percentage
pipeline_stage - Stage started/completed with timing
agent_status - Agent state change (idle, running, completed)
agent_message - IRC message from an agent
clip_detected - New clip found with metadata
pipeline_complete - Scan finished with summary stats
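Because events arrive in batches, a client has to unpack each WebSocket message as a JSON array before acting on individual events. The sketch below shows one way to do that in Python. It assumes each event is an object carrying its name in a "type" field; that key, the other field names in the sample, and the helper name group_events are all hypothetical, since the source lists event names but not the payload layout.

```python
import json
from collections import defaultdict


def group_events(raw_message):
    """Parse one WebSocket text message (a JSON array) and group events by name."""
    events = json.loads(raw_message)
    if not isinstance(events, list):
        raise ValueError("expected a JSON array of events")
    grouped = defaultdict(list)
    for event in events:
        # Assumption: the event name is stored under the "type" key.
        grouped[event.get("type", "unknown")].append(event)
    return dict(grouped)


# A made-up batch for illustration; field names are not from the real API.
sample_message = json.dumps([
    {"type": "pipeline_progress", "stage": "ocr", "frame": 120, "percentage": 25.0},
    {"type": "clip_detected", "clip_id": "clip-001"},
    {"type": "pipeline_progress", "stage": "ocr", "frame": 240, "percentage": 50.0},
])

grouped = group_events(sample_message)
for name, batch in grouped.items():
    print(name, len(batch))
```

Grouping by name keeps the receiving loop simple: a UI can, for example, apply only the last pipeline_progress event in a batch and still process every clip_detected event.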
Frequently Asked Questions
Q: What file formats are supported?
MXF (Material Exchange Format) broadcast recordings from EVS, Grass Valley, and other professional broadcast equipment. The pipeline uses ffmpeg for reliable decoding.
Q: What sports are supported?
IBSF winter sliding disciplines: 2-man bobsled, 4-man bobsled, women's bobsled, monobob, men's skeleton, and women's skeleton. Trained on World Cup broadcast footage.
Q: How long does processing take?
5-12 minutes per heat depending on length. A typical heat with 17 athletes completes in about 5.5 minutes. OCR is the primary bottleneck (~57% of total time).
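To put those figures in perspective, the short Python calculation below works out what they imply for the typical heat. It assumes the ~57% OCR share applies to that 5.5-minute heat and that time is spread evenly across athletes, neither of which the source states outright.

```python
HEAT_MINUTES = 5.5  # typical heat duration quoted above
ATHLETES = 17       # athletes in that typical heat
OCR_SHARE = 0.57    # approximate share of total time spent in OCR

# Time attributable to OCR, assuming the share holds for this heat.
ocr_minutes = HEAT_MINUTES * OCR_SHARE

# Average processing time per athlete, assuming an even split.
seconds_per_athlete = HEAT_MINUTES * 60 / ATHLETES

print(f"OCR: ~{ocr_minutes:.1f} min, per athlete: ~{seconds_per_athlete:.0f} s")
```

Under these assumptions OCR accounts for roughly 3.1 minutes of the heat, and each athlete's run costs about 19 seconds of processing.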
Q: What GPU is required?
An NVIDIA GPU with 8 GB or more of VRAM is recommended. The full stack (Qwen3-VL-8B + CLIP + RapidOCR) uses about 5.1 GB of VRAM during processing. The VLM unloads automatically after 5 minutes of idle time to free memory.
Q: How does quality scoring work?
Each clip is scored from 0.0 to 1.0 across 6 factors: completeness (has start and finish), athlete name confidence, boundary quality, drama value, visual quality, and data richness. Clips scoring above 0.85 (85%) are auto-approved, clips from 0.60 to 0.85 need review, and clips below 0.60 are flagged.
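The triage thresholds can be sketched in a few lines of Python. The source does not say how the six factor scores are combined, so the unweighted mean used here is purely an assumption, as are the function names, the factor keys, and the status labels.

```python
from statistics import mean

# Factor keys are hypothetical names for the six factors listed above.
FACTORS = (
    "completeness",
    "athlete_name_confidence",
    "boundary_quality",
    "drama_value",
    "visual_quality",
    "data_richness",
)


def overall_score(factor_scores):
    """Combine factor scores (each 0.0 to 1.0). Assumption: unweighted mean."""
    missing = [name for name in FACTORS if name not in factor_scores]
    if missing:
        raise ValueError(f"missing factor scores: {missing}")
    return mean(factor_scores[name] for name in FACTORS)


def triage(score):
    """Map an overall score to the three outcomes described above."""
    if score > 0.85:
        return "auto_approved"
    if score >= 0.60:
        return "needs_review"
    return "flagged"


example = {
    "completeness": 1.0,
    "athlete_name_confidence": 0.95,
    "boundary_quality": 0.9,
    "drama_value": 0.8,
    "visual_quality": 0.9,
    "data_richness": 0.85,
}
print(triage(overall_score(example)))
```

Scores of exactly 0.85 and 0.60 fall into the review band here, matching the wording "above 85%" and "below 60%"; the real system may treat the boundaries differently.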
Q: What are the 16 scene types?
Start house, push start, active run (upper/mid/lower track), finish area, athlete reaction, replay, slow motion, standings board, start list, results table, ceremony, crowd shot, track overview, transition, preshow, commercial break.
Q: How does the system improve over time?
Training Bot (AGT-008) captures correction pairs when you override a clip decision. At 100+ pairs, it triggers QLoRA fine-tuning of the Qwen3-VL-8B model to improve accuracy on your specific broadcast style.
Q: Can I use this without IRC?
Yes. The web UI and API work independently. IRC is used for agent coordination and optional human-agent chat, but the pipeline runs fine without any IRC client connected.