API & FAQ

The REST API runs on localhost:8000, with a WebSocket endpoint for real-time events. Endpoints return JSON, except the frame, thumbnail, and clip-stream endpoints, which return media.

API Endpoints

POST /api/scan - Start a pipeline scan on selected MXF files

{ "files": ["path/to/file.mxf"], "export_dir": "/output" }

GET /api/scan/status - Poll current pipeline progress (stage, frame, percentage)

POST /api/scan/abort - Abort a running pipeline scan; partial results are preserved

GET /api/catalog - Retrieve the full clip catalog for the last scan

POST /api/export - Export approved clips and metadata to the output directory

{ "clip_ids": ["clip-001", "clip-002"] }

GET /api/status - System status: loaded models, VRAM usage, IRC connection, pipeline state

GET /api/health - Health check; returns 200 if the API is running

GET /api/models - List loaded AI models with VRAM usage and status

GET /api/reports - List all generated scan reports with summaries

GET /api/reports/{file} - Fetch a specific report by filename

GET /api/agents - List all agents with current status and last message

GET /api/files/list-dir - Browse directories for MXF file selection

GET /api/frame - Get a specific frame as JPEG (query: mxf_path, frame)

GET /api/clip-thumbnail - Get a clip thumbnail (query: mxf_path, frame, width)

GET /api/clip-stream/{id} - Stream clip video for playback

WS /ws - WebSocket for real-time pipeline events, agent messages, and progress updates
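The endpoints above can be called from any HTTP client. Below is a minimal Python sketch using only the standard library. It assumes the server is reachable at http://localhost:8000 as stated in this guide. The helper names (build_scan_request, frame_url, get_json, wait_for_scan), the Content-Type header, and the "percentage" response field are illustrative assumptions, since the exact response schema is not documented here.

```python
import json
import time
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8000"  # host and port as stated in this guide


def build_scan_request(files, export_dir, base_url=BASE_URL):
    """Build (but do not send) the POST /api/scan request."""
    body = json.dumps({"files": list(files), "export_dir": export_dir}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/api/scan",
        data=body,
        headers={"Content-Type": "application/json"},  # assumed to be expected
        method="POST",
    )


def frame_url(mxf_path, frame, base_url=BASE_URL):
    """Build the URL for GET /api/frame with its two query parameters."""
    query = urllib.parse.urlencode({"mxf_path": mxf_path, "frame": frame})
    return f"{base_url}/api/frame?{query}"


def get_json(path, base_url=BASE_URL):
    """Send a GET request and decode the JSON response."""
    with urllib.request.urlopen(base_url + path) as resp:
        return json.load(resp)


def wait_for_scan(fetch_status, interval=2.0, max_polls=600, sleep=time.sleep):
    """Poll until the reported percentage reaches 100 or max_polls is hit.

    ASSUMPTION: the status payload carries a numeric "percentage" field.
    Returns the last status payload that was fetched.
    """
    status = {}
    for _ in range(max_polls):
        status = fetch_status()
        if status.get("percentage", 0) >= 100:
            break
        sleep(interval)
    return status
```

To start a scan and wait for it, one could send the request with urllib.request.urlopen(build_scan_request(["path/to/file.mxf"], "/output")) and then call wait_for_scan(lambda: get_json("/api/scan/status")). Passing the fetch function in as an argument keeps the polling loop easy to test without a running server.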

WebSocket Events

Connect to ws://localhost:8000/ws for real-time updates. Events are batched and sent as JSON arrays.

pipeline_progress - Stage name, frame number, total frames, percentage

pipeline_stage - Stage started/completed with timing

agent_status - Agent state change (idle, running, completed)

agent_message - IRC message from an agent

clip_detected - New clip found with metadata

pipeline_complete - Scan finished with summary stats
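Because events arrive batched as JSON arrays, a client typically decodes each WebSocket message and routes the events one by one. The sketch below shows only that routing step; opening the connection is left to whichever WebSocket client you use. It assumes each event is an object with a "type" key holding one of the event names listed here. That key name and the sample payload fields are hypothetical, since the event schema is not documented in this guide.

```python
import json
from collections import defaultdict


def dispatch_batch(raw_message, handlers):
    """Decode one WebSocket text message and route each event to a handler.

    ASSUMPTION: every event is a JSON object whose "type" key names the event.
    Returns the number of events that had a registered handler.
    """
    events = json.loads(raw_message)
    if isinstance(events, dict):  # tolerate a single unbatched event
        events = [events]
    handled = 0
    for event in events:
        handler = handlers.get(event.get("type"))
        if handler is not None:
            handler(event)
            handled += 1
    return handled


# Example: collect events by kind from one hypothetical batch.
seen = defaultdict(list)
handlers = {
    "pipeline_progress": lambda e: seen["progress"].append(e),
    "clip_detected": lambda e: seen["clips"].append(e),
}
sample = json.dumps([
    {"type": "pipeline_progress", "stage": "ocr", "percentage": 40},
    {"type": "clip_detected", "clip_id": "clip-001"},
    {"type": "agent_status", "state": "idle"},  # no handler registered
])
handled_count = dispatch_batch(sample, handlers)
```

Events without a registered handler are skipped rather than raising, so a client can subscribe to only the event types it cares about.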

Frequently Asked Questions

What file formats are supported?

MXF (Material Exchange Format) broadcast recordings from EVS, Grass Valley, and other professional broadcast equipment. The pipeline uses ffmpeg for reliable decoding.

What sports are supported?

IBSF winter sliding disciplines: 2-man bobsled, 4-man bobsled, women's bobsled, monobob, men's skeleton, and women's skeleton. Trained on World Cup broadcast footage.

How long does processing take?

5-12 minutes per heat depending on length. A typical heat with 17 athletes completes in about 5.5 minutes. OCR is the primary bottleneck (~57% of total time).
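As a rough back-of-the-envelope check, the figures for a typical heat can be broken down per athlete and per stage. This is only arithmetic on the numbers quoted above; the variable names are illustrative.

```python
heat_minutes = 5.5   # typical heat, from the answer above
athletes = 17        # athletes in that heat
ocr_share = 0.57     # OCR's approximate share of total time

seconds_per_athlete = heat_minutes * 60 / athletes
ocr_minutes = heat_minutes * ocr_share
other_minutes = heat_minutes - ocr_minutes

print(f"{seconds_per_athlete:.1f} s per athlete")   # 19.4 s per athlete
print(f"{ocr_minutes:.1f} min spent in OCR")        # 3.1 min spent in OCR
print(f"{other_minutes:.1f} min in other stages")   # 2.4 min in other stages
```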

What GPU is required?

NVIDIA GPU with 8GB+ VRAM recommended. The full stack (Qwen3-VL-8B + CLIP + RapidOCR) uses ~5.1GB VRAM during processing. VLM auto-unloads after 5 min idle to free memory.

How does quality scoring work?

Each clip is scored from 0.0 to 1.0 across six factors: completeness (has both start and finish), athlete name confidence, boundary quality, drama value, visual quality, and data richness. Clips scoring above 0.85 are auto-approved, clips from 0.60 to 0.85 need review, and clips below 0.60 are flagged.
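The thresholds can be expressed as a small function. This is a sketch of the decision rule only: how the six factors are weighted into the final score is not stated here, so the function takes the final score as input. The function name and the returned status labels are hypothetical.

```python
def review_status(score):
    """Map a final quality score (0.0 to 1.0) to a review decision.

    Status labels are illustrative, not the system's actual identifiers.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be between 0.0 and 1.0, got {score}")
    if score > 0.85:
        return "auto_approved"
    if score >= 0.60:
        return "needs_review"
    return "flagged"
```

Scores exactly on a boundary (0.60 or 0.85) fall into the review band, following the "above 85%" and "below 60%" wording.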

What are the 16 scene types?

Start house, push start, active run (upper/mid/lower track), finish area, athlete reaction, replay, slow motion, standings board, start list, results table, ceremony, crowd shot, track overview, transition, preshow, commercial break.

How does the system improve over time?

Training Bot (AGT-008) captures correction pairs when you override a clip decision. At 100+ pairs, it triggers QLoRA fine-tuning of the Qwen3-VL-8B model to improve accuracy on your specific broadcast style.

Can I use this without IRC?

Yes. The web UI and API work independently. IRC is used for agent coordination and optional human-agent chat, but the pipeline runs fine without any IRC client connected.
