Live Captioning for Gurbani Kirtan

A benchmark for systems that identify which line of Gurbani is being sung, in real time.

Baseline visualizations

Each page shows the ground truth (GT), the prediction (Pred), and the difference between them (Diff) for every second of audio. Play the audio and hover to see the canonical Gurmukhi line at each moment.

          Shabad-aware (oracle)          Shabad-unaware (blind)
Offline   easiest: line alignment only   identify Shabad, then align
Live      follow along in real time      hardest: identify & follow live

A prototype for the hardest variant (live + blind) is running at bani.karanbirsingh.com and scores ~65–70%; per-case visualizations are in the blog post.

Synthetic baselines
Baseline     Description                          Score
empty        No predictions, silent everywhere    26.0%
shifted_5s   Ground truth delayed by 5 seconds    85.5%
perfect      Exact copy of ground truth           100.0%
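
The three baselines above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the page does not define the scoring metric, so the sketch assumes the score is the fraction of seconds where the predicted line matches the ground-truth line, with None standing for "no line being sung". The function names and the toy timeline are hypothetical and are not taken from the benchmark's code.

```python
from typing import List, Optional

# A label is a line identifier, or None when no line is being sung.
Label = Optional[str]


def per_second_accuracy(gt: List[Label], pred: List[Label]) -> float:
    """Fraction of seconds where the prediction equals the ground truth.

    Assumed metric for illustration; the benchmark's real metric may differ.
    """
    if len(gt) != len(pred):
        raise ValueError("gt and pred must cover the same number of seconds")
    if not gt:
        return 0.0
    matches = sum(g == p for g, p in zip(gt, pred))
    return matches / len(gt)


def empty_baseline(gt: List[Label]) -> List[Label]:
    """No predictions: silent at every second."""
    return [None] * len(gt)


def shifted_baseline(gt: List[Label], delay: int) -> List[Label]:
    """Ground truth delayed by `delay` seconds, padded with silence at the start."""
    if delay <= 0:
        return list(gt)
    return ([None] * delay + list(gt))[: len(gt)]


def perfect_baseline(gt: List[Label]) -> List[Label]:
    """Exact copy of the ground truth."""
    return list(gt)


# Toy timeline (invented): 2 s of silence, 4 s of line "a", 4 s of line "b".
gt: List[Label] = [None, None, "a", "a", "a", "a", "b", "b", "b", "b"]

print(f"empty      {per_second_accuracy(gt, empty_baseline(gt)):.1%}")       # 20.0%
print(f"shifted_1s {per_second_accuracy(gt, shifted_baseline(gt, 1)):.1%}")  # 80.0%
print(f"perfect    {per_second_accuracy(gt, perfect_baseline(gt)):.1%}")     # 100.0%
```

Under this assumed metric, the empty baseline scores exactly the share of seconds with no line in the ground truth, and a delayed copy loses one second per line transition for every second of delay. Whether the published 26.0% and 85.5% arise the same way is not stated on this page.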

Get involved

Get in touch: admin@karanbirsingh.com

Get started on GitHub →