About This Tool
This experimental tool uses a multilingual ASR (Automatic Speech
Recognition) model to generate transcripts from audio and video in
Balkan languages. Its purpose is to make transcription technology
more accessible for under-resourced languages and to support the
long-term preservation and documentation of linguistic diversity.
⚠️ Important Notice
This is experimental software. Although we aim for
high accuracy, the transcripts will not always be perfect. You
should always review and, where needed, correct the output to ensure
it is reliable for your specific use case.
Supported Languages
In this public demo, transcription is currently available for the
following languages: Serbian, Macedonian, Bulgarian, Slovene,
Croatian, Bosnian and Romanian. For these languages, we target a
word error rate (WER) below 10%, a level intended to provide a solid
user experience. The demo focuses on Balkan languages that stand to
benefit the most from improved transcription tools.
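For readers unfamiliar with the metric, word error rate is commonly defined as the number of word-level substitutions, deletions and insertions needed to turn the transcript into a reference text, divided by the number of words in the reference. The sketch below illustrates that standard definition only. The function name `word_error_rate` is hypothetical, and the whitespace tokenization and lack of text normalization are simplifying assumptions; it is not the evaluation procedure used for this tool.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: words are split on whitespace and compared exactly,
    with no case folding or punctuation stripping.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


# One wrong word out of ten reference words gives a WER of 10%.
print(word_error_rate("a b c d e f g h i j", "a b c d e f g h i x"))  # 0.1
```

Under this definition, a transcript at the 10% target would contain roughly one word-level error for every ten words of reference text. Because insertions count as errors, WER can exceed 100% for very poor transcripts.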
Preserving Linguistic Heritage
Many languages worldwide still lack robust digital transcription
tools. By improving ASR models for low-resource languages, we help
protect cultural and linguistic heritage and make digital content
more searchable, reusable and accessible to a wider range of
communities.