About Anulekhika Automatic Speech Recognition System

Anulekhika is a web-based automatic speech recognition (ASR) tool for Indian languages. This web application has been developed by the Linguistic Data Consortium for Indian Languages (LDC-IL) at CIIL, Mysore. The tool allows users to speak or upload an audio file and get the corresponding transcription in the chosen Indian language.

The Anulekhika ASR model has been developed at LDC-IL, CIIL, Mysore by a team of linguists, machine learning experts and web developers. The current ASR model is taken from the Massively Multilingual Speech (MMS) project by Facebook/Meta (Pratap et al.). The model uses modular adapter modules that attach to the main ASR architecture, giving more robust and accurate multilingual performance. Additionally, the inference pipeline lets the model handle incoming audio longer than its 30 s limit and scales inference requests to multiple hours of audio. The model is served with FlashAttention-2 through a custom-written FastAPI inference pipeline that has dynamic batching and queue-based multiprocessing.
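One common way to run a model with a fixed input limit over longer recordings is to split the audio into overlapping windows and transcribe each window separately. The sketch below only illustrates that idea and is not the Anulekhika pipeline itself: the function name chunk_spans and the 2-second overlap are assumptions made for illustration, and only the 30-second window comes from the description above.

```python
from typing import List, Tuple


def chunk_spans(
    total_seconds: float,
    window: float = 30.0,   # model limit mentioned above
    overlap: float = 2.0,   # assumed value, chosen only for illustration
) -> List[Tuple[float, float]]:
    """Return (start, end) spans in seconds that cover the whole recording.

    Each span is at most `window` seconds long, and consecutive spans
    share `overlap` seconds so that words cut at a boundary appear whole
    in at least one span.
    """
    if window <= 0 or not (0 <= overlap < window):
        raise ValueError("need window > 0 and 0 <= overlap < window")

    spans: List[Tuple[float, float]] = []
    step = window - overlap
    start = 0.0
    while start < total_seconds:
        end = min(start + window, total_seconds)
        spans.append((start, end))
        if end >= total_seconds:
            break  # the last span reaches the end of the audio
        start += step
    return spans
```

Each span would then be cut from the waveform, transcribed on its own, and the texts joined in order. How overlapping text is merged is a separate design choice that this sketch leaves out.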

To transcribe pre-recorded speech, click the browse file button, select the audio file, and click the transcribe button. The corresponding transcription appears in the text box area.

The ASR model supports inference for the following 16 Indian languages:

Kannada, Hindi, Malayalam, Awadhi, Marathi, Bodo Parja, English, Assamese, Tamil, Telugu, Bengali, Gujarati, Haryanvi, Manipuri, Marwari, Sindhi

Apart from this, we also showcase an ASR system developed from scratch for Maithili as part of PhD work undertaken by one of the scholars. We will add ASR systems for other languages as we grow. The Maithili ASR system is available through a separate interface.

Limitations

The tool is hosted on a system with limited resources. Therefore, it accepts audio input files of up to 50 MB only.

Please do not record more than 1 minute of audio in one go, as the file size may exceed 50 MB and cause errors in uploading and processing.

The tool does not recognize or insert punctuation marks. So, please do not treat it as an optimized dictation tool. An optimized dictation tool for Indian languages is in the pipeline.
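Since uploads above the size limit may fail, it can help to check a file locally before uploading it. The following is a minimal sketch, assuming that "50 MB" means 50 × 1024 × 1024 bytes. The tool's exact cutoff is not stated here, and the names MAX_UPLOAD_BYTES, fits_upload_limit and file_fits_upload_limit are hypothetical.

```python
import os

# Assumption: the 50 MB limit is read as 50 MiB; the real cutoff may differ.
MAX_UPLOAD_BYTES = 50 * 1024 * 1024


def fits_upload_limit(size_bytes: int, limit: int = MAX_UPLOAD_BYTES) -> bool:
    """Return True if a non-empty file of this size is within the limit."""
    return 0 < size_bytes <= limit


def file_fits_upload_limit(path: str) -> bool:
    """Check the size of the file at `path` against the upload limit."""
    return fits_upload_limit(os.path.getsize(path))
```

For example, file_fits_upload_limit("recording.wav") could be called before selecting a file for upload; the file name is only a placeholder.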

Disclaimer/Notice

We keep a record of the speech recorded on this system. By using this tool, you consent to our keeping the audio uploaded or recorded by you. We may not know your name and other personal details, but your voice will remain with us for further research and development.