Speech and Vision Lab


IIIT-H Indic Speech Databases

The IIIT-H Indic speech databases were developed at the Speech and Vision Lab, IIIT-H, for the purpose of building speech synthesis systems in Indian languages.


Currently, the IIIT-H Indic speech databases consist of text and speech data in Bengali, Hindi, Kannada, Malayalam, Marathi, Tamil and Telugu. These languages were chosen because the total number of Wikipedia articles in each of them was more than 10,000 and native speakers of these languages were available on campus. Each of these languages has several dialects. As an initial approximation, we chose to record the speech in the dialect with which the native speaker was comfortable.

Text Data 

We used Wikipedia articles in Indian languages as our text corpus. A set of 1000 sentences was selected for each language. These sentences were selected to cover the 5000 most frequent words in the text corpus of the corresponding language. The text data is made available in IT3 (a transliteration scheme) as well as in Unicode (UTF-8 format).
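
The selection step above can be illustrated with a small sketch. The exact procedure used for the databases is not described here, so the greedy set-cover heuristic below is an assumption, as are the function name `select_sentences`, its parameters, and the whitespace tokenization (real Indic text would need language-aware tokenization).

```python
from collections import Counter


def select_sentences(sentences, vocab_size=5000, max_sentences=1000):
    """Greedily pick sentences that cover the most frequent words.

    Hypothetical helper for illustration only; not the authors' method.
    Returns the chosen sentences and the target words left uncovered.
    """
    # Assumption: words are separated by whitespace.
    tokenized = [set(s.split()) for s in sentences]
    counts = Counter(word for s in sentences for word in s.split())
    target = {word for word, _ in counts.most_common(vocab_size)}

    uncovered = set(target)
    remaining = set(range(len(sentences)))
    chosen = []
    while uncovered and remaining and len(chosen) < max_sentences:
        # Pick the sentence adding the most uncovered target words;
        # ties go to the sentence that appears earliest in the corpus.
        best = max(remaining, key=lambda i: (len(tokenized[i] & uncovered), -i))
        if not tokenized[best] & uncovered:
            break  # defensive: no remaining sentence adds coverage
        chosen.append(best)
        uncovered -= tokenized[best]
        remaining.discard(best)
    return [sentences[i] for i in chosen], uncovered
```

With the defaults, the function stops once the 5000 most frequent words are covered or 1000 sentences have been chosen, whichever comes first. Any words still uncovered are returned so that the coverage actually achieved can be checked.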

Speech Recording

The speech data was recorded by a native speaker of the language. The recording was done in a studio environment using a standard headset microphone connected to a Zoom handy recorder. We used a handy recorder as it was highly mobile and easy to operate. Using a headset kept both the distance from the microphone to the mouth and the recording level constant.


The speech databases and synthetic voices are available for download.


Kishore Prahallad, E. Naresh Kumar, Venkatesh Keri, S. Rajendran and Alan W Black, "The IIIT-H Indic Speech Databases", in Proceedings of Interspeech 2012, Portland, Oregon, USA.


Please write to [e-mail address protected] for any queries or requests regarding this database.

