Speech and Vision Lab

Speech Synthesis using Artificial Neural Networks
Research Area: Speech Synthesis  
Year: 2010  
Type of Publication: In Proceedings  
Authors: Veera Raghavendra Elluru, Vijayaditya P., Kishore S. Prahallad  
Statistical parametric synthesis has become more popular in recent years due to its adaptability and the small size of the synthesis system. Mel cepstral coefficients, fundamental frequency (f0), and duration are the main components for synthesizing speech in statistical parametric synthesis. The current study concentrates mainly on mel cepstral coefficients; durations and f0 are taken from the original data. In this paper, we address a two-fold problem. The first problem is how to predict mel cepstral coefficients from text using artificial neural networks. The second problem is how to predict formants from the text.