Research Area: Speech Synthesis
Year: 2010
Type of Publication: In Proceedings
Authors: Veera Raghavendra Elluru, Vijayaditya P., Kishore S. Prahallad
Abstract:
Statistical parametric synthesis has become more popular in recent
years due to its adaptability and the compact size of the synthesizer.
Mel cepstral coefficients, fundamental frequency (f0), and duration are
the main components for synthesizing speech in statistical parametric
synthesis. The current study concentrates mainly on mel cepstral
coefficients; durations and f0 are taken from the original data. In this
paper, we address a two-fold problem. The first problem is how to
predict mel cepstral coefficients from text using artificial neural
networks. The second problem is how to predict formants from text.