Speech and Vision Lab

Modeling durations of syllables using neural networks
Research Area: Uncategorized  
Year: 2007  
Type of Publication: Article  
Authors: K. Sreenivasa Rao, B. Yegnanarayana  
In this paper, we propose a neural network model for predicting the durations of syllables. A four-layer feedforward neural network trained with the backpropagation algorithm is used for modeling the duration knowledge of syllables. Broadcast news data in three Indian languages (Hindi, Telugu and Tamil) is used for this study. The input to the neural network consists of a set of features extracted from the text. These features correspond to phonological, positional and contextual information. The relative importance of the positional and contextual features is examined separately. To improve the accuracy of prediction, the predicted duration values are processed further. We also propose a two-stage duration model for the same purpose. From the studies, we find that 85% of the syllable durations could be predicted by the models within 25% of the actual duration. The performance of the duration models is evaluated using objective measures such as average prediction error (μ), standard deviation (σ) and correlation coefficient (γ).
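The objective measures named in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so the definitions below are assumptions: μ is taken as the mean absolute error, σ as the standard deviation of those absolute errors, γ as the Pearson correlation between actual and predicted durations, and the "within 25%" figure as the fraction of syllables whose absolute error does not exceed 25% of the actual duration. The function name `duration_metrics` and the sample durations are hypothetical and are not taken from the paper.

```python
import math


def duration_metrics(actual, predicted, tolerance=0.25):
    """Compute assumed versions of the measures named in the abstract.

    actual, predicted: equal-length sequences of positive durations (e.g. ms).
    tolerance: relative deviation counted as "within" (0.25 means 25%).
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("actual and predicted must be non-empty and equal in length")

    n = len(actual)
    errors = [abs(a - p) for a, p in zip(actual, predicted)]

    # Assumed definition: mu is the mean absolute prediction error.
    mu = sum(errors) / n
    # Assumed definition: sigma is the (population) standard deviation of the errors.
    sigma = math.sqrt(sum((e - mu) ** 2 for e in errors) / n)

    # Pearson correlation between actual and predicted durations.
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    gamma = cov / math.sqrt(var_a * var_p)

    # Fraction of syllables predicted within the relative tolerance.
    within = sum(1 for a, e in zip(actual, errors) if e <= tolerance * a) / n

    return {"mu": mu, "sigma": sigma, "gamma": gamma, "within": within}


# Hypothetical durations in milliseconds, for illustration only.
actual_ms = [100, 200, 150, 120]
predicted_ms = [110, 180, 150, 160]
metrics = duration_metrics(actual_ms, predicted_ms)
```

With these made-up values, three of the four syllables fall within 25% of the actual duration, so `metrics["within"]` is 0.75. The paper's own definitions of μ, σ and γ may differ from the ones assumed here.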