Speech and Vision Lab

Discriminating Neutral and Emotional Speech using Neural Networks
Research Area: Neural Networks
Year: 2014
Type of Publication: In Proceedings  
Authors: Sudarsana Reddy Kadiri, P. Gangamohan, B. Yegnanarayana  
   
Abstract:
In this paper, we address the issue of speaker-specific emotion detection (neutral vs. emotion) from speech signals, using models of neutral speech as the reference. As emotional speech is produced by the human speech production mechanism, emotion information is expected to lie in the features of both the excitation source and the vocal tract system. The Linear Prediction (LP) residual is used as the excitation source component, and the Linear Prediction Coefficients as the vocal tract system component. A pitch-synchronous analysis is performed. Separate Autoassociative Neural Network (AANN) models are developed to capture the information specific to neutral speech from the excitation and vocal tract system components. Experimental results show that the excitation source carries more emotion-related information than the vocal tract system: the accuracy of neutral vs. emotion classification using excitation source information is 91%, which is 8% higher than the accuracy obtained using vocal tract system information. The Berlin EMO-DB database is used in this study. The proposed emotion detection system provides an improvement of approximately 10% using excitation source features and 3% using vocal tract system features over a recently proposed emotion detection method that uses energy and pitch contour modeling with functional data analysis.
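
The abstract describes decomposing each speech frame into a vocal tract component (LP coefficients) and an excitation component (LP residual). The Python sketch below illustrates only that LP front end, under assumed settings (a 20 ms frame, LP order 12, 16 kHz sampling rate) that are not taken from the paper; the pitch-synchronous framing, the AANN models, and the EMO-DB handling are omitted.

    # Sketch of the LP front end named in the abstract: a frame is split
    # into LP coefficients (vocal tract system component) and an LP
    # residual (excitation source component). Frame length, LP order, and
    # sampling rate are illustrative assumptions, not values from the paper.
    import numpy as np
    from scipy.signal import lfilter

    def lp_coefficients(frame, order=12):
        # Autocorrelation-method LP analysis solved with Levinson-Durbin.
        r = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
        a = np.zeros(order + 1)
        a[0] = 1.0
        err = r[0] + 1e-12          # small floor guards against silent frames
        for i in range(1, order + 1):
            k = -(r[i] + np.dot(a[1:i], r[i - 1:0:-1])) / err
            a[1:i + 1] = a[1:i + 1] + k * a[i - 1::-1]
            err *= 1.0 - k * k
        return a                    # A(z) = 1 + a[1] z^-1 + ... + a[p] z^-p

    def lp_residual(frame, order=12):
        # Inverse filtering with A(z) removes the vocal tract contribution,
        # leaving an estimate of the excitation source signal.
        return lfilter(lp_coefficients(frame, order), [1.0], frame)

    if __name__ == "__main__":
        fs = 16000                                  # assumed sampling rate
        t = np.arange(int(0.02 * fs)) / fs          # one 20 ms frame
        frame = np.sin(2 * np.pi * 120 * t) + 0.3 * np.sin(2 * np.pi * 700 * t)
        frame *= np.hamming(len(frame))
        res = lp_residual(frame)
        print("frame energy   : %.4f" % np.sum(frame ** 2))
        print("residual energy: %.4f" % np.sum(res ** 2))  # much smaller

Per the abstract, features from each component would then feed separate AANN models trained only on neutral speech, so that a high reconstruction error at test time flags a frame as emotional; that modeling stage is not shown here.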
   
