Speech and Vision Lab

Significance of Pitch Synchronous Analysis for Speaker Recognition using AANN Models
Research Area: Speech Analysis Year: 2010
Type of Publication: In Proceedings  
Authors: Sri Harish Reddy M., Kishore S. Prahallad, Suryakanth V Gangashetty, B. Yegnanarayana  
Book title: Interspeech 2010
Address: Makuhari, Japan
Month: September 26-30
   
Abstract:
For speaker recognition studies, it is necessary to process the speech signal suitably to capture the speaker-specific information. There is complementary speaker-specific information in the excitation source and the vocal tract system characteristics. Therefore, it is necessary to separate these components, even approximately, from the speech signal. We propose the linear prediction (LP) residual and the LP coefficients to represent these two components. Analysis is performed in a pitch synchronous manner in order to focus on the significant portion of the speech signal in each glottal cycle, and also to reduce the artifacts of digital signal processing on the extracted features. Finally, the speaker-specific information is captured from the excitation and vocal tract system components using autoassociative neural network (AANN) models. We show that pitch synchronous extraction of information from the residual and the vocal tract system brings out the speaker-specific information much better than pitch asynchronous analysis, as in traditional block processing with an analysis window of fixed size.
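The separation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the autocorrelation method of LP analysis, the function names, the `cycles` parameter, and the assumption that glottal epoch locations are already available as sample indices are all illustrative choices, since the abstract does not specify them. The LP coefficients stand in for the vocal tract system and the LP residual for the excitation source.

```python
import numpy as np

def lp_coefficients(frame, order):
    """Autocorrelation-method LP analysis (an assumed, standard choice).
    Returns a[0..order-1] such that s[n] is predicted by
    sum_k a[k-1] * s[n-k] for k = 1..order."""
    frame = np.asarray(frame, dtype=float)
    # Autocorrelation at lags 0..order
    r = np.array([np.dot(frame[: len(frame) - k], frame[k:])
                  for k in range(order + 1)])
    # Toeplitz normal equations R a = r[1:]
    R = np.array([[r[abs(i - j)] for j in range(order)]
                  for i in range(order)])
    # Small diagonal loading keeps the system solvable for near-silent frames
    R += (1e-9 * r[0] + 1e-12) * np.eye(order)
    return np.linalg.solve(R, r[1:])

def lp_residual(frame, a):
    """Prediction error e[n] = s[n] - sum_k a[k-1] * s[n-k],
    assuming zero signal history before the frame."""
    frame = np.asarray(frame, dtype=float)
    inverse_filter = np.concatenate(([1.0], -np.asarray(a, dtype=float)))
    return np.convolve(frame, inverse_filter)[: len(frame)]

def pitch_synchronous_frames(signal, epochs, cycles=2):
    """Yield segments that start at each epoch and span `cycles` glottal
    cycles. `epochs` is a hypothetical list of sample indices of glottal
    closure instants, assumed to come from a separate epoch detector."""
    for i in range(len(epochs) - cycles):
        yield signal[epochs[i]: epochs[i + cycles]]
```

In this sketch each pitch synchronous segment would be passed to `lp_coefficients` and `lp_residual`, in contrast to block processing, where the segment boundaries fall at fixed intervals regardless of the glottal cycle. The features derived from the two outputs would then be modeled separately, for example by the AANN models the abstract mentions.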