Speech and Vision Lab

Combining evidence from residual phase and MFCC features for speaker recognition
Research Area: Uncategorized
Year: 2006
Type of Publication: Article
Keywords: cepstral analysis, error statistics, neural nets, speaker recognition, speech processing, EER, MFCC, NIST-2003 database, autoassociative neural network, equal error rate, linear prediction analysis, mel-frequency cepstral coefficient, residual phase, spea
Authors: K.S.R. Murty, B. Yegnanarayana  
   
Abstract:
The objective of this letter is to demonstrate the complementary nature of speaker-specific information present in the residual phase in comparison with the information present in the conventional mel-frequency cepstral coefficients (MFCCs). The residual phase is derived from the speech signal by linear prediction analysis. Speaker recognition studies are conducted on the NIST-2003 database using the proposed residual phase and the existing MFCC features. The speaker recognition system based on the residual phase gives an equal error rate (EER) of 22%, and the system using the MFCC features gives an EER of 14%. By combining the evidence from both the residual phase and the MFCC features, an EER of 10.5% is obtained, indicating that speaker-specific excitation information is present in the residual phase. This information is useful since it is complementary to that of MFCCs.
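To illustrate why combining two complementary systems can lower the EER, the following is a minimal sketch in Python. It is not the authors' method: the abstract does not say how the evidence is combined, so a simple weighted sum of per-trial scores is assumed here. The function names `equal_error_rate` and `fuse`, the `weight` parameter, and the toy scores are all hypothetical and are not drawn from the NIST-2003 experiments. Scores from both systems are assumed to be already normalized to a comparable range.

```python
def equal_error_rate(genuine, impostor):
    """Approximate the EER by sweeping a threshold over all observed scores.

    genuine:  scores for trials where the claimed speaker is the true speaker
    impostor: scores for trials where the claimed speaker is someone else
    """
    thresholds = sorted(set(genuine) | set(impostor))
    best_gap, eer = float("inf"), 1.0
    for t in thresholds:
        # False rejection rate: genuine trials scoring below the threshold.
        frr = sum(s < t for s in genuine) / len(genuine)
        # False acceptance rate: impostor trials scoring at or above it.
        far = sum(s >= t for s in impostor) / len(impostor)
        # The EER is the operating point where the two rates are closest.
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer


def fuse(scores_a, scores_b, weight=0.5):
    """Weighted-sum fusion of two score lists (an assumed combination rule)."""
    return [weight * a + (1 - weight) * b for a, b in zip(scores_a, scores_b)]


# Toy scores, made up for illustration only: each system misranks a
# different trial, so their errors are complementary.
genuine_a, impostor_a = [0.9, 0.8, 0.3], [0.2, 0.1, 0.7]
genuine_b, impostor_b = [0.4, 0.9, 0.8], [0.6, 0.2, 0.1]

eer_a = equal_error_rate(genuine_a, impostor_a)
eer_b = equal_error_rate(genuine_b, impostor_b)
eer_fused = equal_error_rate(fuse(genuine_a, genuine_b),
                             fuse(impostor_a, impostor_b))
print(eer_a, eer_b, eer_fused)
```

In this toy setup the fused scores separate genuine and impostor trials better than either system alone, because the trial that one system gets wrong is scored confidently by the other. With real data the gain depends on how uncorrelated the two systems' errors are and on the choice of fusion weight.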
Digital version