Publication - Combining evidence from residual phase and MFCC features for speaker recognition


Combining evidence from residual phase and MFCC features for speaker recognition

Research Area: Uncategorized
Year: 2006
Type of Publication: Article
Keywords: cepstral analysis, error statistics, neural nets, speaker recognition, speech processing, EER, MFCC, NIST-2003 database, autoassociative neural network, equal error rate, linear prediction analysis, mel-frequency cepstral coefficient, residual phase, spea
Authors: K.S.R. Murty, B. Yegnanarayana




Abstract:
The objective of this letter is to demonstrate the complementary nature of speaker-specific information present in the residual phase in comparison with the information present in the conventional mel-frequency cepstral coefficients (MFCCs). The residual phase is derived from the speech signal by linear prediction analysis. Speaker recognition studies are conducted on the NIST-2003 database using the proposed residual phase and the existing MFCC features. The speaker recognition system based on the residual phase gives an equal error rate (EER) of 22%, and the system using the MFCC features gives an EER of 14%. By combining the evidence from both the residual phase and the MFCC features, an EER of 10.5% is obtained, indicating that speaker-specific excitation information is present in the residual phase. This information is useful since it is complementary to that of MFCCs.
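As an illustration of how a residual phase might be derived by linear prediction analysis, the sketch below inverse-filters a frame with its LP coefficients and then takes the cosine of the phase of the analytic signal of the residual. This is one common formulation and is assumed here, since the abstract does not give the exact procedure. The function names, the LP order of 10, and the direct O(N^2) DFT are illustrative choices, not details from the letter.

```python
import cmath
import math


def autocorrelation(x, max_lag):
    """Autocorrelation r[0..max_lag] of one frame (autocorrelation method)."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LP normal equations; returns a with a[0] == 1 (inverse filter)."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k                  # updated prediction error energy
    return a


def lp_residual(x, a):
    """Inverse-filter the frame: e[n] = sum_k a[k] * x[n - k]."""
    return [sum(a[k] * x[n - k] for k in range(len(a)) if n - k >= 0)
            for n in range(len(x))]


def analytic_signal(x):
    """Analytic signal via a direct DFT (adequate for short frames)."""
    n = len(x)
    spec = [sum(x[t] * cmath.exp(-2j * math.pi * f * t / n) for t in range(n))
            for f in range(n)]
    for f in range(n):
        if f == 0 or (n % 2 == 0 and f == n // 2):
            continue                        # keep DC and Nyquist bins unchanged
        # double positive frequencies, zero out negative frequencies
        spec[f] *= 2.0 if f < (n + 1) // 2 else 0.0
    return [sum(spec[f] * cmath.exp(2j * math.pi * f * t / n) for f in range(n)) / n
            for t in range(n)]


def residual_phase(x, order=10):
    """cos(theta[n]) = e[n] / |e_a[n]|, with e_a the analytic LP residual."""
    r = autocorrelation(x, order)
    if r[0] == 0.0:                         # silent frame: nothing to analyse
        return [0.0] * len(x)
    a = levinson_durbin(r, order)
    z = analytic_signal(lp_residual(x, a))
    return [v.real / abs(v) if abs(v) > 0.0 else 0.0 for v in z]
```

Calling `residual_phase(frame, order=10)` on a short list of samples returns one value per sample, each between -1 and 1. Dividing by the envelope discards the amplitude of the residual, so only the phase information in the excitation remains.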

Speech and Vision Lab