Speech and Vision Lab

Exploring features for text-dependent speaker verification in distant speech signals
Research Area: Speech Recognition
Year: 2010
Type of Publication: Master's thesis
Keywords: automatic speaker verification, text-dependent, distant speech, high signal-to-noise ratio, pitch, duration
Authors: B. Avinash  
Automatic speaker verification (ASV) is the task of verifying a person’s claimed identity from his/her voice using a digital computer. Existing ASV systems verify with high accuracy when the speech signal is collected close to the mouth of the speaker [...] for a text-dependent ASV system. The distant speech signal is collected using a single-channel microphone. An acoustic feature derived from short segments of the speech signal is proposed for the ASV task. The key idea is to exploit the high signal-to-noise ratio of short segments of speech in the vicinity of impulse-like excitations. We demonstrate that the proposed feature degrades less with distance than the widely used Mel-frequency cepstral coefficients (MFCCs), and also yields better speaker verification performance than MFCCs. We propose a method of begin-end detection based on the strength of the spectral peaks. A score normalization method is proposed that considers only the robust regions of the speech signal. In addition, the regions of the speech signal with a high signal-to-reverberation ratio are identified, and greater weight is given to these regions. These modifications are shown to yield a systematic improvement in the performance of the speaker verification system. The use of the additional features of duration and pitch is shown to further improve the performance of the speaker verification system for distant speech.
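The idea of giving greater weight to more reliable regions can be illustrated with a minimal sketch. This is not the thesis's implementation: the function name `weighted_score`, the per-frame scores, and the reliability weights are all hypothetical, and the abstract does not say how the signal-to-reverberation ratio is estimated or how weights are derived from it. The sketch only shows how weighting changes an utterance-level score.

```python
from typing import Sequence


def weighted_score(frame_scores: Sequence[float], weights: Sequence[float]) -> float:
    """Combine per-frame match scores into one utterance-level score.

    Assumption (not from the thesis): each weight is a non-negative
    reliability value, larger for frames judged to have a higher
    signal-to-reverberation ratio.
    """
    if len(frame_scores) != len(weights):
        raise ValueError("frame_scores and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("at least one weight must be positive")
    # Weighted average: frames with larger weights contribute more.
    return sum(s * w for s, w in zip(frame_scores, weights)) / total_weight


# Hypothetical values for illustration only.
scores = [0.9, 0.2, 0.8, 0.1]
reliability = [1.0, 0.0, 1.0, 0.0]  # trust only frames 0 and 2

uniform = weighted_score(scores, [1.0] * len(scores))  # plain average
weighted = weighted_score(scores, reliability)         # reliable frames only
```

With these made-up numbers, the weighted score is driven entirely by the two frames marked reliable, whereas the uniform average is pulled down by the low-scoring frames.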