Speech and Vision Lab

Combining evidence from source, suprasegmental and spectral features for a fixed-text speaker verification system

Research Area: Uncategorized
Year: 2005
Type of Publication: Article
Keywords: neural nets, speaker recognition, dynamic time warping technique, fixed-text speaker verification system, neural network models, source feature, spectral features, suprasegmental feature
Authors: B. Yegnanarayana, S.R.M. Prasanna, J.M. Zachariah, C.S. Gupta




Abstract:
This paper proposes a text-dependent (fixed-text) speaker verification system that uses different types of information to decide on a speaker's identity claim. The baseline system uses the dynamic time warping (DTW) technique for matching. Detecting the end-points of an utterance is crucial to the performance of DTW-based template matching. A method based on the vowel onset point (VOP) is proposed for locating the end-points of an utterance. The proposed speaker verification method uses suprasegmental and source features in addition to spectral features. The suprasegmental features, such as pitch and duration, are extracted using the warping path information in the DTW algorithm. Features of the excitation source, extracted using neural network models, are also used in the text-dependent speaker verification system. Although the suprasegmental and source features individually may not yield good performance, combining the evidence from these features seems to improve the performance of the system significantly. Neural network models are used to combine the evidence from multiple sources of information.
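
As a rough illustration of the DTW matching and warping-path ideas mentioned in the abstract, the following is a minimal textbook sketch, not the authors' implementation. It assumes one-dimensional feature sequences, an absolute-difference local distance, and the basic three-way step pattern. The function names `dtw` and `frames_per_reference` are hypothetical, and the per-frame count is only one simple way a duration-like cue could be read off the warping path.

```python
from math import inf


def dtw(ref, test):
    """Align two non-empty 1-D sequences; return (total_cost, warping_path).

    Assumption: local distance is |ref[i] - test[j]| and the allowed steps
    are diagonal, vertical, and horizontal (the basic textbook pattern).
    """
    n, m = len(ref), len(test)
    # cost[i][j] = minimum accumulated distance aligning ref[:i] with test[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(ref[i - 1] - test[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1],
                                 cost[i - 1][j],
                                 cost[i][j - 1])

    # Backtrack from the end of both sequences to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),  # diagonal step
            (cost[i - 1][j], i - 1, j),          # advance reference only
            (cost[i][j - 1], i, j - 1),          # advance test only
        ]
        _, i, j = min(candidates)
    path.reverse()
    return cost[n][m], path


def frames_per_reference(path, n_ref):
    """Count how many path steps land on each reference frame.

    Hypothetical duration-like cue: a count above 1 means the test
    utterance lingered on that reference frame.
    """
    counts = [0] * n_ref
    for i, _ in path:
        counts[i] += 1
    return counts
```

For example, aligning `ref = [1, 2, 3]` with `test = [1, 2, 2, 3]` gives a total cost of zero, and the path shows the second reference frame matched to two test frames, which is the kind of timing information a warping path exposes.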