Research Area: Speech Analysis
Year: 2010
Type of Publication: In Proceedings
Authors: Sri Harish Reddy M., Kishore S. Prahallad, Suryakanth V Gangashetty, B. Yegnanarayana
Abstract:
For speaker recognition studies, it is necessary to process the
speech signal suitably to capture the speaker-specific information.
There is complementary speaker-specific information
in the excitation source and vocal tract system characteristics.
Therefore, it is necessary to separate these components, even
approximately, from the speech signal. We propose the linear prediction
(LP) residual and the LP coefficients to represent the excitation source
and the vocal tract system, respectively. Analysis is performed in a pitch-synchronous
manner in order to focus on the significant portion of the speech
signal in each glottal cycle, and also to reduce the artifacts of
digital signal processing on the extracted features. Finally, the
speaker-specific information is captured from the excitation and
the vocal tract system components using autoassociative neural
network (AANN) models. We show that pitch-synchronous
extraction of information from the LP residual and the vocal tract system
brings out the speaker-specific information much better than
pitch-asynchronous analysis, as in traditional block
processing with a fixed-size analysis window.
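
The source/system separation described in the abstract can be illustrated with a minimal sketch of LP analysis. This is not the authors' implementation: it uses the standard autocorrelation method with the Levinson-Durbin recursion on a single fixed frame, and the function names (`autocorr`, `levinson_durbin`, `lp_residual`) and the synthetic second-order test signal are assumptions made for illustration. A pitch-synchronous variant would instead choose frame boundaries within each glottal cycle, which is not shown here.

```python
import random


def autocorr(x, max_lag):
    """Autocorrelation r[0..max_lag] of one frame (no windowing applied)."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LP normal equations by the Levinson-Durbin recursion.

    Returns (a, err) where a[0] == 1 and the prediction error is
    e[n] = sum_k a[k] * x[n - k]; err is the final error energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for this order
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err


def lp_residual(x, a):
    """Inverse-filter the frame with the LP coefficients to get the residual."""
    order = len(a) - 1
    return [
        sum(a[k] * x[n - k] for k in range(order + 1) if n - k >= 0)
        for n in range(len(x))
    ]


# Hypothetical test signal: a second-order all-pole "system" driven by noise,
# x[n] = 1.3 x[n-1] - 0.6 x[n-2] + e[n]. Real speech would be used in practice.
random.seed(0)
signal = [0.0, 0.0]
for _ in range(2000):
    signal.append(1.3 * signal[-1] - 0.6 * signal[-2] + random.gauss(0.0, 1.0))

coeffs, _ = levinson_durbin(autocorr(signal, 2), 2)  # system-like component
residual = lp_residual(signal, coeffs)               # excitation-like component
```

In this sketch, `coeffs` plays the role of the vocal tract system representation and `residual` the role of the excitation source. The recovered coefficients should lie close to the values used to generate the signal, and the residual carries much less energy than the signal itself.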