
Speech and Vision Lab


Robust pitch estimation in noisy speech using ZTW and group delay function

Research Area: Signal Processing
Year: 2015
Type of Publication: In Proceedings
Keywords: HNGD, Pitch, Noisy speech
Authors: Ravi Shankar Prasad, B. Yegnanarayana






Abstract:
Identifying pitch in speech signals recorded in noisy environments is a fundamental and long-standing problem in speech research. Several time-domain techniques attempt to exploit the periodic nature of the waveform using the autocorrelation function and its variants. Another set of techniques uses the harmonic structure in the spectral domain to identify pitch values. Both kinds of techniques suffer significant performance degradation on noisy speech signals with low SNRs. The paper presents a robust technique to identify pitch values for speech signals. The proposed algorithm utilizes a speech analysis method called zero-time windowing (ZTW), in which the signal is processed using a heavily decaying window and the spectral characteristics are highlighted using the numerator of the group delay function. The amplitude contour of the dominant resonances in the spectra is extracted and processed further using a Gaussian window. The resulting contour reflects the energy profile of the signal, which is used to estimate the pitch values. The proposed algorithm is robust to degradations and has been tested on several utterances with added noise. The algorithm exhibits a significant improvement in performance compared to existing techniques.
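The processing chain described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract gives no formulas, so the window shape `1 / (4 sin^2(pi n / 2N))`, the 5 ms window length, the use of the largest NGD magnitude as the "dominant resonance" amplitude, the 1 ms Gaussian width, and the final autocorrelation step are all assumptions made for illustration, and every function name is hypothetical.

```python
import numpy as np

def decaying_window(length):
    # Assumed heavily decaying window: w[0] = 0,
    # w[n] = 1 / (4 sin^2(pi n / (2 * length))) for n >= 1.
    n = np.arange(1, length)
    w = np.zeros(length)
    w[1:] = 1.0 / (4.0 * np.sin(np.pi * n / (2.0 * length)) ** 2)
    return w

def ngd_spectrum(segment, nfft=1024):
    # Numerator of the group delay function: X_R*Y_R + X_I*Y_I,
    # where X is the DFT of x[n] and Y is the DFT of n*x[n].
    n = np.arange(len(segment))
    X = np.fft.rfft(segment, nfft)
    Y = np.fft.rfft(n * segment, nfft)
    return X.real * Y.real + X.imag * Y.imag

def resonance_contour(signal, fs, win_ms=5.0, nfft=1024):
    # One value per sample: the largest NGD magnitude of the windowed
    # segment that starts at that sample (a simplification, assumed here).
    length = int(fs * win_ms / 1000.0)
    w = decaying_window(length)
    padded = np.concatenate([signal, np.zeros(length)])
    contour = np.empty(len(signal))
    for i in range(len(signal)):
        segment = padded[i:i + length] * w
        contour[i] = np.max(np.abs(ngd_spectrum(segment, nfft)))
    return contour

def gaussian_smooth(contour, fs, sigma_ms=1.0):
    # Smooth the contour with a unit-area Gaussian window.
    sigma = sigma_ms * fs / 1000.0
    half = int(3 * sigma)
    t = np.arange(-half, half + 1)
    kernel = np.exp(-0.5 * (t / sigma) ** 2)
    kernel /= kernel.sum()
    return np.convolve(contour, kernel, mode="same")

def estimate_pitch(signal, fs, f0_min=60.0, f0_max=400.0):
    # Assumed final step: pick the autocorrelation peak of the smoothed,
    # mean-removed contour inside the allowed pitch-period range.
    c = gaussian_smooth(resonance_contour(signal, fs), fs)
    c = c - c.mean()
    r = np.correlate(c, c, mode="full")[len(c) - 1:]
    lo, hi = int(fs / f0_max), int(fs / f0_min)
    best_lag = lo + int(np.argmax(r[lo:hi + 1]))
    return fs / best_lag
```

Calling `estimate_pitch(x, fs)` on a short voiced segment `x` sampled at `fs` Hz returns a single pitch value in Hz. The per-sample loop is written for readability rather than speed, and a usable tracker would also need frame-wise estimates and a voicing decision, neither of which is attempted here.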
