Speech and Vision Lab

Group delay functions and its applications in speech technology
Research Area: Uncategorized
Year: 2011
Type of Publication: Article
Keywords: Fourier transform phase; group delay functions; feature extraction from phase; feature switching; mutual information; K-L divergence
Authors: B. Yegnanarayana  
   
Abstract:
Traditionally, the information in speech signals is represented in terms of features derived from short-time Fourier analysis. In this analysis, the features extracted from the magnitude of the Fourier transform (FT) are considered, ignoring the phase component. Although the significance of the FT phase has been highlighted in several studies over the past three decades, the features of the FT phase have not been fully exploited because of the difficulty of computing the phase and of processing the phase function. The information in the short-time FT phase function can be extracted by processing the derivative of the FT phase, i.e., the group delay function. In this paper, the properties of group delay functions are reviewed, highlighting the importance of the FT phase for representing information in the speech signal. Methods to process the group delay function are discussed to capture the characteristics of the vocal-tract system in the form of formants or through a modified group delay function. Applications of group delay functions for speech processing are discussed in some detail. They include segmentation of speech at syllable boundaries, exploiting the additive and high-resolution properties of the group delay functions. The effectiveness of segmentation of speech and of the features derived from the modified group delay is demonstrated in applications such as language identification, speech recognition and speaker recognition. The paper thus demonstrates the need to exploit the potential of group delay functions for the development of speech systems.
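As a rough illustration of the quantity the abstract refers to, the sketch below computes a group delay function directly from a signal, without unwrapping the phase. It relies on the standard identity that the negative derivative of the FT phase equals (X_R Y_R + X_I Y_I) / |X|^2, where X is the FT of x[n] and Y is the FT of n·x[n]. The function name `group_delay`, the FFT length and the small floor `eps` are assumptions made for this example; this is not the paper's modified group delay function, which involves additional processing not described in the abstract.

```python
import numpy as np

def group_delay(x, n_fft=512, eps=1e-12):
    """Group delay (in samples) of a finite sequence, one value per rfft bin.

    Illustrative helper (hypothetical name): evaluates -d(phase)/d(omega)
    without phase unwrapping, using the FT of x[n] and of n*x[n].
    """
    x = np.asarray(x, dtype=float)
    n = np.arange(len(x))
    X = np.fft.rfft(x, n_fft)        # FT of x[n]
    Y = np.fft.rfft(n * x, n_fft)    # FT of n*x[n]
    power = X.real**2 + X.imag**2
    # eps guards against division by zero near spectral nulls (assumed choice)
    return (X.real * Y.real + X.imag * Y.imag) / np.maximum(power, eps)

# A unit impulse delayed by 5 samples has a constant group delay.
impulse = np.zeros(16)
impulse[5] = 1.0
delay = group_delay(impulse)

# Additive property: convolving two sequences adds their group delays.
a = np.array([1.0, 0.5])
b = np.array([1.0, -0.3])
cascade = group_delay(np.convolve(a, b))
separate_sum = group_delay(a) + group_delay(b)
```

Here `delay` should be close to 5 in every frequency bin, and `cascade` should match `separate_sum` to within floating-point error, which is the additive property the abstract mentions. For real speech frames the denominator can become very small near spectral nulls, so the raw values are often spiky; that is one reason the phase function is described as difficult to process.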