Research Area: Uncategorized
Year: 2011
Type of Publication: Article
Keywords: Fourier transform phase; group delay functions; feature extraction from phase; feature switching; mutual information; K-L divergence
Authors: B. Yegnanarayana
Abstract:
Traditionally, the information in speech signals is represented in terms
of features derived from short-time Fourier analysis. In this analysis, only the
features extracted from the magnitude of the Fourier transform (FT) are considered,
ignoring the phase component. Although the significance of the FT phase has been
highlighted in several studies over the past three decades, the features of the FT
phase have not been fully exploited because of the difficulty of computing the phase
and of processing the phase function. The information in the short-time FT phase
function can be extracted by processing the negative derivative of the FT phase,
i.e., the group delay function. In this paper, the properties of group delay
functions are reviewed, highlighting the importance of the FT phase for representing
information in the speech signal. Methods to process the group delay function are
discussed to capture the characteristics of the vocal-tract system in the form of
formants or through a modified group delay function. Applications of group delay
functions for speech processing are discussed in some detail. These include
segmentation of speech at syllable boundaries, which exploits the additive and
high-resolution properties of the group delay functions. The effectiveness of this
segmentation and of the features derived from the modified group delay function is
demonstrated in applications such as language identification, speech recognition
and speaker recognition. The paper thus demonstrates the need to exploit the
potential of group delay functions for the development of speech systems.
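
As an illustration of the quantity the abstract refers to, the following is a minimal sketch of how a group delay function can be computed for one frame of a signal without explicitly unwrapping the phase. It uses the standard textbook identity that expresses the negative phase derivative through the FTs of x(n) and n·x(n); it is not taken from the paper, and the function name `group_delay` and the parameters `n_fft` and `eps` are assumptions made for this example. It does not implement the modified group delay function mentioned above.

```python
import numpy as np


def group_delay(x, n_fft=512, eps=1e-12):
    """Group delay (in samples) of a finite sequence x on an rfft frequency grid.

    Hypothetical helper for illustration. Uses the identity
        tau(w) = (X_R * Y_R + X_I * Y_I) / |X|^2,
    where X is the FT of x(n) and Y is the FT of n * x(n). This equals the
    negative derivative of the FT phase and avoids phase unwrapping.
    """
    x = np.asarray(x, dtype=float)
    n = np.arange(len(x))

    X = np.fft.rfft(x, n_fft)      # FT of x(n), zero-padded to n_fft
    Y = np.fft.rfft(n * x, n_fft)  # FT of n * x(n)

    power = X.real ** 2 + X.imag ** 2
    # eps is an assumed guard against division by zero where |X| vanishes;
    # values near such spectral zeros are not reliable.
    return (X.real * Y.real + X.imag * Y.imag) / np.maximum(power, eps)
```

For a unit impulse delayed by 5 samples, this function returns a flat group delay of 5 samples at every frequency. The additive property mentioned in the abstract can be seen by convolving two short sequences: the group delay of the result equals the sum of the individual group delays, provided `n_fft` is at least as long as the convolved sequence and the spectrum has no zeros on the frequency grid.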