Research Area: | Uncategorized | Year: | 1998 | ||||
Type of Publication: | Article | Keywords: | approximation theory, harmonic analysis, iterative methods, noise, poles and zeros, prediction theory, random processes, spectral-domain analysis, speech processing, speech synthesis, time-varying systems, all-pole model, aperiodic components, deterministic | ||||
Authors: | B. Yegnanarayana, C. d'Alessandro, V. Darsinos | ||||||
Abstract: | |||||||
The speech signal may be considered as the output of a time-varying vocal tract system excited with quasiperiodic and/or random sequences of pulses. The quasiperiodic part may be considered as the deterministic or periodic component and the random part as the stochastic or aperiodic component of the excitation. We discuss issues involved in identifying and separating the periodic and aperiodic components of the source. The decomposition is performed on an approximation to the excitation signal, instead of decomposing the speech signal directly. The linear prediction residual signal is used as an approximation to the excitation signal of the vocal tract system. Speech is first analyzed to determine the voiced and unvoiced parts of the signal. Decomposition of the voiced part into periodic and aperiodic components is then accomplished by first identifying the frequency regions of harmonic and noise components in the spectral domain. The signal corresponding to the noise regions is used as a first approximation to the aperiodic component. An iterative algorithm is proposed which reconstructs the aperiodic component in the harmonic regions. The periodic component is obtained by subtracting the reconstructed aperiodic component signal from the residual signal. The individual components of the residual are then used to excite the derived all-pole model of the vocal tract system to obtain the corresponding components of the speech signal. Experiments were conducted using synthetic speech. They demonstrated the ability of the algorithm to decompose a synthetic speech signal made of a mixture of periodic and aperiodic components. Application to natural speech is also discussed. |
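The processing chain described in the abstract can be illustrated with a minimal NumPy sketch for a single voiced frame. It is not the authors' implementation. The abstract does not state the constraints used by the iterative algorithm, so the sketch assumes a Papoulis-Gerchberg-style alternation between a finite-duration constraint in time and the known noise-region values in frequency. The function names (`lp_residual`, `split_residual`, `all_pole_synthesize`), the zero-padded DFT length of twice the frame, the harmonic band half-width of `f0 / 4`, the prediction order, and the assumption that the pitch `f0` is already known are all hypothetical choices made for illustration.

```python
import numpy as np


def lp_residual(x, order=10):
    """Linear-prediction residual by the autocorrelation method (illustrative)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(x[: n - k], x[k:]) for k in range(order + 1)])
    # Toeplitz normal equations R a = r[1:], with a tiny ridge for stability.
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R = R + 1e-9 * r[0] * np.eye(order)
    a = np.linalg.solve(R, r[1:])
    # Predict each sample from the previous `order` samples.
    pred = np.zeros(n)
    for k in range(1, order + 1):
        pred[k:] += a[k - 1] * x[: n - k]
    return x - pred, a


def split_residual(e, f0, fs, n_iter=20, half_width_hz=None):
    """Split a voiced residual frame into periodic and aperiodic parts.

    Assumed scheme: bins near multiples of f0 are 'harmonic regions', the
    rest are 'noise regions'. The aperiodic spectrum is known in the noise
    regions and is reconstructed in the harmonic regions by alternating a
    finite-duration constraint (time) with the known bins (frequency).
    """
    e = np.asarray(e, dtype=float)
    n = len(e)
    m = 2 * n  # zero-padded DFT length (assumption)
    if half_width_hz is None:
        half_width_hz = f0 / 4.0  # assumption
    E = np.fft.rfft(e, m)
    freqs = np.fft.rfftfreq(m, d=1.0 / fs)
    # Distance of each bin to the nearest multiple of f0 (DC region excluded).
    dist = np.abs(freqs - f0 * np.round(freqs / f0))
    harmonic = (dist <= half_width_hz) & (freqs >= f0 / 2.0)
    noise = ~harmonic
    # First approximation: keep noise regions, zero the harmonic regions.
    A = np.where(noise, E, 0.0)
    for _ in range(n_iter):
        a_t = np.fft.irfft(A, m)
        a_t[n:] = 0.0            # time constraint: frame has finite duration
        A = np.fft.rfft(a_t)
        A[noise] = E[noise]      # frequency constraint: restore known bins
    aperiodic = np.fft.irfft(A, m)[:n]
    periodic = e - aperiodic     # periodic part obtained by subtraction
    return periodic, aperiodic


def all_pole_synthesize(excitation, a):
    """Excite the all-pole model 1 / (1 - sum_k a[k-1] z^-k) with a signal."""
    exc = np.asarray(excitation, dtype=float)
    order = len(a)
    y = np.zeros(len(exc))
    for i in range(len(exc)):
        k = min(order, i)
        # Weighted sum of y[i-1], ..., y[i-k].
        y[i] = exc[i] + np.dot(a[:k], y[i - k:i][::-1])
    return y
```

In this sketch, passing the two outputs of `split_residual` through `all_pole_synthesize` with the coefficients returned by `lp_residual` gives speech-domain periodic and aperiodic components whose sum reproduces the analyzed frame, since the synthesis filter is linear and inverts the prediction-error filter.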