|Research Area:||Signal Processing||Year:||2009|
|Type of Publication:||In Proceedings||Keywords:||linear prediction, neural network, spectral mapping, speech coding, throat microphone, vector quantization|
|Authors:||Anand Joseph Xavier M., B. Yegnanarayana, Sanjeev Gupta, R.M. Kesheorey|
|Book title:||Proc. INTERSPEECH 2009|
Throat microphones (TM), which are robust to background noise, can be used in environments with high levels of background noise. However, speech collected using a TM is perceptually less natural. The objective of this paper is to map the spectral features (represented in the form of cepstral features) of TM speech to those of close-speaking microphone (CSM) speech to improve the former's perceptual quality, and to represent it in an efficient manner for coding. The spectral mapping from TM to CSM speech is done using a multilayer feed-forward neural network, which is trained on features derived from TM and CSM speech. The sequence of estimated CSM spectral features is quantized and coded as a sequence of codebook indices using vector quantization. The sequence of codebook indices, the pitch contour and the energy contour derived from the TM signal are used to store/transmit the TM speech information efficiently. At the receiver, the all-pole system corresponding to the estimated CSM spectral vectors is excited by a synthetic residual to generate the speech signal.
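
The vector quantization step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `vq_encode` and `vq_decode`, the squared Euclidean distance measure, the two-dimensional toy vectors, and the three-entry codebook are all assumptions made for illustration. In the paper, the vectors would be the estimated CSM spectral features and the codebook would be trained beforehand.

```python
import numpy as np

def vq_encode(features, codebook):
    """Map each feature vector to the index of its nearest codeword.

    features: array of shape (n_frames, dim)
    codebook: array of shape (n_codewords, dim)
    Nearest is measured by squared Euclidean distance (an assumption here).
    """
    # Pairwise squared distances, shape (n_frames, n_codewords)
    diffs = features[:, None, :] - codebook[None, :, :]
    dists = np.sum(diffs ** 2, axis=2)
    return np.argmin(dists, axis=1)

def vq_decode(indices, codebook):
    """Recover the quantized vectors from the transmitted indices."""
    return codebook[indices]

# Toy data (hypothetical): three frames and a three-entry codebook
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]])
frames = np.array([[0.1, -0.1], [0.9, 1.2], [3.5, 4.2]])

indices = vq_encode(frames, codebook)          # what would be stored/transmitted
reconstructed = vq_decode(indices, codebook)   # what the receiver would recover
```

Only the index sequence needs to be stored or transmitted per frame, alongside the pitch and energy contours; the receiver looks the vectors up in its own copy of the codebook.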