Speech and Vision Lab

Non-linear encoding of the excitation source using neural networks for transition mode coding in CELP
Research Area: Speech Coding
Year: 2012
Type of Publication: In Proceedings
Keywords: speech coding, GCI, neural network, transition mode coding, CELP
Authors: Anand Joseph Xavier M., B. Yegnanarayana
When a frame suffers erasure, the adaptive codebook at the decoder is no longer in sync with the one at the encoder. When the erased frame is one that follows the voice-onset frame, this loss of codebook synchronization severely degrades the quality of the decoded speech. The degradation arises primarily because no meaningful excitation signal is present in the adaptive codebook. In this paper, an auto-associative neural network (AANN) with a compression layer is used to capture the characteristics of the excitation source around the glottal closure instants (GCIs). To deal with frame drops during transition regions, a transition mode frame is proposed that differs from the conventional CELP frame without altering the bitrate. In these transition mode frames, the compressed representation of the excitation source around the GCIs, obtained through the AANN, is used to reconstruct the adaptive codebook at the receiver. It is shown that the proposed method improves the quality of the decoded speech.
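The idea of an AANN with a compression layer can be illustrated with a minimal sketch: a network trained to reproduce its input through a narrow middle layer, so that the middle-layer activations form a compact code for a short excitation segment. Everything specific here is an assumption for illustration and is not taken from the paper: the single hidden layer, the segment length (`n_in = 32`), the code size (`n_code = 6`), the synthetic damped-cosine "segments" standing in for the excitation around GCIs, and the helper names `init_aann`, `encode`, `decode`, and `train_step`.

```python
import numpy as np

def init_aann(n_in, n_code, rng):
    """Small random weights for a one-hidden-layer auto-associative network."""
    return {
        "W_enc": rng.normal(0.0, 0.1, (n_in, n_code)),
        "b_enc": np.zeros(n_code),
        "W_dec": rng.normal(0.0, 0.1, (n_code, n_in)),
        "b_dec": np.zeros(n_in),
    }

def encode(p, x):
    """Map segments (batch, n_in) to compressed codes (batch, n_code)."""
    return np.tanh(x @ p["W_enc"] + p["b_enc"])

def decode(p, z):
    """Map compressed codes back to reconstructed segments."""
    return z @ p["W_dec"] + p["b_dec"]

def train_step(p, x, lr):
    """One full-batch gradient-descent step on mean squared reconstruction error.

    Returns the loss measured before the weights are updated.
    """
    z = encode(p, x)
    err = decode(p, z) - x
    loss = float(np.mean(err ** 2))
    g_out = 2.0 * err / err.size                     # dLoss/d(output)
    g_z = (g_out @ p["W_dec"].T) * (1.0 - z ** 2)    # back through tanh
    p["W_dec"] -= lr * (z.T @ g_out)
    p["b_dec"] -= lr * g_out.sum(axis=0)
    p["W_enc"] -= lr * (x.T @ g_z)
    p["b_enc"] -= lr * g_z.sum(axis=0)
    return loss

# Synthetic stand-in data (assumption): damped cosine pulses, not real excitation.
rng = np.random.default_rng(0)
n_in, n_code = 32, 6
t = np.arange(n_in)
amp = rng.uniform(0.5, 1.0, (200, 1))
decay = rng.uniform(0.05, 0.3, (200, 1))
segments = amp * np.exp(-decay * t) * np.cos(0.4 * t)

params = init_aann(n_in, n_code, rng)
losses = [train_step(params, segments, lr=0.5) for _ in range(500)]

code = encode(params, segments[:1])   # what a transmitter would send
recon = decode(params, code)          # what a receiver would rebuild
```

In this sketch, `code` plays the role of the compressed representation carried in a transition mode frame, and `recon` plays the role of the segment used to refill the adaptive codebook at the receiver. How the code is quantized and packed into the frame is outside what this example covers.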