Speech and Vision Lab

Voice Activity Detection using Maximum Spectral Amplitude in Sub-bands
Research Area: Signal Processing Year: 2014
Type of Publication: In Proceedings  
Authors: Sivanand Achanta, Nivedita Chennupati, Vishala Pannala, Mansi R, Kishore S. Prahallad  
Robust voice activity detection (VAD) is a prerequisite for many speech-based applications such as speech recognition. We investigated two VAD techniques that use time-domain and frequency-domain characteristics of the speech signal. The temporal characteristic of the autocorrelation lag is able to discriminate speech and nonspeech regions. In the frequency domain, the peak value of the magnitude spectrum in different sub-bands is used for VAD, as it varies slowly with time in speech regions compared to noise. The performance of the proposed methods is evaluated on the TIMIT database with noises from the NOISEX-92 database at various signal-to-noise ratio (SNR) levels. The experimental results show that VAD based on the autocorrelation lag performs consistently better than the method based on the maximum peak value of the autocorrelation function. However, it performs worse than our second approach and AMR-VAD2. Our second approach, i.e., VAD based on maximum spectral amplitude in sub-bands, outperforms AMR-VAD2 and Sohn VAD for some noise conditions. Moreover, it is shown that a threshold independent of the noise types and their levels can be selected in the proposed method.
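
The frequency-domain feature described in the abstract, the peak of the magnitude spectrum within each sub-band of a frame, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function name `subband_peak_amplitudes`, the 25 ms frame, 10 ms hop, Hamming window, and four equal-width sub-bands are all assumptions chosen for illustration, since the abstract does not specify them. The sketch covers only feature extraction; the decision rule and threshold used in the paper are not reproduced here.

```python
import numpy as np

def subband_peak_amplitudes(signal, sample_rate, frame_ms=25.0, hop_ms=10.0, n_bands=4):
    """Return an (n_frames, n_bands) array holding the peak magnitude-spectrum
    value in each sub-band of each frame.

    Frame size, hop, window, and band layout are illustrative assumptions,
    not values taken from the paper.
    """
    signal = np.asarray(signal, dtype=float)
    frame_len = int(sample_rate * frame_ms / 1000)
    hop = int(sample_rate * hop_ms / 1000)
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")

    window = np.hamming(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop

    # Split the one-sided FFT bins into n_bands roughly equal-width bands.
    n_bins = frame_len // 2 + 1
    edges = np.linspace(0, n_bins, n_bands + 1, dtype=int)

    peaks = np.empty((n_frames, n_bands))
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len] * window
        magnitude = np.abs(np.fft.rfft(frame))
        for b in range(n_bands):
            # Maximum spectral amplitude inside sub-band b for this frame.
            peaks[i, b] = magnitude[edges[b]:edges[b + 1]].max()
    return peaks
```

Calling `subband_peak_amplitudes(x, 16000)` on a 16 kHz signal `x` gives one row per frame. Tracking how each column changes from frame to frame is one way to examine the slow temporal variation that the abstract attributes to speech regions.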