Speech and Vision Lab

Robustness and Accuracy of Time Delay Estimation in a Live Room
Research Area: Uncategorized
Year: 2021
Type of Publication: In Proceedings  
Authors: B. Yegnanarayana, Narayana Murty, JV Satyanarayana, Vishala Pannala, Nivedita Chennupati  
Estimating the time delay between broadband signals such as speech, received at two or more spatially distributed microphones, has many applications. Direct cross-correlation of the signals and generalized cross-correlation methods (GCC and GCC-PHAT) have been used for many years to estimate this delay. The performance of these methods degrades in a practical environment such as a live room because of noise, multi-path reflections, and reverberation. The estimated time delay is usually robust because the delays obtained over several frames in an utterance of a few seconds are averaged. Robustness suffers when the varying time delay of a moving speaker is desired: a shorter averaging duration results in errors in the estimated time delay, while a longer averaging duration results in a loss of accuracy. Since single frequency filtering (SFF) based analysis provides an estimate of the instantaneous time delay, it makes it possible to study the trade-off between accuracy and robustness. This paper examines this trade-off in determining the number of stationary speakers from mixed signals, and in tracking a speaker moving along a straight-line path and along a circular path. The results are illustrated with actual data collected in a live room.
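
As an illustration of the baseline approach named in the abstract, the following is a minimal sketch of GCC-PHAT delay estimation with per-frame estimates that can be averaged over an utterance. It is not the SFF-based method studied in the paper. The function names, the frame length, the hop size, and the synthetic test signal are all assumptions made for this example.

```python
import numpy as np

def gcc_phat_delay(x, y, fs, max_delay=None):
    """Estimate the delay (in seconds) of y relative to x with GCC-PHAT.

    A positive result means y lags x. Hypothetical helper for illustration.
    """
    n = len(x) + len(y)                  # zero-pad to avoid circular wrap-around
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    cross = Y * np.conj(X)               # cross-power spectrum
    cross /= np.abs(cross) + 1e-12       # PHAT weighting: keep only the phase
    cc = np.fft.irfft(cross, n=n)        # generalized cross-correlation
    max_shift = n // 2
    if max_delay is not None:
        max_shift = max(1, min(int(round(max_delay * fs)), max_shift))
    # Reorder so that lags run from -max_shift to +max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(cc)) - max_shift
    return lag / fs

def framewise_delays(x, y, fs, frame_len, hop, max_delay):
    """Return one GCC-PHAT delay estimate per frame (assumed framing scheme)."""
    last_start = min(len(x), len(y)) - frame_len
    starts = range(0, last_start + 1, hop)
    return np.array([
        gcc_phat_delay(x[s:s + frame_len], y[s:s + frame_len], fs, max_delay)
        for s in starts
    ])

# Synthetic check: y is x delayed by 7 samples (assumed values, not paper data).
rng = np.random.default_rng(0)
fs = 16000
x = rng.standard_normal(4096)
y = np.concatenate((np.zeros(7), x[:-7]))
per_frame = framewise_delays(x, y, fs, frame_len=1024, hop=512, max_delay=0.002)
averaged = np.median(per_frame)          # pooling frames trades tracking speed for robustness
```

Pooling more frames into `averaged` smooths out spurious peaks caused by noise and reverberation, but it also blurs any change in delay that occurs within the pooled interval, which is the trade-off the abstract describes for a moving speaker.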