Research Area: Speech Synthesis
Year: 2011
Type of Publication: In Proceedings
Keywords: speech segmentation, under-resourced language, landmarks
Authors: Vijayaditya P., Kishore S. Prahallad
Abstract:
High accuracy speech segmentation methods invariably depend
on manually labelled data. However, under-resourced languages
do not have annotated speech corpora required for training these
segmentors. In this paper we propose a boundary refinement
technique which uses knowledge of phone-class specific subband
energy events, in place of manual labels, to guide the refinement
process. The use of this knowledge enables proper
placement of boundaries in regions with multiple spectral discontinuities
in close proximity. It also helps in the correction
of large alignment errors. The proposed refinement technique
provides boundaries with an accuracy of 82% within 20 ms of
the actual boundary. Combining the proposed technique with an
iterative isolated HMM training technique boosts the accuracy to
89%, without the use of any manually labelled data.
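
The accuracy figures quoted in the abstract are tolerance-based: a boundary counts as correct when it falls within 20 ms of the reference boundary. The following is a minimal sketch of how such a figure could be computed, not the authors' evaluation code. The function name `boundary_accuracy`, the use of integer milliseconds, and the one-to-one pairing of predicted and reference boundaries by index are all assumptions made for illustration.

```python
def boundary_accuracy(predicted_ms, reference_ms, tolerance_ms=20):
    """Return the fraction of predicted boundaries that lie within
    `tolerance_ms` of the reference boundary at the same index.

    Assumption: both lists describe the same phone sequence, so the
    i-th predicted boundary corresponds to the i-th reference boundary.
    """
    if len(predicted_ms) != len(reference_ms):
        raise ValueError("boundary lists must have the same length")
    if not reference_ms:
        raise ValueError("at least one boundary is required")

    # Count boundaries whose absolute deviation is within the tolerance.
    hits = sum(
        1
        for pred, ref in zip(predicted_ms, reference_ms)
        if abs(pred - ref) <= tolerance_ms
    )
    return hits / len(reference_ms)


# Hypothetical boundary times in milliseconds (illustrative values only).
predicted = [100, 250, 430, 610]
reference = [105, 240, 400, 612]
print(boundary_accuracy(predicted, reference))  # 0.75
```

In this illustrative data, three of the four boundaries deviate by 20 ms or less and one deviates by 30 ms, so the score is 0.75. Times are kept as integers so that a deviation of exactly 20 ms is not affected by floating-point rounding.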