Speech and Vision Lab

Semi-Supervised Learning of Acoustic Driven Prosodic Phrase Breaks for Text-to-Speech Systems

Research Area: Speech Synthesis
Year: 2010
Type of Publication: In Proceedings
Keywords: speech synthesis, acoustic driven phrasing, semisupervised
Authors: Kishore S. Prahallad, Veera Raghavendra Elluru, Alan W. Black

Abstract:
In this paper, we propose semi-supervised learning of acoustic-driven phrase breaks and examine its usefulness for text-to-speech systems. We derive a set of initial hypotheses of phrase breaks in a speech signal using pause as an acoustic cue. Because these initial estimates are obtained from knowledge of speech production and speech signal processing, the hypothesized phrase break regions can be treated as labeled data. Features such as duration, F0, and energy are extracted from these labeled regions, and a machine learning model is trained to classify these acoustic features as belonging to a phrase break or not. We then attempt to bootstrap the machine learning model using unlabeled data (i.e., the rest of the data).
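
The bootstrapping loop described in the abstract can be illustrated with a minimal self-training sketch. It is not the authors' implementation: the abstract does not name the classifier, the pause threshold, or how negative examples are chosen. The sketch assumes a nearest-centroid classifier, a hypothetical 150 ms pause threshold for seed breaks, pause-free word junctions as seed non-breaks, and pre-normalized (duration, F0, energy) features. All names (`Junction`, `seed_labels`, `self_train`) are invented for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Features = Tuple[float, float, float]  # (duration, F0, energy), assumed pre-normalized


@dataclass
class Junction:
    """A word junction: the pause that follows it and its acoustic features."""
    pause_ms: float
    features: Features
    label: Optional[int] = None  # 1 = break, 0 = no break, None = unlabeled


def seed_labels(junctions: List[Junction], break_pause_ms: float = 150.0) -> None:
    """Initial hypotheses from the pause cue (threshold is an assumption)."""
    for j in junctions:
        if j.pause_ms >= break_pause_ms:
            j.label = 1          # long pause: hypothesized phrase break
        elif j.pause_ms == 0.0:
            j.label = 0          # assumption: no pause means no break
        # anything in between stays unlabeled


def centroids(junctions: List[Junction]) -> Dict[int, Features]:
    """Stand-in for the unspecified model: one mean feature vector per class."""
    model = {}
    for cls in (0, 1):
        rows = [j.features for j in junctions if j.label == cls]
        if not rows:
            raise ValueError(f"no labeled examples for class {cls}")
        model[cls] = tuple(sum(col) / len(rows) for col in zip(*rows))
    return model


def sq_dist(a: Features, b: Features) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(model: Dict[int, Features], features: Features) -> Tuple[int, float]:
    """Return (label, margin); margin in [0, 1] is a crude confidence score."""
    d0, d1 = sq_dist(features, model[0]), sq_dist(features, model[1])
    label = 1 if d1 < d0 else 0
    margin = abs(d0 - d1) / (d0 + d1 + 1e-12)
    return label, margin


def self_train(junctions: List[Junction], min_margin: float = 0.5,
               max_rounds: int = 10) -> Dict[int, Features]:
    """Seed with pause labels, then absorb confident unlabeled junctions."""
    seed_labels(junctions)
    model = centroids(junctions)
    for _ in range(max_rounds):
        added = 0
        for j in junctions:
            if j.label is None:
                label, margin = predict(model, j.features)
                if margin >= min_margin:
                    j.label = label
                    added += 1
        if added == 0:
            break                    # nothing confident left to add
        model = centroids(junctions)  # retrain on the enlarged labeled set
    return model
```

Junctions whose features sit close to one class centroid are absorbed into the labeled set in the first round, while ambiguous junctions stay unlabeled; raising or lowering `min_margin` trades coverage against label noise.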