Emotional FESTIVAL-MBROLA TTS synthesis

DRIOLI, Carlo;
2005-01-01

Abstract

This work extends our previous research on a general data-driven procedure for building a neutral "narrative-style" prosodic module for the Italian FESTIVAL Text-To-Speech (TTS) synthesizer. It investigates and implements new strategies for building an emotional FESTIVAL TTS. As in the neutral case, the new emotional prosodic modules are based on Classification And Regression Trees (CART). The extension to emotional speech synthesis uses a differential approach: the emotional prosodic modules learn the differences between neutral (emotion-free) and emotional prosodic data. Moreover, because Voice Quality (VQ) is known to play an important role in emotive speech, a rule-based FESTIVAL-MBROLA VQ-modification module has also been implemented to control the temporal and spectral characteristics of the synthesis. Although emotional synthesis remains an open issue, our preliminary evaluation results indicate that the proposed solution is effective.
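The differential approach described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature names (`stressed`, `phrase_final`), the duration values, and the helper functions are hypothetical, and a depth-one regression tree stands in for a full CART. The point is only that the tree is trained on the difference between emotional and neutral values, and that the predicted difference is added back to the neutral prediction at synthesis time.

```python
from statistics import mean


def fit_stump(rows, targets):
    """Fit a depth-one regression tree on boolean features.

    Picks the feature whose split minimises the summed squared error,
    which is the criterion a regression CART applies at each node.
    """
    def sse(values):
        m = mean(values)
        return sum((v - m) ** 2 for v in values)

    best = None
    for feat in rows[0]:
        left = [t for r, t in zip(rows, targets) if not r[feat]]
        right = [t for r, t in zip(rows, targets) if r[feat]]
        if not left or not right:
            continue  # this feature does not split the data
        err = sse(left) + sse(right)
        if best is None or err < best[0]:
            best = (err, feat, mean(left), mean(right))
    if best is None:
        raise ValueError("no feature splits the data")
    _, feat, left_mean, right_mean = best
    return feat, left_mean, right_mean


def predict_delta(stump, row):
    """Return the learned emotional-minus-neutral offset for one unit."""
    feat, left_mean, right_mean = stump
    return right_mean if row[feat] else left_mean


# Hypothetical toy data: syllable durations in milliseconds.
rows = [
    {"stressed": True, "phrase_final": False},
    {"stressed": False, "phrase_final": False},
    {"stressed": True, "phrase_final": True},
    {"stressed": False, "phrase_final": True},
]
neutral = [180.0, 120.0, 220.0, 160.0]
emotional = [150.0, 115.0, 190.0, 155.0]

# Differential targets: what the emotional module has to learn.
deltas = [e - n for e, n in zip(emotional, neutral)]
stump = fit_stump(rows, deltas)

# At synthesis time: neutral prediction plus the learned difference.
new_row = {"stressed": True, "phrase_final": False}
neutral_prediction = 200.0  # assumed output of the neutral module
emotional_prediction = neutral_prediction + predict_delta(stump, new_row)
```

In this toy data the emotional rendering shortens stressed syllables more than unstressed ones, so the tree splits on `stressed`. A real system would grow deeper trees over many linguistic features and would model F0 and intensity as well as duration.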

Use this identifier to cite or link to this document: https://hdl.handle.net/11390/696187