Synthesis of the Voice Source Using a Physically-Informed Model of the Glottis

Avanzini, F.; Drioli, Carlo; Alku, P.

A physically-informed glottal model is proposed. Some physical information is retained in a linear block that accounts for vocal fold mechanics, while the non-linear coupling with the airflow is modeled by a regressor-based mapping. The model is used in an identification/resynthesis scheme: given a real signal, the system parameters are estimated via non-linear identification techniques, and the model is then used to resynthesize the signal. With a proper choice of the regressor set, the system accurately fits the target waveform and remains stable during resynthesis. The physical parameters can be used to change voice quality and speaker identity.
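
The identification/resynthesis idea can be illustrated with a toy sketch. It is not the authors' model, and everything in it is an assumption made for illustration: the linear block is a plain two-pole resonator, the regressor set is a handful of polynomial terms in the current and previous displacement, the drive is a sinusoid, and the fit uses ordinary linear least squares rather than the non-linear identification the abstract refers to. The feedback from airflow to the folds, which is what makes stability during resynthesis a concern, is left out. All function and parameter names (`fold_displacement`, `regressors`, `identify`, `resynthesize`, `f0`, `q`) are hypothetical.

```python
import math


def fold_displacement(drive, f0=120.0, q=5.0, fs=16000.0):
    """Linear block (assumed form): a damped two-pole resonator standing in
    for fold mechanics. f0 is the resonance in Hz, q sets the damping."""
    r = math.exp(-math.pi * f0 / (q * fs))
    theta = 2.0 * math.pi * f0 / fs
    a1, a2 = 2.0 * r * math.cos(theta), -r * r
    b0 = 1.0 - a1 - a2  # normalizes the DC gain to one
    x = [0.0, 0.0]
    for d in drive:
        x.append(a1 * x[-1] + a2 * x[-2] + b0 * d)
    return x[2:]


def regressors(x_now, x_prev):
    """Hypothetical regressor set: low-order polynomial terms."""
    return [1.0, x_now, x_prev, x_now * x_now, x_now * x_prev]


def solve(a, b):
    """Solve a @ w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (m[i][n] - tail) / m[i][i]
    return w


def identify(x, target):
    """Estimate mapping weights from displacement x and a target flow
    by solving the least-squares normal equations."""
    rows = [regressors(x[n], x[n - 1]) for n in range(1, len(x))]
    y = target[1:]
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(ata, aty)


def resynthesize(x, weights):
    """Apply the fitted mapping; output is aligned with target[1:]."""
    return [
        sum(w * p for w, p in zip(weights, regressors(x[n], x[n - 1])))
        for n in range(1, len(x))
    ]


# Toy run: a synthetic "target flow" that lies in the span of the regressors.
fs = 16000.0
drive = [math.sin(2.0 * math.pi * 120.0 * n / fs) for n in range(800)]
x = fold_displacement(drive, f0=120.0, fs=fs)
target = [0.5 * v + 0.2 * v * v for v in x]
weights = identify(x, target)
resynth = resynthesize(x, weights)
err = max(abs(a - b) for a, b in zip(resynth, target[1:]))
print("max resynthesis error:", err)
```

In this sketch, changing `f0` or `q` in `fold_displacement` while keeping the fitted `weights` fixed gives a rough sense of how parameters of the linear block could alter the resynthesized waveform. The actual model couples the blocks in a loop, so its behavior will differ.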