Fitting a biomechanical model of the folds to high-speed video data through Bayesian estimation

Drioli C.;Foresti G. L.
2020-01-01

Abstract

High-speed video recording of the vocal folds during sustained phonation has become a widespread diagnostic tool, and the development of imaging techniques for automated tracking and analysis of relevant glottal cues, such as fold edge position or glottal area, is an active research field. This paper proposes a vocal fold vibration analysis method that processes visual data through a biomechanical model of the laryngeal dynamics. The procedure relies on a Bayesian non-stationary estimation of the biomechanical model parameters and state to fit the fold edge position extracted from high-speed video endoscopic data. The fitted dynamical model is then used as a state transition model in a Bayesian setting, which yields a physiologically motivated estimate of the upper and lower vocal fold edge positions. Based on the model prediction, a hypothesis on the lower fold position can be made even under complete fold occlusion, which occurs at the end of the closed phase and the beginning of the open phase of the glottal cycle. To demonstrate the suitability of the procedure, the method is assessed on a set of audiovisual recordings featuring high-speed video endoscopic data from healthy subjects producing sustained voiced phonation with different laryngeal settings.
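
The idea of jointly estimating model state and parameters from an observed edge position can be illustrated with a toy example. The sketch below is not the authors' model or estimator: it assumes a single undamped mass-spring oscillator as a stand-in for the fold dynamics, a random walk on the log-stiffness to represent a slowly varying parameter, and a bootstrap particle filter as the Bayesian estimator. The function names `simulate` and `particle_filter`, and all numerical settings, are hypothetical choices made for illustration.

```python
import numpy as np


def simulate(k_true=4.0, dt=0.05, steps=600, obs_std=0.05, seed=0):
    """Toy 'edge position' trace from a unit-mass spring (assumed stand-in model)."""
    rng = np.random.default_rng(seed)
    x, v = 1.0, 0.0
    xs = np.empty(steps)
    for t in range(steps):
        xs[t] = x
        v += -k_true * x * dt      # semi-implicit Euler keeps the amplitude bounded
        x += v * dt
    ys = xs + obs_std * rng.normal(size=steps)   # noisy position observations
    return xs, ys


def particle_filter(ys, dt=0.05, n=2000, obs_std=0.05, seed=1):
    """Bootstrap particle filter over the augmented state (x, v, log k)."""
    rng = np.random.default_rng(seed)
    # Broad initial beliefs (assumed): position near the first observation,
    # unknown velocity, stiffness anywhere in [1, 9].
    x = ys[0] + 0.1 * rng.normal(size=n)
    v = 0.5 * rng.normal(size=n)
    log_k = np.log(rng.uniform(1.0, 9.0, size=n))

    x_est = np.empty(len(ys))
    k_est = np.empty(len(ys))
    for t, y in enumerate(ys):
        # Weight each particle by the Gaussian likelihood of the observation.
        log_w = -0.5 * ((y - x) / obs_std) ** 2
        w = np.exp(log_w - log_w.max())          # subtract max to avoid underflow
        w /= w.sum()
        x_est[t] = w @ x
        k_est[t] = w @ np.exp(log_k)

        # Systematic resampling.
        positions = (rng.random() + np.arange(n)) / n
        cdf = np.cumsum(w)
        cdf[-1] = 1.0
        idx = np.searchsorted(cdf, positions)
        x, v, log_k = x[idx], v[idx], log_k[idx]

        # Predict: the oscillator acts as the state transition model, and the
        # stiffness follows a random walk so that it may drift over time.
        log_k = log_k + 0.02 * rng.normal(size=n)
        v = v - np.exp(log_k) * x * dt + 0.01 * rng.normal(size=n)
        x = x + v * dt
    return x_est, k_est
```

Calling `xs, ys = simulate()` followed by `x_est, k_est = particle_filter(ys)` gives a filtered position trace and a running stiffness estimate. Because the filter also carries the unobserved velocity, the same mechanism could in principle predict a quantity that is hidden from the camera, which is the role the abstract assigns to the model during fold occlusion.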
Files in this item:
File: 1-s2.0-S2352914819304058-main.pdf
Access: open access
Type: Publisher's version (PDF)
License: Creative Commons
Size: 2.98 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11390/1188987