Importance of binaural cues of depth in low-resolution audio-visual 3D scene reproductions

Salvati, Daniele; Drioli, Carlo; Fontana, Federico; Foresti, Gian Luca
2018-01-01

Abstract

Although clearly audible, auditory depth cues have been shown to add generally imprecise information to a 3D scene description. We hypothesize that, conversely, this information becomes salient when a scene is reproduced with low visual resolution. To test this hypothesis, a system has been built by assembling inexpensive audio-visual reproduction technologies. The system forms a 3D visual scene from two screen images that are polarized orthogonally before reaching the observer, who wears polarized glasses. In parallel, two small loudspeakers are arranged in a stereo dipole configuration to create a binaural hot-spot using a cross-talk cancellation solution. Sounds and images are recorded from a real scene using a stereo camera and a pair of microphones, mounted together so as to reproduce average anthropometric inter-eye and inter-aural distances. Using this system, we have measured that binaural instead of monophonic feedback significantly improves the precision of participants who were asked to estimate the time-to-passage of a ball rolling down toward them along a rectilinear trajectory. Preliminary results suggest that participants made effective use of the binaural rolling sounds coming from the approaching ball to improve their estimates.
2018
978-1-5386-5713-3
Use this identifier to cite or link to this document: https://hdl.handle.net/11390/1142569