Ego and Exo Views for an Object-Level Human Behavior Analysis and Understanding Through Tracking in Retail Spaces
Dunnhofer M.; Micheloni C.
2026-01-01
Abstract
Physical retail stores increasingly integrate computer vision (CV) and deep learning (DL) technologies to analyze customer behavior and bridge the gap with data-rich online shopping experiences. Traditional methods for monitoring in-store activities, such as sales data analysis, direct observation, and surveys, are often labor-intensive and imprecise, and they lack real-time insights. Recent advances in DL enable automated, large-scale tracking of customer movements and interactions, allowing retailers to make data-driven decisions on store layout, product placement, and personalized services. However, analyzing in-store behavior remains challenging due to occlusions, dynamic movements, and the limitations of single-view camera setups. To address these challenges, we introduce E-EYE, a novel DL-driven framework leveraging EGO-EXO vision, which fuses first-person (egocentric) and third-person (exocentric) camera perspectives to capture both fine-grained customer-object interactions and broader in-store activities. The proposed system consists of three key modules: (i) a First-Person View (FPV) Module, utilizing wearable cameras to capture detailed hand-object interactions; (ii) a Third-Person View (TPV) Module, integrating fixed environmental cameras for people detection, tracking, and scene-wide activity recognition; and (iii) an Object Tracking Module (OBT), ensuring long-term tracking of products and shopping tools across views. By combining these perspectives, our approach mitigates occlusion issues, enhances tracking robustness, and provides a global-to-local understanding of customer behavior. Our framework enables advanced retail analytics, offering insights into product engagement and customer trajectories throughout the store. It is scalable, adaptable to various retail environments, and deployable on cloud-based infrastructures. Experimental results demonstrate the effectiveness of E-EYE, highlighting its potential to revolutionize in-store customer behavior analysis.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


