Multimodal Intention Recognition for Dynamic Tool Sharing in Anthropocentric Human-Robot Collaborative Applications
Ciaghi D.; Vidoni R.
2026-01-01
Abstract
In the context of Industry 5.0, designing anthropocentric human-robot collaborative applications is essential. Moreover, the ability to share production resources, such as tools, in a safe, ergonomic, and efficient manner is a prerequisite for seamless human-robot collaboration. This work develops a tracking system that combines the motion and gaze of an operator to infer their intentions during a collaborative task involving a shared tool in a manufacturing context. Wearable inertial sensors, combined with a kinematic model of the human, were used to track the operator’s actions, while eye-tracking glasses were employed to determine the operator’s gaze point relative to the shared tool. A neural network was trained on the wearable sensor data to predict the operator’s intention to perform a task. A practical case study was designed, and the developed tracking system was validated through multiple tests assessing reaction time, stopping distance, and reliability. The results indicate that a multimodal approach combining both tracking solutions significantly improves the system’s ability to infer the operator’s intentions. By combining the eye-tracking and motion-tracking systems, reaction time was progressively reduced from 2.05 s to 1.62 s, and the rate of incorrectly recognized human actions dropped from 37.5% to 7.5%. Overall, the tracking system correctly recognized the operator’s movements 95% of the time. These improvements enhance safety and ergonomics in collaborative environments, particularly in scenarios involving shared production tools.
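
The abstract does not specify the network architecture or the fusion scheme. The following is a minimal, purely illustrative PyTorch sketch of how a window of inertial (IMU) features and a gaze-on-tool signal could be fused by a small classifier to estimate intention; all names (IntentionNet, imu_window, gaze_on_tool), layer sizes, and feature dimensions are hypothetical assumptions, not the authors' implementation.

import torch
import torch.nn as nn

class IntentionNet(nn.Module):
    """Illustrative late-fusion classifier: IMU window + gaze flag -> intention probability."""

    def __init__(self, imu_features: int = 6, window: int = 50):
        super().__init__()
        # Encode a flattened IMU window (e.g., 3-axis accel + 3-axis gyro over 50 samples).
        self.imu_encoder = nn.Sequential(
            nn.Linear(imu_features * window, 64),
            nn.ReLU(),
            nn.Linear(64, 16),
            nn.ReLU(),
        )
        # Late fusion: concatenate the IMU embedding with a binary gaze-on-tool flag.
        self.head = nn.Sequential(
            nn.Linear(16 + 1, 8),
            nn.ReLU(),
            nn.Linear(8, 1),
            nn.Sigmoid(),  # probability that the operator intends to take the shared tool
        )

    def forward(self, imu_window: torch.Tensor, gaze_on_tool: torch.Tensor) -> torch.Tensor:
        # imu_window: (batch, window * imu_features); gaze_on_tool: (batch, 1)
        z = self.imu_encoder(imu_window)
        return self.head(torch.cat([z, gaze_on_tool], dim=1))

# Example: one 50-sample window of 6 IMU channels, with gaze currently on the tool.
model = IntentionNet()
p = model(torch.randn(1, 6 * 50), torch.ones(1, 1))
print(f"Estimated intention probability: {p.item():.2f}")

In such a scheme, gating or weighting the motion-based prediction by the gaze signal is one plausible way the reported gains (shorter reaction time, fewer misrecognized actions) could arise, since gaze typically precedes reaching motion; the paper itself should be consulted for the actual fusion strategy.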


