Exploring the Potential and Limitations of Large Language Models to Control the Behavior of Embodied Persuasive Agents
Corrò C.; Chittaro L.
2025-01-01
Abstract
Interactive agents are an essential element of many persuasive applications. Their design and development have so far required extensive human effort to model their appearance and behavior. However, recent advances in the generative capabilities of Large Language Models (LLMs) might pave the way for building persuasive agents capable of autonomous, open-ended interactions without requiring the traditional investment in agent development. In this paper, we investigate the creation of an LLM-based embodied agent that interacts with users in real time to coach them in performing slow and deep breathing. In the approach we followed, the LLM uses a text-based context to generate a composition of predefined behaviors for interacting with the user through both verbal and nonverbal communication. The text-based context provided to the LLM described the details needed to monitor the exercise, such as the user’s respiratory rate. Information about the user’s actual breathing was provided to the LLM through a physiological sensor. The LLM-based breathing coach managed to follow the exercise structure and generated believable, contingent behavior compositions. However, as we describe in the paper, building and evaluating the system allowed us to highlight limitations of using only LLMs to create agents capable of real-time user interactions. The identified limitations suggest a need for hybrid approaches.


