Fiorini, Laura; Adelucci, Elena; Pugi, Lorenzo; Scatigna, Stefano; Pecini, Chiara; Cavallo, Filippo (2025). Multimodal Attention Evaluation in Child–Robot Storytelling: A Machine Learning Framework with NAO. In: Proceedings of the 2025 IEEE International Conference on Advanced Robotics (ICAR 2025), pp. 253-260. DOI: 10.1109/icar65334.2025.11338710.
Multimodal Attention Evaluation in Child–Robot Storytelling: A Machine Learning Framework with NAO
Fiorini, Laura; Adelucci, Elena; Pugi, Lorenzo; Scatigna, Stefano; Pecini, Chiara; Cavallo, Filippo
2025
Abstract
Understanding and measuring children's attention is a key challenge in educational human-robot interaction (HRI). This paper presents a novel evaluation framework for detecting attention in school-age children during a storytelling activity with the NAO social robot. Unlike traditional scenarios where the robot narrates, here the child actively tells a story while the robot listens and adapts, fostering engagement through constructivist and sociocultural learning principles. We integrate multimodal features - including gaze behaviour (automatically labelled using Gaze360 and K-means clustering), task performance, and physiological signals (heart rate) - to classify attention levels via machine learning methods (SVM, KNN, RF). Seventy-four children (aged 7-9) participated in the study, with attention labels validated by expert observers. Results show that combining gaze, task, and physiological features improves classification accuracy (>0.70) compared to unimodal approaches, with KNN achieving the best performance. These findings highlight the potential of multimodal attention detection for enabling adaptive, context-aware robot behaviours in education. Our framework advances the development of socially intelligent robots that can dynamically respond to children's attentional states, ultimately supporting more engaging and effective learning experiences.
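The multimodal classification step described in the abstract can be sketched as follows. This is an illustrative example only, not the authors' implementation: the feature names, synthetic data, and toy labelling rule are all assumptions made for demonstration, standing in for the study's gaze, task-performance, and heart-rate features fed to a KNN classifier.

```python
# Minimal sketch of multimodal attention classification with KNN.
# All data below is synthetic; feature names are hypothetical placeholders
# for the gaze, task, and physiological modalities named in the abstract.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
n = 200  # hypothetical number of labelled observation windows

# Synthetic per-window features (placeholders, not the study's data):
gaze_on_robot = rng.uniform(0, 1, n)   # fraction of gaze directed at the robot
task_score = rng.uniform(0, 1, n)      # storytelling task-performance score
heart_rate = rng.normal(90, 10, n)     # heart rate in beats per minute

# Toy attention label for illustration: "attentive" when gaze and task
# performance are jointly high (NOT the expert-validated labels of the study).
y = ((gaze_on_robot + task_score) > 1.0).astype(int)

# Fuse the three modalities into one feature matrix (feature-level fusion).
X = np.column_stack([gaze_on_robot, task_score, heart_rate])
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Standardise features so KNN distances are not dominated by heart rate's scale.
clf = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
clf.fit(X_tr, y_tr)
acc = clf.score(X_te, y_te)
print(f"multimodal KNN accuracy: {acc:.2f}")
```

Standardising before KNN matters here because the heart-rate feature spans a much larger numeric range than the gaze and task features; without scaling, it would dominate the Euclidean distance computation.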



