IMEmo: An Interpersonal Relation Multi-Emotion Dataset / Guerdelli H.; Ferrari C.; Berretti S.; Del Bimbo A. - Electronic. - 12:(2024), pp. 1-10. (Paper presented at the IEEE 18th International Conference on Automatic Face and Gesture Recognition, held in Istanbul, Türkiye, 27-31 May 2024) [10.1109/FG59268.2024.10581895].
IMEmo: An Interpersonal Relation Multi-Emotion Dataset
Guerdelli H.; Berretti S.; Del Bimbo A.
2024
Abstract
While engaged in a face-to-face conversation, being capable of understanding the attitude, emotion, and intention of another person allows one to guide his/her behavior, establishing a comfortable communication, both verbal and non-verbal (i.e., body and face language). This paper introduces the IMEmo Interpersonal Multi-Emotion video dataset, a new in-the-wild dataset of face-to-face interactions, built from movies of the romance and drama genres. We manually collected over 100 clips from different movies in different languages. The dataset consists of 79.3 minutes of scenes, with the duration of each clip ranging between 0.20 and 2.13 minutes. Each clip contains two people communicating with each other both verbally and non-verbally, through facial expressions, body pose, and gestures. Currently, it includes age, gender, emotion, social relationship, action, and valence/arousal annotations for both individuals. Emotion recognition results using a baseline CNN approach are also reported to provide an estimate of the difficulty of the data, also in comparison with existing benchmarks.
File | Size | Format |
---|---|---|
IMEmo_An_Interpersonal_Relation_Multi-Emotion_Dataset.pdf | 1.07 MB | Adobe PDF |

Closed access
Description: final file
Type: Final refereed version (Postprint, Accepted manuscript)
License: All rights reserved
Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.