The VISTA datasets, a combination of inertial sensors and depth cameras data for activity recognition / Fiorini L.; Cornacchia Loizzo F.G.; Sorrentino A.; Rovini E.; Di Nuovo A.; Cavallo F. - In: SCIENTIFIC DATA. - ISSN 2052-4463. - ELECTRONIC. - 9:(2022), pp. 0-0. [10.1038/s41597-022-01324-3]

The VISTA datasets, a combination of inertial sensors and depth cameras data for activity recognition

Fiorini L.; Sorrentino A.; Rovini E.; Cavallo F.
2022

Abstract

This paper makes the VISTA database, composed of inertial and visual data, publicly available for gesture and activity recognition. The inertial data were acquired with the SensHand, which captures the movement of the wrist, thumb, index and middle fingers, while the RGB-D visual data were acquired simultaneously from two points of view, front and side. The VISTA database was acquired in two experimental phases: in the first, the participants were asked to perform 10 different actions; in the second, they had to execute five scenes of daily living, each corresponding to a combination of the selected actions. In both phases, the Pepper robot interacted with the participants. The two camera points of view mimic the different points of view of Pepper. Overall, the dataset includes 7682 action instances for the training phase and 3361 action instances for the testing phase. It can be seen as a framework for future studies on artificial intelligence techniques for activity recognition using inertial-only data, visual-only data, or a sensor fusion approach.
Fiorini L.; Cornacchia Loizzo F.G.; Sorrentino A.; Rovini E.; Di Nuovo A.; Cavallo F.
Files in this record:
2022 - Scientific Data_ VISTA dataset_Fiorini.pdf

Access: closed

Type: Publisher's PDF (Version of record)
License: Open Access
Size: 3.37 MB
Format: Adobe PDF

Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this resource: https://hdl.handle.net/2158/1281182