Do textual descriptions help action recognition? / Bruni, Matteo; Uricchio, Tiberio; Seidenari, Lorenzo; Del Bimbo, Alberto. - ELECTRONIC. - (2016), pp. 645-649. (Paper presented at the ACM Multimedia conference held in gbr in 2016) [10.1145/2964284.2967301].

Do textual descriptions help action recognition?

Bruni, Matteo; Uricchio, Tiberio; Seidenari, Lorenzo; Del Bimbo, Alberto
2016

Abstract

In this paper we present a novel method to improve action recognition by leveraging a set of captioned videos. By learning linear projections that map videos and text onto a common space, our approach obtains improved results on unseen videos. We also propose a novel structure-preserving loss that further improves the quality of the projections. We tested our method on the challenging, realistic Hollywood2 action recognition dataset, where it obtains a considerable gain in performance. We show that the gain is proportional to the number of training samples used to learn the projections.
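
The abstract does not spell out the objective used to learn the projections, so the following is only a minimal sketch of the general idea of mapping videos and captions into a common space with linear projections. The bidirectional hinge ranking loss, the margin value, the feature dimensions, and all names (`project`, `bidirectional_ranking_loss`, `W_video`, `W_text`) are assumptions made for illustration and are not taken from the paper. The structure-preserving loss mentioned above is not sketched here.

```python
import numpy as np

def project(features, weights):
    """Linearly project row-vector features and L2-normalise each row."""
    z = features @ weights
    norms = np.linalg.norm(z, axis=1, keepdims=True)
    return z / np.maximum(norms, 1e-12)

def bidirectional_ranking_loss(video_emb, text_emb, margin=0.1):
    """Hinge ranking loss over a batch of matched (video, caption) pairs.

    Row i of video_emb is assumed to match row i of text_emb.
    This loss is an illustrative assumption, not the paper's objective.
    """
    sim = video_emb @ text_emb.T      # cosine similarities, shape (n, n)
    pos = np.diag(sim)                # similarity of each matched pair
    # Video i: a mismatched caption j should score below caption i.
    v2t = np.maximum(0.0, margin + sim - pos[:, None])
    # Caption j: a mismatched video i should score below video j.
    t2v = np.maximum(0.0, margin + sim - pos[None, :])
    off_diag = ~np.eye(sim.shape[0], dtype=bool)
    return (v2t[off_diag].sum() + t2v[off_diag].sum()) / sim.shape[0]

rng = np.random.default_rng(0)
n, d_video, d_text, d_common = 8, 64, 32, 16   # hypothetical sizes
videos = rng.normal(size=(n, d_video))          # stand-in video descriptors
captions = rng.normal(size=(n, d_text))         # stand-in caption descriptors

# Randomly initialised projections; in practice these would be learned
# by minimising the loss over the captioned training videos.
W_video = rng.normal(scale=0.1, size=(d_video, d_common))
W_text = rng.normal(scale=0.1, size=(d_text, d_common))

video_emb = project(videos, W_video)
text_emb = project(captions, W_text)
loss = bidirectional_ranking_loss(video_emb, text_emb)
```

In a setup like this, an unseen video would be projected with the learned `W_video` and its embedding passed to an action classifier; that pipeline is likewise an assumption rather than a description of the authors' system.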
MM 2016 - Proceedings of the 2016 ACM Multimedia Conference
Files in this item:
File: p645-bruni.pdf
Access: Closed access
Type: Publisher's PDF (Version of record)
License: DRM not defined
Size: 798.62 kB
Format: Adobe PDF

Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this resource: https://hdl.handle.net/2158/1065602
Citations
  • Scopus: 3