Action Categorization in Soccer Videos using String Kernels / Lamberto Ballan; Marco Bertini; Alberto Del Bimbo; Giuseppe Serra. - Electronic. - (2009), pp. 13-18. (Paper presented at the 7th IEEE International Workshop on Content-Based Multimedia Indexing (CBMI), held in Chania, Greece, June 3-5) [10.1109/CBMI.2009.10].

Action Categorization in Soccer Videos using String Kernels

BALLAN, LAMBERTO;BERTINI, MARCO;DEL BIMBO, ALBERTO;SERRA, GIUSEPPE
2009

Abstract

Action recognition is a crucial task for providing a high-level semantic description of video content, particularly in the case of sports videos. The bag-of-words (BoW) approach has proven successful for the categorization of objects and scenes in images, but it cannot model temporal information between consecutive frames for video event recognition. In this paper, we present an approach that models actions as a sequence of histograms (one for each frame), each represented using a traditional bag-of-words model. Actions are thus described by a string (phrase) of variable size, depending on the clip's length, where each frame's representation is treated as a character. To compare these strings we use the Needleman-Wunsch distance, a metric from information theory that handles strings of different lengths. Finally, SVMs with a string kernel that includes this distance are used to perform classification. Experimental results demonstrate the validity of the proposed approach and show that it outperforms baseline kNN classifiers.
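To make the pipeline in the abstract concrete, the following is a minimal sketch of a Needleman-Wunsch distance between two clips, each given as a sequence of per-frame BoW histograms, followed by a kernel built from that distance. It is an illustration under stated assumptions, not the authors' implementation: the chi-square substitution cost, the fixed gap penalty, the exponential kernel form, and all function and parameter names (`chi2_cost`, `needleman_wunsch_distance`, `string_kernel`, `gap`, `gamma`) are hypothetical choices that the abstract does not specify.

```python
import math


def chi2_cost(h1, h2):
    """Chi-square distance between two L1-normalized histograms (range 0..1).

    Assumption: the abstract does not say how two "characters" (frame
    histograms) are compared; chi-square is one common choice.
    """
    total = 0.0
    for a, b in zip(h1, h2):
        if a + b > 0:
            total += (a - b) ** 2 / (a + b)
    return 0.5 * total


def needleman_wunsch_distance(s, t, gap=1.0, cost=chi2_cost):
    """Minimum global alignment cost between two histogram sequences.

    s and t may have different lengths; `gap` is the (assumed) penalty
    for leaving a frame unaligned.
    """
    n, m = len(s), len(t)
    # prev[j] holds the cost of aligning the first i-1 frames of s
    # with the first j frames of t.
    prev = [j * gap for j in range(m + 1)]
    for i in range(1, n + 1):
        curr = [i * gap] + [0.0] * m
        for j in range(1, m + 1):
            curr[j] = min(
                prev[j - 1] + cost(s[i - 1], t[j - 1]),  # align the two frames
                prev[j] + gap,                           # skip a frame of s
                curr[j - 1] + gap,                       # skip a frame of t
            )
        prev = curr
    return prev[m]


def string_kernel(s, t, gamma=1.0):
    """Similarity derived from the alignment distance (assumed form)."""
    return math.exp(-gamma * needleman_wunsch_distance(s, t))


# Two toy clips of different length over a 2-word visual vocabulary.
clip_a = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
clip_b = [[1.0, 0.0], [0.0, 1.0]]
print(needleman_wunsch_distance(clip_a, clip_b))
print(string_kernel(clip_a, clip_b))
```

A Gram matrix of `string_kernel` values over all training clips could then be passed to an SVM that accepts precomputed kernels. A kernel of this exponential form is not guaranteed to be positive semidefinite in general, so this should be read as a starting point rather than a validated design.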
Year: 2009
Published in: Proc. of IEEE International Workshop on Content-Based Multimedia Indexing (CBMI)
Conference: 7th IEEE International Workshop on Content-Based Multimedia Indexing (CBMI)
Location: Chania, Greece
Dates: June 3-5
Authors: Lamberto Ballan; Marco Bertini; Alberto Del Bimbo; Giuseppe Serra

Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this resource: https://hdl.handle.net/2158/360333