Enriching and Localizing Semantic Tags in Internet Videos / Lamberto Ballan; Marco Bertini; Alberto Del Bimbo; Giuseppe Serra. - ELECTRONIC. - (2011), pp. 1541-1544. (Paper presented at the ACM Multimedia conference held in Scottsdale, AZ, USA, November 28-December 1) [10.1145/2072298.2072060].

Enriching and Localizing Semantic Tags in Internet Videos

BALLAN, LAMBERTO;BERTINI, MARCO;DEL BIMBO, ALBERTO;SERRA, GIUSEPPE
2011

Abstract

Tagging of multimedia content is becoming increasingly widespread as Web 2.0 sites, such as Flickr and Facebook for images and YouTube and Vimeo for videos, have popularized tagging among their users. These user-generated tags are used to retrieve multimedia content and to ease browsing and exploration of media collections, e.g. using tag clouds. However, not all media are tagged equally by users: with current browsers it is easy to tag a single photo, and even tagging part of a photo, such as a face, has become common on sites like Flickr and Facebook. Tagging a video sequence, on the other hand, is more complicated and time-consuming, so users tag only the overall content of a video. In this paper we present a system for automatic video annotation that increases the number of tags originally provided by users and localizes them temporally by associating tags with shots. This approach exploits the collective knowledge embedded in tags and Wikipedia, together with the visual similarity between key frames and images uploaded to social sites like YouTube and Flickr.
Year: 2011
Published in: Proc. of ACM Multimedia
Conference: ACM Multimedia
Location: Scottsdale, AZ, USA
Dates: November 28-December 1
Authors: Lamberto Ballan; Marco Bertini; Alberto Del Bimbo; Giuseppe Serra

Documents in FLORE are protected by copyright, with all rights reserved, unless otherwise indicated.

Use this identifier to cite or link to this resource: https://hdl.handle.net/2158/457256