
Towards temporal saliency detection: better video understanding for richer TV experiences / Dumoulin, Joël; Mugellini, Elena; Abou Khaled, Omar; Bertini, Marco; Del Bimbo, Alberto. - ELECTRONIC. - (2014), pp. 199-202. (Paper presented at ICDS 2014, The Eighth International Conference on Digital Society).

Towards temporal saliency detection: better video understanding for richer TV experiences

Dumoulin, Joël; Bertini, Marco; Del Bimbo, Alberto
2014

Abstract

Increasingly popular, Smart TVs and set-top boxes open new ways to richer experiences in our living rooms. But to offer richer and novel functionalities, a better understanding of the multimedia content is crucial. While many works try to automatically annotate videos at the object level, or to classify them, we believe that investigating emotions will allow great improvements of the TV experience. In this work, we propose a temporal saliency detection approach capable of identifying the most exciting parts of a video, those likely to be of most interest to users. To identify the most interesting events without classifying them (so as to remain independent of the video domain), we compute a time series of arousal (the excitement level of the content) based on audio-visual features. Our goal is to merge this preliminary work with the analysis of user emotions, in order to create a multi-modal system that bridges the gap between users' needs and multimedia content.
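As a rough illustration of the arousal-based saliency idea summarized above, the sketch below (our own illustration, not the authors' implementation) combines two placeholder per-window features, audio energy and motion intensity, into a smoothed arousal time series and reports the segments where it peaks. All function names, feature weights, the smoothing window, and the threshold are illustrative assumptions.

```python
# Minimal sketch, assuming per-window audio-visual features are already
# extracted elsewhere; random placeholders stand in for the real features.
import numpy as np

def normalize(x):
    """Scale a feature curve to [0, 1] so different features can be combined."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min() + 1e-9)

def arousal_curve(audio_energy, motion_intensity, weights=(0.5, 0.5), smooth=25):
    """Combine normalized features into one arousal time series and smooth it."""
    combined = weights[0] * normalize(audio_energy) + weights[1] * normalize(motion_intensity)
    kernel = np.kaiser(smooth, beta=5.0)   # smoothing window (assumed choice)
    kernel /= kernel.sum()
    return np.convolve(combined, kernel, mode="same")

def salient_segments(arousal, threshold=0.7):
    """Return (start, end) window indices where arousal exceeds a relative threshold."""
    above = arousal > threshold * arousal.max()
    edges = np.flatnonzero(np.diff(above.astype(int)))
    bounds = np.concatenate(([0], edges + 1, [len(arousal)]))
    return [(bounds[i], bounds[i + 1]) for i in range(len(bounds) - 1) if above[bounds[i]]]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_windows = 600                        # e.g., one analysis window per second
    audio = rng.random(n_windows)          # placeholder audio energy
    motion = rng.random(n_windows)         # placeholder motion intensity
    curve = arousal_curve(audio, motion)
    print(salient_segments(curve))         # candidate "most exciting" segments
```

Because only the arousal level is thresholded, the selected segments do not depend on any event classifier, which matches the domain-independence goal stated in the abstract.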
2014
ICDS 2014, The Eighth International Conference on Digital Society
Dumoulin, Joël; Mugellini, Elena; Abou Khaled, Omar; Bertini, Marco; Del Bimbo, Alberto
Files in this product:
There are no files associated with this product.

Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this resource: https://hdl.handle.net/2158/1008862