Fisher Encoded Convolutional Bag-of-Windows for Efficient Image Retrieval and Social Image Tagging / Uricchio, Tiberio; Bertini, Marco; Seidenari, Lorenzo; Del Bimbo, Alberto. - Electronic. - (2015), pp. 9-15. (Paper presented at the International Conference on Computer Vision Workshops 2015) [10.1109/ICCVW.2015.134].

Fisher Encoded Convolutional Bag-of-Windows for Efficient Image Retrieval and Social Image Tagging

URICCHIO, TIBERIO;BERTINI, MARCO;SEIDENARI, LORENZO;DEL BIMBO, ALBERTO
2015

Abstract

In this paper we present an efficient and accurate method to aggregate a set of Deep Convolutional Neural Network (CNN) responses extracted from a set of image windows. CNN features are usually computed on the whole frame or with a dense multi-scale approach. There is evidence that using multiple windows yields a better image representation; nonetheless, it is still not clear how windows should be sampled and how CNN responses should be aggregated. Instead of sampling the image densely in scale and space, we show that selecting a few hundred windows is enough to obtain an effective image signature. We show how to use Fisher Vectors and PCA to obtain a short and highly descriptive signature that can be used effectively for image retrieval. We test our method on two relevant computer vision tasks: image retrieval and image tagging. We report state-of-the-art results for both tasks on three standard datasets.
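To make the aggregation step concrete, the following is a minimal sketch of the standard improved Fisher Vector encoding (gradients with respect to the means and standard deviations of a diagonal-covariance Gaussian mixture, followed by power and L2 normalization). It is an illustration under stated assumptions, not the authors' implementation: the function name `fisher_vector`, the toy mixture parameters, and the 2-dimensional "window descriptors" are all hypothetical. In practice the descriptors would be CNN responses of the sampled windows and the mixture would be learned from training data. The abstract does not say where PCA is applied, so that step is omitted here.

```python
import math


def fisher_vector(descriptors, weights, means, sigmas):
    """Encode a set of descriptors as one improved Fisher Vector.

    descriptors: list of T vectors (here: one vector per image window).
    weights, means, sigmas: parameters of a K-component diagonal GMM,
    assumed to be already trained (hypothetical values in the example).
    Returns a list of length 2 * K * D.
    """
    T, K, D = len(descriptors), len(weights), len(means[0])
    g_mu = [[0.0] * D for _ in range(K)]
    g_sigma = [[0.0] * D for _ in range(K)]

    for x in descriptors:
        # Log of w_k * N(x | mu_k, diag(sigma_k^2)) for every component.
        log_p = []
        for k in range(K):
            lp = math.log(weights[k])
            for d in range(D):
                z = (x[d] - means[k][d]) / sigmas[k][d]
                lp -= 0.5 * z * z + math.log(sigmas[k][d]) + 0.5 * math.log(2 * math.pi)
            log_p.append(lp)

        # Posterior (soft assignment) of each component, computed stably.
        top = max(log_p)
        unnorm = [math.exp(lp - top) for lp in log_p]
        total = sum(unnorm)

        # Accumulate first- and second-order statistics.
        for k in range(K):
            gamma = unnorm[k] / total
            for d in range(D):
                z = (x[d] - means[k][d]) / sigmas[k][d]
                g_mu[k][d] += gamma * z
                g_sigma[k][d] += gamma * (z * z - 1.0)

    # Concatenate the normalized gradients: all means first, then all sigmas.
    fv = []
    for k in range(K):
        fv.extend(g / (T * math.sqrt(weights[k])) for g in g_mu[k])
    for k in range(K):
        fv.extend(g / (T * math.sqrt(2.0 * weights[k])) for g in g_sigma[k])

    # Power normalization (signed square root), then L2 normalization.
    fv = [math.copysign(math.sqrt(abs(v)), v) for v in fv]
    norm = math.sqrt(sum(v * v for v in fv))
    return [v / norm for v in fv] if norm > 0 else fv


# Toy example: three 2-D "window descriptors" and a 2-component mixture.
weights = [0.5, 0.5]
means = [[0.0, 0.0], [3.0, 3.0]]
sigmas = [[1.0, 1.0], [1.0, 1.0]]
windows = [[0.1, -0.2], [2.9, 3.3], [0.4, 0.1]]
signature = fisher_vector(windows, weights, means, sigmas)
```

The signature length depends only on the mixture size and descriptor dimension (2 x K x D), not on the number of windows, so images with different numbers of sampled windows can be compared directly, for example with a dot product.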
Proceedings of the IEEE International Conference on Computer Vision Workshops, 2015
Use this identifier to cite or link to this resource: https://hdl.handle.net/2158/1052774