We describe a novel technique for feature combination in the bag-of-words model of image classification. Our approach builds discriminative compound words from primitive cues learned independently from training images. Our main observation is that modeling joint-cue distributions independently is more statistically robust for typical classification problems than attempting to empirically estimate the dependent, joint-cue distribution directly. We use Information theoretic vocabulary compression to find discriminative combinations of cues and the resulting vocabulary of portmanteau words is compact, has the cue binding property, and supports individual weighting of cues in the final image representation. State-of-the-art results on both the Oxford Flower-102 and Caltech-UCSD Bird-200 datasets demonstrate the effectiveness of our technique compared to other, significantly more complex approaches to multi-cue image representation.

Portmanteau vocabularies for multi-cue image representation / Khan, Fahad Shahbaz; Van De Weijer, Joost; Bagdanov, Andrew D.; Vanrell, Maria. - ELETTRONICO. - (2011), pp. 1-9. (Intervento presentato al convegno 25th Annual Conference on Neural Information Processing Systems 2011, NIPS 2011 tenutosi a Granada, esp nel 2011).

Portmanteau vocabularies for multi-cue image representation

BAGDANOV, ANDREW DAVID;
2011

Abstract

We describe a novel technique for feature combination in the bag-of-words model of image classification. Our approach builds discriminative compound words from primitive cues learned independently from training images. Our main observation is that modeling joint-cue distributions independently is more statistically robust for typical classification problems than attempting to empirically estimate the dependent, joint-cue distribution directly. We use Information theoretic vocabulary compression to find discriminative combinations of cues and the resulting vocabulary of portmanteau words is compact, has the cue binding property, and supports individual weighting of cues in the final image representation. State-of-the-art results on both the Oxford Flower-102 and Caltech-UCSD Bird-200 datasets demonstrate the effectiveness of our technique compared to other, significantly more complex approaches to multi-cue image representation.
2011
Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011, NIPS 2011
25th Annual Conference on Neural Information Processing Systems 2011, NIPS 2011
Granada, esp
2011
Khan, Fahad Shahbaz; Van De Weijer, Joost; Bagdanov, Andrew D.; Vanrell, Maria
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in FLORE sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificatore per citare o creare un link a questa risorsa: https://hdl.handle.net/2158/1014809
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 53
  • ???jsp.display-item.citation.isi??? ND
social impact