Portmanteau vocabularies for multi-cue image representation / Khan, Fahad Shahbaz; Van De Weijer, Joost; Bagdanov, Andrew D.; Vanrell, Maria. - Electronic. - (2011), pp. 1-9. (Paper presented at the 25th Annual Conference on Neural Information Processing Systems 2011, NIPS 2011, held in Granada, Spain, in 2011).
Portmanteau vocabularies for multi-cue image representation
BAGDANOV, ANDREW DAVID;
2011
Abstract
We describe a novel technique for feature combination in the bag-of-words model of image classification. Our approach builds discriminative compound words from primitive cues learned independently from training images. Our main observation is that modeling joint-cue distributions independently is more statistically robust for typical classification problems than attempting to empirically estimate the dependent, joint-cue distribution directly. We use information-theoretic vocabulary compression to find discriminative combinations of cues, and the resulting vocabulary of portmanteau words is compact, has the cue binding property, and supports individual weighting of cues in the final image representation. State-of-the-art results on both the Oxford Flower-102 and Caltech-UCSD Bird-200 datasets demonstrate the effectiveness of our technique compared to other, significantly more complex approaches to multi-cue image representation.