Under fast viewing conditions, the visual system extracts salient and simplified representations of complex visual scenes. Eye movements optimize such visual analysis through the dynamic sampling of the most informative and salient regions in the scene. However, the definition of saliency, as well as its role in natural active vision, is still a matter of discussion. The present thesis is based on a recent constrained maximum-entropy model of early vision (Del Viva et al., 2013), which addresses the problem of extracting biologically relevant information from a large flux of input data in the shortest possible time for survival purposes. According to this model, the visual system produces an early saliency map of a visual scene by selecting a limited number of local features, based on criteria of maximal entropy coupled with strict limitations on computational resources. Applied to natural images, the model extracts a set of optimal information-carrying features as candidate “salient” features. The present thesis includes four studies that aim to assess the visual saliency of these optimally informative features, adding further evidence that confirms the predictions of the reference model. In particular, we are interested in understanding the role of optimal visual features in the creation of a bottom-up saliency map that the oculomotor system could use to drive eye movements toward potentially relevant locations and, ultimately, in their contribution to image reconstruction. In our experiments, the results obtained with optimal features are compared to those obtained with features that do not meet the optimality criteria required by the model and are therefore discarded and considered non-optimal. Considering that luminance contrast has a central role in determining saliency in fast vision, we also compare the effects induced by optimal versus non-optimal features to those obtained with features of different luminance.
Before describing our experiments, Chapter 1 outlines the main properties of visual analysis, with some notes about eye movements. It then discusses the limitations of our visual system and the resulting need for data reduction in fast vision. Most of Chapter 1 is dedicated to the presentation of the reference model of early vision, with an extensive discussion of the main computational and behavioral results found in previous works. Each subsequent chapter is dedicated to the presentation of one study. Although all the studies share the same final objective, each one has its own rationale and a specific research question to answer. Chapter 2 presents the literature about the saliency map and then describes Study 1, which involves perceptual and eye movement tasks. In this study, optimal features were presented in isolation to investigate whether they are considered visually more salient than non-optimal features, even in the absence of any meaningful global arrangement and semantic context. Chapter 3 summarizes the topic of covert and overt attention and then presents Study 2, in which we implicitly tested the bottom-up saliency driven by optimal features by engaging participants in covert attentional and gaze-orienting cued tasks, without explicitly requiring them to pay attention to stimulus saliency. Chapter 4 first discusses some properties of saccades and how visual distractors influence their trajectories. It then presents Study 3, in which we compared the effects on saccade trajectories produced by optimal vs. non-optimal features used as distractors in a saccadic task, taking the magnitude of curvature as a measure of feature saliency. Chapter 5 describes the problem of occluded objects in real scenes and how our visual system can recognize the whole image based on only a small amount of fragmented information. Study 4, presented in this chapter, explores whether optimal features also play a significant role in more natural settings, investigating the contribution of optimal local information contained in a few visible fragments to image discrimination in fast vision. Finally, Chapter 6 is dedicated to the discussion of the results obtained in our studies.
Overall, the results show that optimal features are considered visually salient, can automatically attract attention, interfere with the path of saccades, and partially contribute to image discrimination. In contrast, non-optimal features do not produce the same effects. These findings suggest that optimally informative local features get preferential treatment during fast image analysis and automatically guide attention and eye movements to create a bottom-up saliency map. Note that, according to the reference model, optimal features represent a compromise between the amount of information they carry about the visual scene and the cost for the system to process them, whereas the non-optimal features used in our experiments are individually the most informative but also imply large computational costs. Our findings therefore suggest that not only the amount of information but also the need to save computational resources plays a significant role in shaping what the visual system considers to be salient. Interestingly, all the effects found with optimal features are similar to those obtained with high-luminance features, suggesting that the saliency determined by information maximization criteria produces effects comparable to those due to luminance-based saliency. We also note that our studies employ some novel paradigms that may be useful tools to test the relative saliency of different stimuli in future research.
To conclude, the findings presented in this thesis suggest that visual saliency may emerge naturally in a system that, under the pressure of fast visual analysis, maximizes information transmission under computational constraints, as predicted by the reference model.

Assessing visual saliency of informative local features with psychophysics and eye movements / Serena Castellotti. - (2023).

Assessing visual saliency of informative local features with psychophysics and eye movements

Serena Castellotti
2023

Maria Michela Del Viva
Italy
Files in this item:
File: Tesi Serena Castellotti.pdf
Access: open access
Type: Publisher's PDF (Version of record)
License: Open Access
Size: 11.26 MB
Format: Adobe PDF

Documents in FLORE are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this item: https://hdl.handle.net/2158/1321111