Caltech
Center for Neuromorphic Systems Engineering


Part 1/ Rapid Visual Categorization in the Absence of Awareness
Rufin VanRullen

Humans can categorize natural scenes on the basis of the presence of a target object (e.g. an animal) so rapidly (150 ms) that such processing has been proposed to rely on the feed-forward propagation of information collected during the first milliseconds of visual stimulation. According to this view, early motor responses should be mostly unaffected by masking the visual stimulus after a few tens of milliseconds. We asked our subjects to respond to masked (SOA 26.6 ms) and unmasked natural scenes when they contained an animal. In addition, subjects rated their confidence in perceiving the contents of each masked image. For a majority of the scenes, masking effectively prevented awareness of the stimuli, as indicated by the fact that confidence ratings did not predict categorization accuracy (Kunimoto et al., 2001). For the same scenes, however, subjects responded significantly above chance level to the presence of animals. In addition, motor responses started to reflect correct categorizations at the same time for masked and unmasked stimuli, indicating that early responses in "normal" (unmasked) visual categorization probably also rely on the first milliseconds of stimulation. Similar results were obtained with simpler displays for which we could control stimulus and mask contrast. In that case the earliest motor responses to "perceived" and "unperceived" targets showed virtually identical distributions, strongly supporting the feed-forward model: information about the first milliseconds of visual stimulation can propagate throughout the visual system, unaffected by later changes, and determine behavior even when it is not (or not yet) available to consciousness.



Figure 1. Natural scene categorization (animal vs. non-animal) with and without a mask (3 subjects). A. Proportion of trials associated with each possible confidence rating. Zero confidence (more than 50% of the trials) corresponds to a situation where masking the scenes prevents visual awareness, as indicated by the fact that further confidence judgments for these images will fail to predict categorization accuracy. For all images including those associated with zero confidence ratings, subjects can perform the categorization task above chance (p<.0005). B. Distribution of reaction times to masked and unmasked scenes (20 ms time bins). Reaction times to masked targets differ from reaction times to masked distractors at the same time ('discrimination onset') as for unmasked stimuli. This suggests that in the general case of unmasked natural scenes, the visual system must also make use of the information extracted during the very first milliseconds of visual stimulation.
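The 'discrimination onset' in Figure 1B can be illustrated with a minimal sketch. It bins reaction times into 20 ms bins, as in the figure, and looks for the first bin from which responses to targets outnumber responses to distractors. The criterion used here (an excess of at least `min_excess` responses over `run` consecutive bins) is an assumption made for illustration, as are the function and parameter names; the statistical test actually used in the study is not stated on this page.

```python
from collections import Counter

BIN_MS = 20  # bin width reported in Figure 1B


def rt_histogram(rts_ms, bin_ms=BIN_MS):
    """Count reaction times per bin; each key is the bin's lower edge in ms."""
    return Counter(int(rt // bin_ms) * bin_ms for rt in rts_ms)


def discrimination_onset(target_rts, distractor_rts,
                         bin_ms=BIN_MS, min_excess=3, run=2):
    """Return the lower edge (ms) of the first bin that starts `run`
    consecutive bins in which responses to targets exceed responses to
    distractors by at least `min_excess`. Returns None if no such bin exists.

    Hypothetical criterion: a fixed count threshold stands in for the
    unspecified statistical test.
    """
    hits = rt_histogram(target_rts, bin_ms)
    false_alarms = rt_histogram(distractor_rts, bin_ms)
    if not hits:
        return None
    last_edge = max(max(hits), max(false_alarms, default=0))
    streak = 0
    for edge in range(0, last_edge + bin_ms, bin_ms):
        # Counter returns 0 for bins that received no responses.
        if hits[edge] - false_alarms[edge] >= min_excess:
            streak += 1
            if streak == run:
                return edge - (run - 1) * bin_ms
        else:
            streak = 0
    return None


# Made-up reaction times (ms), for illustration only; not data from the study.
example_target_rts = [205, 215, 290, 295, 300, 305, 310, 312, 318, 325, 330, 335]
example_distractor_rts = [210, 218, 292, 301]
example_onset = discrimination_onset(example_target_rts, example_distractor_rts)
```

Applying the same function separately to masked and unmasked trials would give two onsets to compare, which is the comparison the figure describes.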

Figure 2. By using letters as stimuli we can make sure that the contrast of the mask is comparable to the contrast of the stimulus. Under these conditions a purely feed-forward model predicts that fine temporal differences between stimulus and mask onset should be preserved up to the highest levels of the processing hierarchy, and observed in motor responses as well. A. Subjects are required to respond as fast as possible when the letter P is presented, and withhold responding when the letters R or B are displayed. In congruent trials, letters are flashed for 52 ms, while in incongruent and control trials two distinct letters are flashed successively for 26 ms each (the target followed by a distractor for incongruent trials, a distractor followed by the target for control trials). Under these conditions only the distractor letter is consciously perceived, due to backward- and forward-masking effects occurring respectively in incongruent and control trials. B. As predicted by the feed-forward model, responses to incongruent (masked) target trials follow the distribution of responses to congruent (unmasked) targets for a certain period of time after the discrimination onset. During this period, behavioral responses are only determined by the first 26 ms of stimulation. After this period the masking letter begins to affect responses, but it is only after more than 400 ms that reaction times will reflect the subject's perception of the stimulation.

Part 2/ Processing capacity for natural scenes and objects in the human visual system.
When a visual scene containing many discrete objects is presented to our retinae, only a subset of these objects will be explicitly represented in visual awareness. The number of objects accessing short-term visual memory might be even smaller. Finally, it is not known to what extent "ignored" objects (those that do not enter visual awareness) will be processed, or recognized. By combining free recall, forced-choice recognition and visual priming paradigms for the same natural visual scenes and subjects, we were able to estimate these numbers, and provide insights as to the fate of objects that are not explicitly recognized in a single fixation. When presented for 250 ms with a scene containing 10 distinct objects, human observers can remember up to 4 objects with full confidence. If forced to guess a further number of objects, they can reliably report between 2 and 3 more objects above chance level. These numbers depend on various factors such as target object size, eccentricity or familiarity. Finally, even the objects that the subjects consistently failed to report elicited a significant negative priming effect when presented in a subsequent task, suggesting that their identity was represented in high-level cortical areas of the visual system, before the corresponding neural activity was suppressed during attentional selection.


