Rapid Natural Scene Categorization without Attention
Fei Fei Li, Rufin VanRullen, Christof Koch, Pietro Perona
Visual attention plays an important role as we walk around the world and
recognize different objects. So what happens when attention is taken away?
Are we still able to recognize scenes or objects? Our study finds that certain
high-level tasks, such as natural scene categorization, can still be
performed with little or no attention.
Abstract. What can we see when we do not pay attention? While attention
is not necessary for some detection tasks on simple synthetic stimuli,
without it we are "blind" even to major aspects of a natural complex
scene. It would thus appear that only visual tasks that have an explanation
in the early stages of the visual system may be carried out without
attention. We report on a complex visual task that requires no attention.
Our subjects can rapidly detect animals in briefly presented natural
scenes while simultaneously performing another visual task that demands
full attention. By comparison, they are unable to discriminate large
'T's from 'L's under the same conditions. We conclude that attention may
not be necessary for some visual tasks that are associated with 'high
level' cortical areas.
Motivation. Psychologists have long known that certain visual search
tasks do not require attention. A hallmark of inattentive vision is
that it operates in parallel: an inattentive task may be
carried out simultaneously with other visual tasks, and target detection
does not become more difficult when the number of distractors is increased.
For example, the following three panels show inattentive
tasks taken from the 1998 study by Braun and colleagues.
However, none of the known inattentive tasks approaches the sophistication
of everyday vision, where complex scenes must be scrutinized in order
to assess high-level properties such as the presence of danger or the
structure of a social interaction. On the other hand, previous studies
have suggested remarkably rapid processing of complex visual scenes by
our visual system (Thorpe et al., 1996). In general, humans can reliably
distinguish a natural scene containing target object(s) (such as animal(s))
from a scene that does not contain a target object in as little as 150 msec.
These two lines of study motivate us to explore whether
attention plays a critical role in such rapid natural scene categorization.
Main Results. We conducted psychophysical experiments on human
subjects. The following figure is a schematic illustration of one trial
of the main experiment, along with samples of target and distractor
images. An attentionally demanding letter discrimination task is presented
at the center of the visual field. After the central stimulus onset
asynchrony (SOA), the central stimulus (a combination of Ts and Ls) is
replaced by a perceptual mask (five Fs). Subjects are instructed to report
whether all five letters are the same or one of them is different. In the
peripheral natural scene categorization task, an image is presented
peripherally for a very brief time (27 msec). Subjects make a speeded
response to the presence of animals. Under the dual-task condition,
subjects are required to perform both tasks concurrently.

As illustrated by the following figure, our results show that there is
little or no attentional cost for subjects to perform the natural scene
categorization task when attention is withdrawn by the central letter
discrimination task. Here we show the normalized average performance of
every subject, each represented by one dot of each color. Red dots indicate
performance on trained images. Blue dots indicate performance on completely
novel images. The horizontal axis represents performance on the central
task (attentionally demanding). The vertical axis represents performance on
the peripheral task (natural scene categorization). For each subject,
single-task performance on the peripheral task and on the central task is
independently normalized to 100%. Our goal is to compare the dual-task
performances, represented by the red and blue dots. Clustering of the dots
at the (100,100) corner indicates that subjects can simultaneously perform
both the central letter task and the peripheral categorization task.
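
The normalization described above can be sketched in a few lines. This is only an illustration under stated assumptions: the text says that single-task performance is normalized to 100%, but it does not say what maps to 0, so the sketch assumes a linear rescaling with chance level (50% for a two-alternative task) mapped to 0. The function name `normalize` and all numbers are hypothetical and are not data from the study.

```python
def normalize(raw_pct, single_task_pct, chance_pct=50.0):
    """Linearly rescale percent correct so that chance maps to 0 and the
    subject's own single-task performance maps to 100.

    Assumption: chance_pct=50.0 (two-alternative task); the source text
    only states that single-task performance is normalized to 100%.
    """
    if single_task_pct <= chance_pct:
        raise ValueError("single-task performance must exceed chance")
    return 100.0 * (raw_pct - chance_pct) / (single_task_pct - chance_pct)


# Hypothetical percent-correct values for one subject (not from the study).
single = {"central": 80.0, "peripheral": 78.0}
dual = {"central": 79.0, "peripheral": 76.0}

# One dot in the plot: (normalized central, normalized peripheral).
point = tuple(normalize(dual[task], single[task])
              for task in ("central", "peripheral"))
print(point)
```

Under this assumed scaling, a subject whose dual-task scores match the single-task scores lands exactly at (100, 100), and a subject performing at chance on one task lands at 0 on that axis.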

Conclusion. We reported a study in which withdrawing attention entails
little or no cost to the performance of a complex visual task, namely
natural scene categorization. We also conducted a series of control
experiments that compared this result with performance on seemingly
simpler visual tasks using synthetic stimuli. Contrary to our intuition,
natural scene categorization appears to entail the least attentional
demand. These results strongly motivate future studies of attentional as
well as recognition models of the human visual system.