Structural description of basic objects with features
Christoph Rasche
Abstract. We explore the representation of basic-level categories
using computer vision methods. The category representation is expressed
by lines, arcs and combinations thereof. In a bottom-up process we
extract such features; in a top-down process we try to match each category
representation against the bottom-up output.
Recognizing the structure of an object category involves two principal aspects: the
viewpoint aspect and the variability aspect (Palmer 1999). The viewpoint aspect
concerns the question of why we are able to recognize the same object
from different viewpoints. The variability aspect concerns the question
of why we can recognize an object despite the structural variability amongst
instances of the same category. Much recent object-recognition research
has focused on the viewpoint aspect (Bergevin, Levine 1993; Tarr, Bulthoff
1998). However, the variability aspect is important too, if not more
important: because there is structural variability amongst instances
of the same category, the category representation is likely
to need to be variable as well. For example, we expect to define relations
and feature measures as approximate ranges. A basic-object recognition
system thus needs to show that it can assign any instance of a category
to the corresponding, variable category structure. In other words, we
need to show that we can distinguish (several) instances of one category
from other object categories. To our knowledge, this has not yet been shown
or tested on grey-scale images.

We express categories by features such as lines, arcs and vertices. For example,
a bicycle is represented by two circles of approximate radius separated
by an approximate distance. A banana is represented by two arcs of large
radius. Recognition evolvement is divided into a bottom-up and a top-down
process (figure \ref{rec_ev}). The bottom-up process generates the features;
the top-down process tries to match each category representation against
the bottom-up output and decides whether the category is present in the
object image (similar to Lowe 1987).
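
The bicycle example can be made concrete with a minimal sketch of the top-down matching step. This is an illustration under stated assumptions, not the system described here: the `Circle` and `Range` types, the function `matches_bicycle`, and all numeric ranges are hypothetical, and the bottom-up stage is assumed to have already produced a list of circle features.

```python
from dataclasses import dataclass
from itertools import combinations
from math import hypot


@dataclass(frozen=True)
class Circle:
    """A circular feature, assumed to come from a bottom-up stage."""
    x: float
    y: float
    radius: float


@dataclass(frozen=True)
class Range:
    """An approximate range for a feature measure or a relation."""
    low: float
    high: float

    def contains(self, value: float) -> bool:
        return self.low <= value <= self.high


# Hypothetical bicycle description: two circles of similar radius whose
# centres lie a few radii apart. The numbers are illustrative assumptions.
RADIUS_RATIO = Range(0.8, 1.25)      # radius of one circle / radius of the other
DISTANCE_IN_RADII = Range(2.0, 4.0)  # centre distance / mean radius


def matches_bicycle(features: list[Circle]) -> bool:
    """Top-down step: accept if any pair of circles satisfies both ranges."""
    for a, b in combinations(features, 2):
        if a.radius <= 0 or b.radius <= 0:
            continue  # skip degenerate features
        mean_radius = (a.radius + b.radius) / 2
        distance = hypot(a.x - b.x, a.y - b.y)
        if (RADIUS_RATIO.contains(a.radius / b.radius)
                and DISTANCE_IN_RADII.contains(distance / mean_radius)):
            return True
    return False
```

Expressing the distance relative to the mean radius keeps this hypothetical description independent of image scale; whether the described system normalizes its measures in this way is not stated.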

Twenty-one categories were used, each containing about 10 images. Six
categories were described (see figure \ref{chair} for an example): their
correct-recognition rate ranges from 50 percent to 100 percent, and their
false-alarm rate (over all ca. 210 images) is 6 percent or less.
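
For reference, the two rates can be computed per category as in the sketch below. The function `recognition_rates` is hypothetical, and it assumes that false alarms are counted over the images that do not belong to the category; the exact denominator used for the figures above is not specified. The data in the usage note are made up for illustration and are not the reported results.

```python
def recognition_rates(true_labels, detected, category):
    """Return (correct-recognition rate, false-alarm rate) for one category.

    true_labels: the true category label of each image
    detected:    for each image, whether the category description matched
    Assumption: false alarms are normalized by the number of images that
    do NOT belong to the category.
    """
    pairs = list(zip(true_labels, detected))
    positives = sum(1 for label, _ in pairs if label == category)
    negatives = len(pairs) - positives
    hits = sum(1 for label, hit in pairs if label == category and hit)
    false_alarms = sum(1 for label, hit in pairs if label != category and hit)
    if positives == 0 or negatives == 0:
        raise ValueError("need images both inside and outside the category")
    return hits / positives, false_alarms / negatives
```

With 10 made-up "chair" images of which 7 are matched, and 200 other images of which 10 are matched, `recognition_rates` returns a correct-recognition rate of 0.7 and a false-alarm rate of 0.05.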

This simple recognition system demonstrates that a structural description
of objects with simple features [and not volumetric parts (Marr 81,
Biederman 87)] can account for basic-level category description. We
intend to explore a more detailed representation of basic-level categories
and to achieve a higher correct-recognition rate and a lower false-alarm
rate.