Caltech
Center for Neuromorphic Systems Engineering


Structural description of basic objects with features
Christoph Rasche


Abstract. We explore the representation of basic-level categories using computer vision methods. Each category representation is expressed by lines, arcs and combinations thereof. In a bottom-up process we extract such features; in a top-down process we try to match each category representation against the bottom-up output.

Recognizing the structure of an object category involves two principal aspects: the viewpoint aspect and the variability aspect (Palmer 1999). The viewpoint aspect concerns the question of why we are able to recognize the same object from different viewpoints. The variability aspect concerns the question of why we can recognize an object despite the structural variability among instances of the same category. Much recent object recognition research has focused on the viewpoint aspect (Bergevin, Levine 1993; Tarr, Bülthoff 1998). However, the variability aspect is equally important, if not more so: because there is structural variability among instances of the same category, the category representation likely needs to be variable as well. For example, we expect to define relations and feature measures in approximate ranges. A basic-object recognition system thus needs to show that it can assign any instance of a category to the corresponding, variable category structure. In other words, we need to show that we can distinguish (several) instances of one category from other object categories. To our knowledge, this has not yet been demonstrated and tested on grey-scale images.


We express categories by features such as lines, arcs and vertices. For example, a bicycle is represented by two circles of approximate radius separated by an approximate distance. A banana is represented by two arcs of large radius. Recognition is divided into a bottom-up and a top-down process (figure \ref{rec_ev}). The bottom-up process generates the features; the top-down process tries to match each category representation against the bottom-up output and decides whether the category is present in the object image (similar to Lowe 1987).
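The bicycle example above can be illustrated with a minimal sketch of a top-down match against bottom-up output. It is not the system's actual implementation: the `Circle` and `Range` types, the function `matches_bicycle`, and all numeric ranges are hypothetical and chosen only to show how relations and feature measures might be defined in approximate ranges.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Circle:
    """A circular feature as a bottom-up stage might report it (hypothetical)."""
    x: float
    y: float
    radius: float  # assumed to be positive


@dataclass(frozen=True)
class Range:
    """An approximate range for a feature measure or a relation."""
    low: float
    high: float

    def contains(self, value: float) -> bool:
        return self.low <= value <= self.high


# Hypothetical bicycle description: two wheels of roughly equal radius
# whose centres lie a few radii apart. The numbers are illustrative only.
RADIUS_RATIO = Range(0.8, 1.25)
SEPARATION_IN_RADII = Range(2.0, 4.0)


def matches_bicycle(circles):
    """Top-down check: does any pair of circles fit the bicycle description?"""
    for i, a in enumerate(circles):
        for b in circles[i + 1:]:
            ratio = a.radius / b.radius
            mean_radius = (a.radius + b.radius) / 2
            # Express the centre distance in units of the mean radius so
            # that the description does not depend on image scale.
            separation = math.hypot(a.x - b.x, a.y - b.y) / mean_radius
            if RADIUS_RATIO.contains(ratio) and SEPARATION_IN_RADII.contains(separation):
                return True
    return False
```

For instance, `matches_bicycle([Circle(0, 0, 10), Circle(30, 0, 11)])` accepts the pair, whereas two strongly overlapping circles or two circles of very different size are rejected. Expressing the separation relative to the radius is a design choice of this sketch, not something stated in the abstract.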

We used 21 categories, each containing about 10 images. Six categories were described (see figure \ref{chair} for an example): their correct-recognition rate ranges from 50 percent to 100 percent, and their false-alarm rate (over all ca. 210 images) is 6 percent or less.
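The two rates reported above could be computed along the following lines. This is only a sketch under stated assumptions: the function name `recognition_rates` and the input format are hypothetical, and because the text does not specify the denominator of the false-alarm rate, the sketch assumes false alarms are counted over the images that do not belong to the category.

```python
def recognition_rates(results, category):
    """Return (correct-recognition rate, false-alarm rate) for one category.

    results is a list of (true_category, detected) pairs, one per image,
    where detected says whether the category description matched the image.
    Assumes at least one image inside and one image outside the category.
    """
    positives = sum(1 for true, _ in results if true == category)
    negatives = len(results) - positives
    hits = sum(1 for true, detected in results if true == category and detected)
    false_alarms = sum(1 for true, detected in results if true != category and detected)
    return hits / positives, false_alarms / negatives
```

With purely illustrative numbers, a description that matches 8 of 10 images of its own category and 6 of 200 images of other categories has a correct-recognition rate of 80 percent and a false-alarm rate of 3 percent.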

This simple recognition system demonstrates that a structural description of objects with simple features, rather than volumetric parts (Marr 81; Biederman 87), can account for basic-level category description. We intend to explore a more detailed representation of basic-level categories in order to achieve a higher correct-recognition rate and a lower false-alarm rate.

