Caltech
Center for Neuromorphic Systems Engineering

Models of visual object categorization in humans
Robert J. Peters, Fabrizio Gabbiani, Christof Koch

Abstract. Previous studies of exemplar, prototype, and decision-bound models of visual object categorization have not resolved the importance of memory capacity and flexibility of decision surfaces in human categorization behavior. We have compared these previous models with our new roaming exemplar model (RXM), according to their abilities to match human observers' categorizations of various 2-D image contours. Unlike past comparisons among categorization models, we explicitly accounted for memory capacity by penalizing models for their number of free parameters with the Akaike information criterion. This revealed that a successful model of human categorization, such as the RXM, did not require a large memory capacity if the orientation of its decision boundary was unconstrained, suggesting that an efficient computer implementation of object categorization could also rely on limited memory storage.

Motivation. Object categorization is one of the primary tasks of the human visual system. Successful categorization of visual stimuli is a result of sensory processing and prior visual experience that is used for conscious cognition. Psychological models of categorization typically make categorization decisions using a mechanism based on a multidimensional representation of incoming stimuli, plus possible auxiliary representations, such as memory traces. This process is controlled by a number of free parameters, which are fitted with the goal of matching human categorization behavior. However, a simple statistical comparison between models may ignore important differences in the neurobiological implications of the models. For example, one highly successful model, the generalized context model (GCM), assumes that all training images are stored in memory; a literal interpretation of the GCM might conclude that the neuronal substrate of categorization also scales linearly with the number of exemplars in a category, or that categorization in biological systems involves only brute-force memorization, without any category-level abstraction. To provide a more detailed look at such issues, we have developed a new roaming exemplar model (RXM) that draws from neural networks and exemplar-based models of categorization. In contrast to previous exemplar-based models, the RXM's memory traces are free parameters, allowing us to control for memory capacity when judging a model's goodness-of-fit. Thus, using human categorization performance as the goal, we compare several computational models of categorization, providing new insights regarding the key qualities of a successful categorization model.

Research


We used three types of schematic, line-drawn visual stimuli (see figure above): Brunswik faces and tropical fish outlines, which have been used previously, plus a new set of "cartoon face" images. Each type of visual object was parameterized along four dimensions comprising the stimulus parameter space. Different groups of objects were assigned to configurations, which contained equal numbers of training exemplars assigned to each of two categories, as well as an additional number of testing exemplars. The training exemplars from the two categories were always chosen so as to be linearly separable in the objects' parameter space; that is, the members of the two categories could be separated by some 3-D hyperplane in the 4-D parameter space.

The categorization experiments consisted of a training phase and a testing phase. In both phases, subjects viewed a series of objects presented one at a time. Each object was presented for 2s, followed by 2s of blank screen. During each 4s trial, subjects pressed one of two buttons indicating to which category the object belonged. In the training phase, subjects were shown only the training exemplars from the two categories of objects, and were given feedback on their responses in the form of a high- or low-pitch tone indicating whether the response was correct or incorrect, respectively. Subjects performed training blocks of 100 trials until they scored at least 85% correct on a training block. Once subjects reached this criterion, they moved into the testing phase, in which they were shown the previously unseen testing exemplars, in addition to the training exemplars that they had viewed during the training phase. Subjects received no feedback on their responses during the testing phase.
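The training criterion described above (blocks of 100 trials, repeated until a block reaches at least 85% correct) can be illustrated with a short Python sketch. The function names and the representation of a block as a list of booleans are assumptions made for illustration; they are not taken from the original experiment code.

```python
# Minimal sketch of the training-to-criterion rule, assuming each block is
# recorded as a list of booleans (True = correct response). All names here
# are hypothetical.
BLOCK_SIZE = 100   # trials per training block (from the text)
CRITERION = 0.85   # minimum fraction correct to leave training (from the text)


def block_passes(correct_flags):
    """Return True if one training block meets the 85%-correct criterion."""
    if len(correct_flags) != BLOCK_SIZE:
        raise ValueError("a training block must contain exactly 100 trials")
    return sum(correct_flags) / BLOCK_SIZE >= CRITERION


def blocks_until_criterion(blocks):
    """Return the 1-based index of the first passing block, or None."""
    for index, block in enumerate(blocks, start=1):
        if block_passes(block):
            return index
    return None


# Illustrative data only: a subject who fails the first block and passes the second.
first_block = [True] * 70 + [False] * 30
second_block = [True] * 85 + [False] * 15
print(blocks_until_criterion([first_block, second_block]))
```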


We tested several categorization models (see figure above) by fitting them to match the human observers' response profiles from the testing phase of the categorization tasks. Each model receives input in the 4-D stimulus parameter space, and produces an output that represents a categorization probability for the input object. The models we tested can be summarized as follows:

Exemplar models compute the distance in feature space between a test exemplar and each of a set of stored exemplars. The test exemplar is classified into the category for which the sum of these distances is smallest. Different types of exemplar models have different ways of choosing the stored exemplars:

All-exemplar model, in which the set of stored exemplars is identical to the set of training exemplars; this model has the highest possible memory demand.

Prototype model, in which the one stored exemplar per category is the arithmetic mean of the training exemplars from that category; this model has a low and constant memory demand.

Roaming-exemplar model[n], in which each category has n stored exemplars, which must lie within the polygon that circumscribes the training exemplars (dotted lines). The number of stored exemplars can be chosen to control the memory demand of the model.

Boundary models learn a linear or quadratic boundary that separates the categories in feature space, and then classify new objects according to their distance from this boundary.
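The model families listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: Euclidean distance is assumed as the distance measure, the decision is made deterministic (the text does not specify how distances are mapped to a categorization probability), and all coordinates, weights, and function names are hypothetical. In a roaming-exemplar model the stored exemplars would be fitted to the human data rather than fixed as they are here.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two points (assumed metric, not from the source)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def summed_distance(stimulus, stored_exemplars):
    """Sum of distances from a test stimulus to one category's stored exemplars."""
    return sum(euclidean(stimulus, e) for e in stored_exemplars)


def classify(stimulus, memory):
    """Pick the category whose stored exemplars have the smallest summed distance."""
    return min(memory, key=lambda cat: summed_distance(stimulus, memory[cat]))


def prototype(exemplars):
    """Arithmetic mean of a category's training exemplars."""
    n = len(exemplars)
    return tuple(sum(column) / n for column in zip(*exemplars))


def signed_distance_to_hyperplane(stimulus, weights, bias):
    """Signed distance from a stimulus to the linear boundary w.x + b = 0."""
    norm = math.sqrt(sum(w * w for w in weights))
    return (sum(w * x for w, x in zip(weights, stimulus)) + bias) / norm


# Hypothetical 4-D training exemplars, equal numbers per category.
training = {
    "A": [(0.0, 0.0, 0.0, 0.0), (1.0, 0.0, 1.0, 0.0)],
    "B": [(4.0, 4.0, 4.0, 4.0), (5.0, 4.0, 5.0, 4.0)],
}

all_exemplar_memory = training  # every training exemplar is stored
prototype_memory = {cat: [prototype(ex)] for cat, ex in training.items()}

test_stimulus = (1.0, 1.0, 1.0, 1.0)
print(classify(test_stimulus, all_exemplar_memory))
print(classify(test_stimulus, prototype_memory))

# Hypothetical linear boundary; a negative distance is read here as category A.
print(signed_distance_to_hyperplane(test_stimulus, (1.0, 1.0, 1.0, 1.0), -10.0))
```

The same `classify` function serves all three exemplar-type models; only the contents of the memory dictionary differ, which is what makes memory demand easy to vary.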


We fitted subjects' categorization probabilities with versions of the roaming-exemplar model using 1, 2, 3, 6, and 10 stored exemplars, as well as the all-exemplar, prototype, and linear boundary models, and assessed these fits with two measures (see figure above):

1. The loglikelihood, which represents the overall fitting error but is not corrected for the number of free parameters, and
2. The Akaike information criterion (AIC), which includes a penalty for the number of free parameters, thereby allowing unbiased comparisons among models with different numbers of free parameters.
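To make the second measure concrete, the standard definition of the AIC is 2k - 2 ln(L), where k is the number of free parameters and ln(L) is the maximized log-likelihood; lower values are better. The sketch below applies that formula. The log-likelihood values are invented purely for illustration, and the parameter count for an RXM[n] (4 coordinates for each of n stored exemplars in each of 2 categories) is an assumption, since the text does not list the exact number of free parameters per model.

```python
def aic(log_likelihood, num_free_params):
    """Akaike information criterion: 2k - 2 ln(L). Lower is better."""
    return 2 * num_free_params - 2 * log_likelihood


def rxm_param_count(n_stored, n_dims=4, n_categories=2):
    """Assumed parameter count for RXM[n]: one coordinate per dimension,
    per stored exemplar, per category. The real models may have more."""
    return n_stored * n_dims * n_categories


# Hypothetical log-likelihoods (NOT results from the study): the larger model
# fits slightly better before the parameter penalty is applied.
hypothetical_fits = {
    1: -100.0,   # RXM[1]
    10: -95.0,   # RXM[10]
}

for n_stored, log_l in hypothetical_fits.items():
    k = rxm_param_count(n_stored)
    print(f"RXM[{n_stored}]: k={k}, logL={log_l}, AIC={aic(log_l, k)}")
```

With these made-up numbers the small improvement in log-likelihood does not offset the penalty for the extra parameters, which is the kind of trade-off the AIC is designed to expose.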

When the model fits were assessed with the loglikelihood (above left), we found that the all-exemplar and boundary models both obtained better (lower) scores than the prototype model. All of the roaming-exemplar models obtained better scores than the all-exemplar, boundary, and prototype models. In addition, there were large improvements in the fit of the RXM[n] as the number of stored exemplars increased.

In contrast, when the model fits were assessed with the AIC to account for their number of free parameters (above right), the RXM with one stored exemplar (RXM[1]) obtained a better (lower) score than all other models, including all-exemplar models, prototype models, boundary models, and versions of the RXM with more than one stored exemplar. Moreover, increasing the number of stored exemplars in the RXM[n] was detrimental to the AIC goodness of fit, so that the RXM[6] and RXM[10] fit much worse than any of the other models. A detailed analysis showed that the RXM[1], despite its low memory capacity, was able to outperform the other models because it had better flexibility in the shape and orientation of its decision surfaces.



In the RXM, the parameters that describe the stored exemplars become free parameters of the model, and can be incorporated into comparisons among models using statistical measures such as the Akaike information criterion. This allowed us to address the importance of memory by comparing different versions of the RXM with different numbers of stored exemplars. With this framework, we can now provide a better answer as to why models that are otherwise appealing in their conceptual simplicity, such as prototype models, are consistently outperformed by all-exemplar models: all-exemplar models allow better flexibility in matching the shape and orientation of decision surfaces to those used by human observers (see figure above). Our results show that the goodness-of-fit of all-exemplar models can even be improved by allowing "roaming" stored exemplars, and thus an unconstrained decision boundary, without committing to potentially unreasonable memory demands or to a lack of category-level abstraction. This is an important step toward the goal of developing models of object recognition that can perform as well as human observers, yet also be implemented in a computationally efficient manner: we have shown that such an implementation does not need an exorbitant memory capacity as long as it has sufficient flexibility in its learning algorithm.


