EAGER grant to program computers to recognize, classify images

09/01/2011

As the volume of digital information grows, so does the need to understand and categorize that information automatically and quickly. Engineers in the Coordinated Science Laboratory are addressing the problem by teaching machines to recognize and classify objects as seamlessly as a human does.

Narendra Ahuja

Narendra Ahuja has received a one-year, $270,000 grant, “EAGER: Automated High Speed Object Category Modeling and Model Based Recognition, Segmentation, Clustering, and Classification,” from the National Science Foundation to improve high-performance category modeling.

“Humans are able to distinguish the difference between a cat and a tree, and also understand the context – such as where the cat is sitting – in the blink of an eye,” said Ahuja, who is a Donald Biggar Willett Professor of Electrical and Computer Engineering and researcher in the Beckman Institute. “We want to build this ability to understand and recognize objects in machines, but it’s a very complicated process.”

Researchers seek to deconstruct the problem by focusing on three primary areas: recognition, segmentation and clustering.

Recognition work will focus on creating robust algorithms that enable machines to recognize stationary objects, objects in motion and multiple objects interacting, even amid structural “noise” that changes the way the objects appear. For example, a computer must be able to tell the difference not only between a tree and a cat, but also between a cat and a dog, which are much more similar. In addition, researchers are working to program machines to recognize objects when they are moving, such as a dog chasing a cat.

“Movement actually contributes a wealth of information to the recognition process,” Ahuja said. “You can make a lot of inferences about an object by the way it moves. For example, a bird sitting in a tree is hard to notice until it starts to move.”

Researchers will also advance segmentation, which enables machines to make judgments about the location of an object within an image and to delineate the object from its background. This work is particularly important in surveillance applications, where it may be helpful to not only identify that an object may be a threat but also pinpoint its location.

Finally, Ahuja’s team is seeking breakthroughs in clustering. Clustering extends recognition work beyond identifying an object to building classes of objects, or taxonomies. Researchers will use a variety of criteria – color, shape, size, etc. – to group objects into hierarchies. In addition, the team will work to apply the method to the extraction of texture elements, a challenging problem whose solution, if integrated, would prove a valuable tool in extending the reach of recognition algorithms.

The research could yield numerous applications related to digital search engines, surveillance, video analytics, monitoring and data mining.

“The key is to do these tasks at very high speeds without compromising the quality,” Ahuja said. “If you do this linearly, trying out one object at a time, it would take forever. By creating an approach that uses taxonomies and other knowledge about inter-object relationships for high-speed computing, we hope to develop solutions that are not only smart but also are practically usable.”