Reverse Engineering Visual Intelligence


The brain and cognitive sciences are hard at work on a great scientific quest: to reverse engineer the human mind and its intelligent behavior. Yet these fields are still in their infancy. Not surprisingly, forward engineering approaches that aim to emulate human intelligence (HI) in artificial systems (AI) are also still in their infancy. Yet the intelligence and cognitive flexibility apparent in human behavior are an existence proof that machines can be constructed to emulate and work alongside the human mind. I believe that these challenges of reverse engineering human intelligence will be solved by tightly combining the efforts of brain and cognitive scientists (hypothesis generation and data acquisition) with forward engineering that aims to emulate intelligent behavior (hypothesis instantiation and data prediction). As this approach discovers the correct neural network models, those models will not only encapsulate our understanding of complex brain systems but will also be the basis of next-generation computing and novel brain interfaces for therapeutic and augmentation goals (e.g., brain disorders).

In this session, I will focus on one aspect of human intelligence, visual object categorization and detection, and I will tell the story of how work in brain science, cognitive science, and computer science converged to create deep neural networks that can support such tasks. These networks not only reach human performance for many images, but their internal workings are modeled after, and largely explain and predict, the internal workings of the primate visual system. Yet the primate visual system (HI) still outperforms current-generation artificial deep neural networks (AI), and I will show some new clues that the brain and cognitive sciences can offer. These recent successes and related work suggest that the brain and cognitive sciences community is poised to embrace a powerful new research paradigm. More broadly, our species is at the beginning of its most important scientific quest, the quest to understand human intelligence, and I hope to motivate others to engage that frontier alongside us.


James DiCarlo is a Professor of Neuroscience and a faculty member of the Department of Brain and Cognitive Sciences at the Massachusetts Institute of Technology. His research goal is to reverse engineer the brain mechanisms that underlie human visual intelligence. He and his collaborators have revealed how population image transformations carried out by a deep stack of neocortical processing stages, called the primate ventral visual stream, are effortlessly able to extract object identity from visual images. His team uses a combination of large-scale neurophysiology, brain imaging, direct neural perturbation methods, and machine learning methods to build and test artificial neural network models of the ventral visual stream and its support of cognition and behavior. Such an engineering-based understanding is likely to lead to new artificial vision and artificial intelligence approaches, new brain-machine interfaces to restore or augment lost senses, and a new foundation to ameliorate disorders of the mind.