
Speaker
Sathyanarayanan (Sathya) Aakur
Abstract
Deep learning models for multimodal understanding have taken great strides in tasks such as event recognition, segmentation, and localization. However, there appears to be an implicit closed-world assumption in these approaches, i.e., they assume that all observed data is composed of a static, known set of objects (nouns), actions (verbs), and activities (noun+verb combinations) in 1:1 correspondence with the vocabulary from the training data. Under this assumption, one must account for every eventuality at training time to ensure reliable performance in real-world environments. In this talk, I will present our recent efforts to build open world understanding models that leverage the general-purpose knowledge embedded in large-scale knowledge bases to provide supervision, using a neuro-symbolic framework based on Grenander's Pattern Theory formalism. Then, I will discuss how this framework can be extended to abductive reasoning for commonsense natural language inference, in addition to commonsense reasoning for visual understanding. Finally, I will briefly present results from the bottom-up, neural side of open world event perception, which helps navigate clutter and provides cues for the abductive reasoning frameworks.
Bio
Sathyanarayanan (Sathya) Aakur has been an Assistant Professor in the Department of Computer Science at Oklahoma State University since 2019. He received his PhD in Computer Science and Engineering from the University of South Florida, Tampa, in 2019. His research interests include multimodal event understanding, commonsense reasoning for open world visual understanding, and deep learning applications for genomics. He has published in venues including CVPR, ECCV, and MICCAI, and is a recipient of the National Science Foundation CAREER award (2022).