Prof. Amit Sheth
Founding Director, Artificial Intelligence Institute and Professor at University of South Carolina
Title: From NLP to NLU: Why we need varied, comprehensive, and stratified knowledge, and how to use it for Neuro-symbolic AI
Abstract: “Data alone is not enough.” This was a section heading in Pedro Domingos’s seminal 2012 paper. I have been a believer in this for a long time. In our Semantic Search engine, commercialized in 2000 and also described in a patent, we complemented machine learning classifiers with a comprehensive WorldModel™, or knowledge bases (now referred to as knowledge graphs), for improved named entity and relationship extraction and semantic search. It was an early demonstration of the complementary nature of data-driven statistical learning (since replaced by neural networks) and knowledge-supported symbolic AI methods. In this talk, I want to address three important issues corresponding to the Why, What, and How of using knowledge in neuro-symbolic AI systems. While transformer-based models have achieved tremendous success in many NLP tasks, the pure data-driven approach comes up short when we need NLU, where knowledge is key to understanding the language, as required for explanation, safety, and support of decision-making processes that must be followed (e.g., in clinical diagnosis).
I will share the following observations regarding the indispensable role of knowledge.
(a) WHY: Data-driven AI has done reasonably well on focused, narrow tasks that do not require a deeper understanding of content, whether natural language or other modalities, such as classification, prediction, translation, and recommendation. For activities that rely on a deeper understanding of content and higher-level intelligence, such as abstraction and analogy, and for activities that involve making decisions and taking actions, it is necessary to involve the knowledge humans have gathered and codified.
(b) WHAT: The knowledge needed to support more demanding activities is multifaceted and comprehensive; it needs to cover multiple levels of abstraction. For example, humans utilize all of these types of knowledge to understand natural language: lexical, linguistic, common sense (including a sense of time), geographic knowledge and a sense of location, broad-based or world knowledge, domain/subject/task-specific knowledge, and more. Each of these types of knowledge is distinct in the way it is created, often through collective intelligence or collaborative activities, is subject to processes that endow it with the quality needed for given tasks, and requires appropriate representational richness to capture the semantics needed.
(c) HOW: Several ways to use knowledge to develop neuro-symbolic AI techniques have been proposed. I will describe knowledge-infusion strategies, ranging from shallow infusion using embedding techniques, which sacrifice rich semantics in knowledge representation for limited gain, to semi-deep and deep knowledge-infusion techniques that retain semantics (such as those captured in expressive knowledge representations and models) to enhance transformer models. The ultimate aim is to develop neuro-symbolic methods that address the limitations of large language models and serve the needs that data-driven techniques fail to support. Further details: Advancing Neuro-symbolic AI with Deep Knowledge-infused Learning
Bio: Prof. Amit Sheth (Home Page, LinkedIn) is an Educator, Researcher, and Entrepreneur. He is the founding director of the university-wide AI Institute at the University of South Carolina. He is a Fellow of IEEE, AAAI, AAAS, and ACM. He has (co-)founded four companies, including the first Semantic Search company in 1999, which pioneered technology similar to what is found today in Google Semantic Search and Knowledge Graph; ezDI, which developed knowledge-infused clinical NLP/NLU; and Cognovi Labs, at the intersection of emotion and AI. He is particularly proud of the success of his >45 Ph.D. advisees and postdocs in academia, industry research, and entrepreneurship.
Dr. Scott Wen-tau Yih
Research Scientist at Meta AI Research (previously known as FAIR)
Title: Efficient & Scalable NLP through Retrieval-Augmented Language Models
Abstract: While large-scale language models work incredibly well, it is expensive to train them, difficult to explain their predictions, and nearly impossible to keep them current over time. It is unclear when we can trust their predictions, and none of the current large language models can answer questions about current topics, such as COVID-19, since the corpora used for their training were created several years ago. To develop the next generation of general-purpose language models with smaller, simpler, and much more efficient models, we believe information retrieval is a key component. When interacting with each other and with the world, humans tap into many different forms of knowledge, including world knowledge (e.g., commonsense, updated world facts, trending news) and user knowledge (e.g., conversational memory, social interactions, additional context such as location). To incorporate this capability in AI applications, information retrieval provides models with access to (potentially large) collections of documents that can contain such knowledge. Specifically, we envision that the complete system consists of a small core model that can easily access additional, task-related knowledge via retrieval and performs comparably to the largest language models available today. In this talk, I will first give a research overview of retrieval-augmented language models. Then, I will share some of our recent work, including a general framework that improves any language model by adding a retrieval component, as well as a retrieval-augmented multimodal model that generates images and captions with better quality. Finally, I'll conclude the talk by discussing some of the lessons we learned and the problems we plan to address in the near future.
Bio: Scott Wen-tau Yih is a Research Scientist at Meta AI -- FAIR. His research interests include natural language processing, machine learning, and information retrieval. Before joining Meta, Yih was a Principal Research Scientist at the Allen Institute for Artificial Intelligence (AI2), working on scientific question answering. Prior to that, Yih spent 12 years at Microsoft Research, working on a variety of projects including email spam filtering, keyword extraction, and search & ad relevance. His recent work focuses on continuous representations and neural network models for question answering and document retrieval. Yih received the best paper award from CoNLL’11 and an outstanding paper award from ACL’15, and has served as program co-chair (CEAS’09, CoNLL’14, EMNLP’21) and action/associate editor (TACL, JAIR) in recent years. He is also a co-presenter for several popular tutorials on topics including Semantic Role Labeling (NAACL’06, AAAI’07), Deep Learning for NLP (SLT’14, NAACL’15, IJCAI’16), Question Answering with Knowledge Base, Web and Beyond (NAACL’16, SIGIR’16), NLP for Precision Medicine (ACL’17), and Open-domain Question Answering (ACL’20).
Prof. Jordan Boyd-Graber
Associate Professor at University of Maryland
Title: Raw Knowledge vs. Understanding: What Adversarial QA Reveals about the Limits of AI
Abstract: Humans' and computers' language abilities are complementary: computers can better memorize, while humans grasp nuance. Building the future of language-mediated interactions with computers requires balancing these abilities. In this talk, I'll discuss our work putting experienced writers in front of a retrieval-driven adversarial authoring system: question writing and fact-checking. For question answering, we develop a retrieval-based adversarial authoring platform and create incentives to get people to use our system in the first place, write interesting questions humans can answer, and challenge a QA system. While the best humans lose to computer QA systems on normal questions, computers struggle to answer our adversarial questions. We then turn to fact checking, creating a new game (Fool Me Twice) to solicit difficult-to-verify claims---that can be either true or false---and to test how difficult the claims are both for humans and computers. We argue that the focus on retrieval is important for knowledge-based adversarial examples because it highlights diverse information, prevents frustration in authors, and takes advantage of users' expertise. We will also tease our next human-computer question answering competition in Spring 2023.
Bio: Jordan Boyd-Graber is an associate professor in the University of Maryland's Computer Science Department, iSchool, UMIACS, and Language Science Center. Jordan's research focuses on applying machine learning and Bayesian probabilistic models to problems that help us better understand social interaction or the human cognitive process. He and his students have won "best of" awards at NIPS (2009, 2015), NAACL (2016), and CoNLL (2015), and Jordan won the British Computer Society's 2015 Karen Spärck Jones Award and a 2017 NSF CAREER award.
Prof. Chandan Reddy
Professor at Virginia Tech
Title: Deep Learning for Code Understanding and Generation: Challenges and Opportunities
Abstract: Recent advancements in machine learning have improved the understanding and generation of source code, leading to better performance in various software engineering tasks. Programming language models (PLMs) that are pre-trained on large-scale code repositories have shown promising results in tasks such as code summarization, code translation, and program synthesis. However, current approaches primarily rely on supervised fine-tuning objectives that are directly borrowed from the text generation literature and ignore code-specific features such as syntactic and functional correctness. In this talk, I will introduce various mechanisms for preserving the syntax and data flow of the generated code and then describe our new framework, PPOCoder, which combines pre-trained code PLMs with deep reinforcement learning and incorporates execution feedback as an external source of knowledge into the model optimization process. I will conclude this talk by discussing the CodeAttack framework, a simple yet effective black-box attack model for generating adversarial code samples that can detect vulnerabilities in code PLMs.
Bio: Chandan Reddy is a Professor in the Department of Computer Science at Virginia Tech. He received his Ph.D. from Cornell University and M.S. from Michigan State University. His primary research interests are Machine Learning and Natural Language Processing with applications to Healthcare, Software, Transportation, and E-commerce. His research has been funded by NSF, NIH, DOE, DOT, and various industries. He has published over 160 peer-reviewed articles in leading conferences and journals. He received several awards for his research work including the Best Application Paper Award at ACM SIGKDD conference in 2010, Best Poster Award at IEEE VAST conference in 2014, Best Student Paper Award at IEEE ICDM conference in 2016, and was a finalist of the INFORMS Franz Edelman Award Competition in 2011. He is serving on the editorial boards of ACM TKDD, ACM TIST, and IEEE Big Data journals. He is a senior member of the IEEE and a distinguished member of the ACM.