- Daniel Jackson, “Software Design, Concepts and AI”
- Maurizio Lenzerini, “Conceptual Modeling and Knowledge Representation: a journey from Data Modeling to Knowledge Graphs”
- Walid S. Saba, “Reverse Engineering of Language at Scale: Towards Symbolic and Explainable Large Language Models”
- Catia Pesquita, “True or False? The impact of negative knowledge in biomedical artificial intelligence”
“Software Design, Concepts and AI”
Massachusetts Institute of Technology (MIT)
We’ve known since the 1970s how important conceptual models are in the design of software. If a system’s conceptual model is too complex to grasp, or isn’t faithfully projected in the user interface, usability suffers. Despite much progress in conceptual modeling, two central aspects have not been addressed. First, we’ve often assumed that the conceptual model is given—defined by the problem domain or by an existing mechanism—when in fact it is usually explicitly designed. Second, although many representations have been proposed, none of them separates out the individual concepts so that they can be analyzed and reused in a modular way.
In this talk, I’ll explain a new approach to software design that centers on the design of individual concepts, which are composed together to form a system. I’ll show how this allows usability problems to be diagnosed more effectively, stimulates new designs that work more effectively, and allows apps to be constructed with a more modular structure that has better separation of concerns and less coupling. I’ll also explain how LLMs can be used synergistically in concept design.
Daniel Jackson is a professor of computer science at MIT and associate director of CSAIL. For his research in software, he won the ACM SIGSOFT Impact Award and the ACM SIGSOFT Outstanding Research Award, and was made an ACM Fellow.
He is the lead designer of the Alloy modeling language and author of Software Abstractions. He chaired a National Academies study on software dependability and has collaborated on software projects with NASA on air-traffic control, with Massachusetts General Hospital on proton therapy, and with Toyota on autonomous cars.
His most recent book, The Essence of Software, offers a fresh approach to software design and shows how thinking about software in terms of concepts and their relationships can lead to more usable and effective software.
“Conceptual Modeling and Knowledge Representation: a journey from Data Modeling to Knowledge Graphs”
Department of Computer, Control, and Management Engineering of Sapienza University of Rome
While data constitute one of the most important components of an information system, many research efforts today focus on Machine Learning models and algorithms, with the properties of the data feeding those algorithms playing a secondary role. Shifting the attention back to data has therefore recently been proposed as one of the most timely topics in Data Analytics and Artificial Intelligence (AI) research, under the name of Data-Centric AI. Arguably, the field of Conceptual Modeling (CM), and in particular its connection to the area of Knowledge Representation and Reasoning (KRR), can provide important contributions towards shaping the research on Data-Centric AI. In this talk I will try to summarize the most important steps of the research done at the intersection of CM and KRR over the last decades, from the early work on Data Modeling and Semantic Networks to the investigation of ontologies and Knowledge Graphs.
Maurizio Lenzerini is a Professor of Data and Knowledge Management at the Department of Computer, Control, and Management Engineering of Sapienza University of Rome. His research interests lie at the intersection of Artificial Intelligence and Data Management, with emphasis on Conceptual Modeling, Knowledge Representation, Automated Reasoning, Knowledge Graphs, and Ontology-based Data Access and Integration. He is the author of more than 300 publications on these topics and has delivered around 40 invited talks. According to Google Scholar, he has an h-index of 82 and a total of 29,806 citations (March 2023). He is a member of the Academia Europaea – The European Academy and the recipient of two IBM Faculty Awards, the Peter Chen Award, and the ER (Entity-Relationship) Fellows Award. He is a Fellow of the Asia-Pacific Artificial Intelligence Association (AAIA), of EurAI (European Association for Artificial Intelligence), of the ACM (Association for Computing Machinery), and of AAAI (Association for the Advancement of Artificial Intelligence).
“Reverse Engineering of Language at Scale: Towards Symbolic and Explainable Large Language Models”
Walid S. Saba
Senior Principal Scientist, Institute for Experiential AI, Northeastern University
Scientific explanation proceeds in one of two directions: by following a top-down strategy or a bottom-up strategy. For a top-down strategy to work, however, one must start from a set of general principles, and this is certainly not the case when it comes to thought and how our minds externalize our thoughts in language. Lacking any general principles to start with, a bottom-up approach must be preferred in the process of discovering how language works. As such, we believe that the relative success of large language models (LLMs), which are essentially a bottom-up reverse engineering of language at scale, is not a verdict in the symbolic vs. subsymbolic debate but a reflection of (appropriately) adopting a bottom-up strategy. However, due to their subsymbolic nature, LLMs are not really models of language but statistical models of regularities found in language; whatever knowledge these models acquire about how language works will always be buried in billions of microfeatures (weights), none of which is meaningful on its own. Because they are incapable of maintaining the compositional structure of language, LLMs can never provide an explainable theory of how language works. To arrive at an explainable model of how language works, we argue in this talk that a bottom-up reverse engineering of language at scale must be done in a symbolic setting. Hints of how this should be done can be traced back to Frege, and it was subsequently argued for more explicitly by Sommers (1963), Hobbs (1985), and Saba (2007).
Walid Saba is a Senior Research Scientist at the Institute for Experiential AI at Northeastern University. Prior to joining the institute in 2023, he worked at two Silicon Valley startups, focusing on conversational AI. This work included high-level roles as the principal AI scientist for telecommunications company Astound and CTO of software company Klangoo, where he helped develop its state-of-the-art digital content semantic engine (Magnet).
Saba’s career to date has seen him hold various positions in both the private sector and academia, including at the American Institutes for Research, AT&T Bell Labs, IBM, and Cognos. He has also spent a cumulative seven years teaching computer science at the University of Ottawa, the New Jersey Institute of Technology (NJIT), the University of Windsor (a public research university in Ontario, Canada), and the American University of Beirut (AUB).
He has published over 45 technical articles, including an award-winning paper presented at the German Artificial Intelligence Conference (KI-2008). Walid received his BSc and MSc in Computer Science from the University of Windsor, and a Ph.D. in Computer Science from Carleton University in 1999.
“True or False? The impact of negative knowledge in biomedical artificial intelligence”
LASIGE, University of Lisbon
Most of our data is about positive facts: a patient has hypertension, the BRCA2 gene is related to breast cancer, Lisbon is the capital of Portugal.
In many applications, the assumption is made that everything that is not stated is false (the closed-world assumption). But in real-world and critical domains, such as biomedical research and healthcare, conflating what we don’t know with what is false carries a high risk: patients with unreported symptoms can be given the wrong diagnosis, and drugs with unknown interactions can be prescribed in tandem.
Knowledge graph-based machine learning applications are a prime example of this mismatch between algorithms that operate under the closed-world assumption and real datasets that are open-world.
In this talk, I will discuss the challenges faced by machine learning and artificial intelligence applications over knowledge graphs when the difference between a negative fact and an unknown fact is crucial. We will further explore what negative knowledge is, why it is important, how it can be harnessed, and what we are missing when we ignore it. The discussion will be supported by real use cases in biomedical research and healthcare.
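The distinction the abstract draws can be made concrete with a minimal, hypothetical sketch (the triples, entity names, and functions below are illustrative, not from the talk): under the closed-world assumption an absent fact is simply false, whereas an open-world reader keeps it unknown unless a negative has been explicitly asserted.

```python
# Toy knowledge graph: a set of (subject, predicate, object) triples.
# Only positive facts are recorded; nothing is stated about aspirin + warfarin.
kg = {
    ("patient_1", "has_symptom", "fever"),
    ("BRCA2", "associated_with", "breast_cancer"),
}

# Explicitly asserted negative knowledge (a "verified false" fact).
negatives = {("patient_1", "has_symptom", "rash")}

def closed_world_query(triple):
    """Closed-world assumption: anything not in the graph is False."""
    return triple in kg

def open_world_query(triple, negatives):
    """Open-world reading: distinguish asserted negatives from the merely unknown."""
    if triple in kg:
        return True
    if triple in negatives:
        return False
    return None  # unknown — not the same as false

q = ("drug_A", "interacts_with", "drug_B")
print(closed_world_query(q))           # False — risky: unknown is silently treated as false
print(open_world_query(q, negatives))  # None — unknown, and can be flagged for caution
```

The three-valued answer (`True` / `False` / `None`) is exactly the gap the talk points at: a machine-learning pipeline trained as if every absent edge were a negative example erases the `None` case.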
Catia Pesquita is an Associate Professor at the University of Lisbon, where she leads the Research Line of Excellence in Health and Biomedical Informatics at LASIGE.
She is deeply interested in how artificial intelligence can be harnessed for scientific discovery, particularly in the life and health sciences. She has made significant contributions in data analytics, data integration and machine learning with ontologies and knowledge graphs, being recognized as a Top 2% most cited researcher in Artificial Intelligence (according to Scopus data). She is an Associate Editor at BMC Bioinformatics and has held General Chair, Program Chair and Track Chair roles at ESWC and ISWC conferences.
Her multidisciplinary background has given her a unique perspective on the interplay between knowledge representation, artificial intelligence and the natural sciences, which inspires her research, teaching, and speaking roles.