The Road to Safe and Trustworthy Artificial Intelligence

Tuesday, 21 February 2023, 9:00-14:00

Estonian Academy of Sciences, Kohtu 6, Tallinn Old Town

Registration

Register for free by February 17 by filling out this form.

Programme

Time            Event
09:00 - 09:25   Coffee
09:25 - 09:30   Welcoming remarks
09:30 - 10:30   Christoph Lütge - Creating safe AI: Why AI needs (business) ethics
10:30 - 11:00   Coffee
11:00 - 12:00   David Dalrymple - An Open Agency Architecture for Safe Transformative AI
12:00 - 12:30   Coffee
12:30 - ?       Panel Discussion

Invited Speakers

Christoph Lütge studied business informatics and philosophy. Since 2010, he has held the Chair in Business Ethics at the Technical University of Munich (TUM), and since 2019, he has also been the Director of the TUM Institute for Ethics in AI. Recent books include “An Introduction to Ethics in Robotics and AI” (with coauthors) and “Business Ethics: An Economically Informed Perspective” (with M. Uhl). He is a member of the Scientific Board of the European AI Ethics initiative AI4People as well as of the German Ethics Commission on Automated and Connected Driving.

Creating safe AI: Why AI needs (business) ethics, 09:30-10:30

The increasing presence of artificial intelligence in a variety of fields (e.g. health care, recruiting, mobility) is associated with moral and ethical questions and problems that are only just beginning to be explored. Issues include, for example, the emergence of deepfakes, the creation of responsibility gaps, implicit discrimination, and biased algorithms. This Safe AI workshop summarizes a few opportunities and challenges of the use of AI from a moral and societal perspective. Furthermore, AI ethics fundamentals such as key ethical principles and the EU AI Act are introduced to help companies govern the responsible and safe adoption of AI.

David A. Dalrymple (also known as “davidad”) has backgrounds in theoretical computer science, applied mathematics, software engineering, and neuroinformatics. In 2008 he was the youngest person to receive a graduate degree from MIT, and he went on to study biophysics at Harvard. David has also worked in machine learning and software performance engineering at major tech companies and startups alike. Amongst several other roles, he is currently a Research Fellow at the Future of Humanity Institute in Oxford, UK.

An Open Agency Architecture for Safe Transformative AI, 11:00-12:00

I will present a detailed sketch of a proposed framework for using increasingly powerful AI capabilities while avoiding failure modes such as goal misgeneralization and specification gaming. There are four main functional phases: understanding the problem (world modeling and requirements elicitation), devising solutions (training AI policies), examining solutions (human interpretation, review, and deliberation), and deploying a solution (without monolithic or unbounded optimization). For each of these phases, I will outline some preliminary ideas for how they might be realized in a relatively scalable way by combining state-of-the-art AI paradigms with human feedback and/or formal verification (depending on the phase). Interspersed in the talk, I will also propose avenues by which directions in applied category theory seem promising for unblocking some core bottlenecks in this research agenda.