Can artificial agents make moral decisions? Do we want them to make moral decisions? When? For what? If so, whose ethics, whose values, are going to be brought in?

WENDELL WALLACH: Thank you very much, Joel. I'm thrilled to be here. I'm increasingly becoming a citizen of the world. It used to be that you didn't want to be on panels if there wasn't gender representation. Now I feel like I don't want to be on panels unless there is regional representation, particularly Asia and the Arab world, so that this whole subject matter isn't totally dominated by Americans and Europeans, which unfortunately it has been up to this point.

As your opening presenter, I know there are so many fascinating, nuanced issues and philosophical quandaries and policy considerations we could get into, and I hope that that's what's going to happen over the next two days. But I thought perhaps the best contribution I could make was to make sure that we're all talking the same language, that we all understand the various distinctions that have been coming to the fore. So please forgive me if I'm repeating material that most of you understand very well.

I decided in my 70th year that when I grew up I wanted to be a silo buster. That has become my main profession in this day and age: seeing if we can have true trans-disciplinary dialogues, and seeing if we can have those dialogues in a way that breaks through our silos and gets us understanding the differences when we start using terms, such as "utilitarian function" or "utilitarian principles," that are being used in totally different ways by people in artificial intelligence (AI) and ethics. Often they don't even understand that until they've been battling with each other for a few hours.

This subject area of moral artificial intelligence has become vast, with many subsets within it. So let's start out with one primary distinction. On one side, moral AI is about mitigating harmful AI, or mitigating the ways in which AI might undermine some of our greatest goals.

Just the day before yesterday I was talking at the UN in Geneva about the application of AI for good in the context of the Sustainable Development Goals. I trust all of you know that those are the 17 goals that together are hopefully a way of raising the lot of the billions of our less fortunate compatriots on this planet in a fundamental way.

I underscored how emerging technologies are a factor in nearly every single one of those goals, and though AI isn't at the center of the ways in which technology may be applied, it may actually amplify other technologies or bring efficiencies to their application.

But I also focused on SDGs 8 and 10. SDG 8 is about decent work and SDG 10 is about decreasing inequalities. I'm presuming everybody in this audience understands that there is no guarantee that AI is going to be beneficial in either of those regards, or that, if it is to be beneficial, its harms are going to have to be mitigated by other public policies that we put in place.

Furthermore, if AI undermines our ability to address Sustainable Development Goals 8 and 10, then, in turn, it will undermine our ability to respond to SDGs 1 and 2, which are about decreasing poverty and hunger. When we have these inequalities, it's clear who suffers; it's always the same segments of the human community who suffer.

So on one level we are talking about mitigating harms, sometimes from the perspective of public policy, but perhaps also from the perspective of how we implement artificial intelligence. In that regard, a number of strategies for that mitigation have come to the fore in the way we develop artificial intelligence.

One is called value-added design. Value-added design means that values are part of the design process for engineers. In some cases this includes making particular values explicit design specifications.

In that regard we have Jeroen van den Hoven of Delft University of Technology, who has been very important. He proposes, "Why don't we make privacy a design specification?" I have suggested that clarifying who will be the responsible agent if the system fails should also be a design specification. And others, including people like Rob Sparrow and others in this room, have underscored the ways in which additional values should be brought into the design process itself. So just as it is now a design specification that an artifact, a technological system, not overheat and be safe, it could also be a design specification that it fulfill certain ethical requirements.

For example, if we made the determination of the responsible agent up-front within the design, that may actually direct us to very different platforms that we build upon. Particularly for some corporations, it may direct them away from platforms where they might be the culpable agent. But that's not all bad, because that also may direct us away from platforms where the systems themselves are harmful, potentially uncontrollable, or potentially lacking in transparency, as we are now witnessing with emerging forms of artificial intelligence. I'll get back to that point later.
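To make this notion of values as explicit design specifications concrete, here is a minimal, purely illustrative sketch in Python. The spec names, the checklist structure, and the review step are all invented for illustration; they are not part of any established framework.

```python
# Hypothetical sketch: ethical requirements treated as explicit design
# specifications, checked alongside conventional engineering specs.
from dataclasses import dataclass, field

@dataclass
class DesignSpec:
    name: str
    satisfied: bool
    notes: str = ""

@dataclass
class SystemDesign:
    specs: list = field(default_factory=list)

    def unmet_specs(self):
        # A design review would flag any specification not yet satisfied.
        return [s for s in self.specs if not s.satisfied]

design = SystemDesign(specs=[
    DesignSpec("does not overheat", satisfied=True),
    DesignSpec("privacy preserved by default", satisfied=False,
               notes="privacy treated as an explicit design specification"),
    DesignSpec("responsible agent identified in case of failure", satisfied=False,
               notes="who is accountable if the system fails?"),
])

for spec in design.unmet_specs():
    print(f"Design review flag: '{spec.name}' not yet satisfied. {spec.notes}")
```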

Another area where we can move forward is whether we can imbue artificially intelligent agents with a sensitivity to value considerations, with a sensitivity to ethical considerations, and with a capacity to factor those ethical considerations into their choices and actions. Agents can be ethical in all kinds of ways. The philosopher Jim Moor from Dartmouth coined a language in which he talked about implicit and explicit ethical agents.

Implicit agents were those that had an ethical impact even though ethics was not necessarily their concern. His classic example was from the Arab world, where camel races once had human jockeys and now the jockeys are becoming robots. This has an impact, for example, on the livelihood and the safety of the jockeys who used to be human. This is just a small example, but you get the idea.

I'm emphasizing these words or terms largely because they have become well-entrenched in the conversations we are having about artificial intelligence.

Moor also used the term "explicit moral agents." Explicit moral agents are those that actually engage in moral decision-making.

As many of you are aware, some of my early claim to fame in this field is due to a book that Colin Allen and I co-authored, called Moral Machines: Teaching Robots Right from Wrong. That was an early attempt to map what was the new field of machine ethics and machine morality, the prospects or the possibilities that artificial intelligence and robotics could be sensitive to moral considerations and factor them into their choices and actions.

The subtext of the book was about how humans engage in moral decision-making. I think in a strange sense the subtext may have been even more important than the explicit book, because I believe we were the first people to think comprehensively about how humans make moral decisions. This was in a period when research in moral psychology was just beginning to flower, and we were able to bring that into play. We weren't just discussing the applications of moral theories for decision-making. We were also talking about the application of what we called "super-rational faculties" or faculties beyond the capacity to reason. That included emotions, theory of mind, empathy, appreciating the semantic content of information, being embodied in a world with other agents, being a social being and part of a socio-technical system—not just an autonomous or isolated entity within a social environment.

It's those kinds of considerations that I think actually made the book most fascinating to its early readers. It became a way that you could talk about ethics to people who would squirm if you talked to them directly about ethics. Many people, of course, think that ethicists are engaged in politics and are just trying to change their worldview, to get them to buy into the ethicist's own worldview.

This has been one of the really constructive things that has gone on in the discussion of ethics and artificial intelligence. Artificial intelligence has become this mirror confronting us, forcing us to think very deeply now about the ways in which we are similar to and the ways in which we may truly differ from the artificial entities we are creating.

This somewhat imperfect mirror has become a focus for truly fascinating dialogues, particularly with those of a more—what shall I say?—scientific mindset that embraces a simplistic attitude that we are all just digital machines and that everything we do can be reproduced, or simulated, computationally—despite the fact that we really don't fully understand how our brains operate right now, nor do we have the science to know whether we could actually realize human capabilities fully within artificial intelligence. Within the AI community it is more or less presumed that human capabilities can be reproduced computationally, and it is an uphill and deeply philosophical battle to determine whether that presumption is well-founded or whether it is based on very simplistic assumptions.

In Moral Machines we introduced a few questions and distinctions that have become quite important. The first question we asked was: Can artificial agents make moral decisions? Do we want them to make moral decisions? When? For what?

We then posed the age-old question: If so, whose ethics, whose values, are going to be brought in?

Our final question was whether we could actually computationally instantiate moral decision-making.

We made a couple of other distinctions up-front. One was what we called "operational morality," which is basically what we have today. The designers more or less know the contexts in which these systems will operate. That is becoming less true than it used to be, but they can often pre-discern the challenges and hard-program what the systems will do when they encounter those challenges. Today's systems commonly operate within boundedly moral contexts, basically closed contexts.

The difficulties come when the systems move into much more open contexts, or when their increasing autonomy and sophistication mean that even the designers can't predict what they will do, what action they will take. Under such circumstances, factors come into play that had not necessarily been discerned by the engineers and designers in advance. Such situations give rise to the more interesting subject of explicit moral decision-making, which we called "functional morality."
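As a rough illustration of this distinction, here is a minimal Python sketch, with entirely hypothetical situations and responses, of "operational morality" as a lookup of pre-programmed answers, and of where that approach runs out and explicit, in-the-moment reasoning would be needed:

```python
# Illustrative sketch only: hard-programmed responses to situations the
# designers anticipated in advance. All entries are hypothetical placeholders.
HARD_CODED_RESPONSES = {
    "obstacle_detected": "stop_and_wait",
    "low_battery": "return_to_dock",
    "patient_refuses_medication": "notify_caregiver",
}

def choose_action(situation: str) -> str:
    """Return the designers' pre-programmed response, if one exists."""
    if situation in HARD_CODED_RESPONSES:
        return HARD_CODED_RESPONSES[situation]
    # An unanticipated, open-context situation: operational morality has
    # nothing to say here. This is where "functional morality," explicit
    # moral reasoning by the system itself, would be required.
    raise NotImplementedError(f"No pre-programmed response for '{situation}'")

print(choose_action("low_battery"))            # a context the designers foresaw
# choose_action("unfamiliar_ethical_dilemma")  # would raise NotImplementedError
```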

Beyond functional morality is everybody's favorite topic: full moral agency, full human-level intelligence, beyond-human-level intelligence, the singularity, and super-intelligence. I try to play down these more speculative possibilities—not because I know any better than any of you when or whether they will be realized, but because we have a lot of thresholds to cross before then. I get a little frustrated when speculative possibilities dominate our conversation and we lose sight of the plethora of ethical considerations that are coming into play right now.

Simple ones, such as: If we are going to deploy robots to take care of the homebound and the elderly, what does the robot do when the homebound or elderly person rejects the medicines that it has been trained to bring three or four times a day? That's a major issue within medical ethics, for those of you who have a relationship to that subject area. On one level it's a simple problem, but it is not necessarily simple to fully work out what the robot should do.

Another is the application of trolley car problems to self-driving cars, an example which I basically dislike for a number of reasons that we might return to later. To make matters worse, when the MIT researchers put up a website where they gave people examples of trolley car-like problems with self-driving cars, they chose to call it "Moral Machine." They knew where they had gotten the term from, but this has created a side concern for me.

In addition to the distinctions between "operational morality," "functional morality," and "full moral agency," I want us to quickly cover two terms: "top-down" and "bottom-up." Both are approaches for dealing with the question of what role ethical theory should play in the development of moral agents.

Top-down is the implementation of an existing ethical framework—which could be the Ten Commandments, the Yamas and Niyamas, consequentialism, or a deontological framework, even Kant's categorical imperative. What everybody thinks of first in this context is Asimov's "Laws of Robotics," initially three and later four.

Bottom-up refers to the fact that most of us did not come into this world with moral sensitivity—if anything, we're all still trying to develop it. Bottom-up is a process of learning, of development. Some foundations for moral development may even have been laid through genetic processes, a possibility that is studied in evolutionary psychology.
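To give a flavor of what "top-down" means in practice, here is a minimal Python sketch of candidate actions being filtered through an explicit, pre-stated rule set, loosely inspired by Asimov-style laws (ignoring how those laws are prioritized against one another). The rules and action fields are invented for illustration; a bottom-up approach would instead try to learn such dispositions from experience.

```python
# Illustrative "top-down" filter: an action is permitted only if it violates
# none of the explicitly stated rules. All fields are hypothetical.
RULES = [
    ("do not harm a human", lambda a: not a.get("harms_human", False)),
    ("obey human instructions", lambda a: a.get("obeys_order", True)),
    ("protect your own existence", lambda a: not a.get("self_destructive", False)),
]

def top_down_permitted(action: dict) -> bool:
    """Permit an action only if every explicit rule is satisfied."""
    return all(check(action) for _, check in RULES)

print(top_down_permitted({"obeys_order": True}))   # True: no rule is violated
print(top_down_permitted({"harms_human": True}))   # False: violates the first rule
```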

In Moral Machines, Colin and I considered both top-down and bottom-up approaches to developing moral acumen. I'm not going to go into any of that in greater detail. Here I merely intend to give you a quick sense of the lay of the land.

Another piece that we brought in, which I've already alluded to, was super-rational faculties, faculties beyond reason. In Moral Machines, after looking at the implementation of top-down and bottom-up approaches, including virtue ethics—which arises in many different traditions, from Confucianism to Buddhism to Aristotle's Nicomachean Ethics, and which we saw as a hybrid approach that brings top-down and bottom-up together—we considered the role in moral decision-making of faculties that are usually taken for granted in moral discourse about human beings, but which can't be taken for granted when you are talking about robots. The robots will not have these super-rational capabilities unless we implement them. A fundamental question is whether one can implement each of them and get the same functional capabilities that we get within human beings.

Can you implement something like empathy? I'm concerned that a lot of robots out there are faking empathy, which may be helpful in social robotics for facilitating the way we interact with them, but robots should not be pretending to be empathetic when it's clear that they do not have any capacity to feel what the other person is feeling. That's a very dangerous road for us to go down.

That's the basic framework.

Now, what has happened in the actual development of artificial moral agents? Not very much. In fact, our book was so far ahead of its time that it's perhaps only now becoming contemporary.

Peter Asaro, Rob Sparrow, and Bill Casebeer, who are with us today, and a number of others have been constant contributors to the development of this new field of inquiry. But it has largely been a space of reflection for philosophers to lay out what the landscape is, and for a few computer theorists to think through initial projects, only a few of which have received substantial investment to date.

All of that has changed in the last few years with the advent of deep learning and other breakthroughs in artificial intelligence. These are truly significant breakthroughs, but they are perhaps overly hyped and have implied that we are much further along in the development of artificial intelligence than we actually are.

Nevertheless, every time there is a breakthrough in artificial intelligence, the super-intelligence flag gets waved and somebody declares, "That's going to solve all our problems," and the next person, whether it's Elon Musk or Stephen Hawking or whoever, warns, "Yeah, but it could go wrong."

As Stuart Russell likes to say, "We've been working for years to develop artificial intelligence, but we never really thought through what would happen if we succeeded." For that reason, Stuart Russell, who some of you may be aware co-wrote with Peter Norvig the textbook on artificial intelligence from which everybody learns about the field, started to become concerned. He's truly a moral man. He was concerned about what could go wrong.

Russell decided that the narrow focus on developing functional capabilities within artificial intelligence was potentially dangerous, and that engineers needed to change the trajectory of the research field so it focused more on safety. When one approaches bridge building, safety is an intrinsic design specification—you don't even have to state it. In the development of AI, longer-term safety considerations had not yet been given adequate attention.

He also declared that developers need to focus on what he called "value alignment." To put value alignment in context, one of the concerns that came up in the community that was particularly anxious about super-intelligence was that whatever kinds of controls or restraints we put into such a system, it is going to find ways of working around them. Not only that, once an AI system reaches human-level intelligence, there will be an intelligence explosion, because then you'll have robots and artificial intelligence working 24/7, getting smarter and smarter and smarter, and leaving us far behind. As Marvin Minsky, one of the fathers of AI, once said, perhaps they will treat us as the equivalent of house pets. Keep that in mind when you think about how you treat animals.

In any case, Stuart Russell came up with this term "value alignment." Those of you who know a little bit about the machine learning breakthrough understand that it is largely about the input of massive amounts of data into relatively simplistic neural networks—or at least networks that purport to capture some of what goes on in the brain—and, after processing by these simulated neuronal layers, there is an output of data. In the optimistic days, we thought we could input every kind of information and the output would be intelligent.

In Stuart Russell's first iteration of value alignment—which I heard as just another form of machine ethics or machine morality or computational morality, what we had already outlined many years earlier—he thought that the machines just had to observe human behavior and that they would develop their own intelligence. The idea was that if they could learn to align themselves with human values—this is the language coming out of the AI community, as opposed to the philosophical and computer-science community that had been working on this problem for 10 years—then the control problem for when systems have full human-level artificial intelligence would be solved.
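To illustrate the bottom-up intuition behind this idea of learning values from observed behavior, here is a deliberately crude Python sketch. The observations, the feature names, and the counting scheme are all invented for illustration; actual proposals for learning values from behavior are far more sophisticated.

```python
# Crude illustration: infer which features of outcomes humans seem to value
# by tallying the features of options they choose over options they reject.
from collections import Counter

# Each observation pairs the option a human chose with the one passed over,
# described by simple, hypothetical features.
observations = [
    ({"honest": 1, "profitable": 0}, {"honest": 0, "profitable": 1}),
    ({"honest": 1, "profitable": 1}, {"honest": 0, "profitable": 1}),
    ({"honest": 1, "profitable": 0}, {"honest": 0, "profitable": 0}),
]

weights = Counter()
for chosen, rejected in observations:
    for feature in set(chosen) | set(rejected):
        # Features that distinguish chosen from rejected options gain weight.
        weights[feature] += chosen.get(feature, 0) - rejected.get(feature, 0)

print(weights)  # Counter({'honest': 3, 'profitable': -1})
```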

I went up to Stuart immediately after hearing him use the term value alignment for the first time and said, "That sounds like a bottom-up approach to machine morality or machine ethics"—I forget which term I used at the moment. Any of you who know Stuart Russell know he is British, urbane, never out of control. He gave me a doe-eyed look, like somebody had shined a light in his eyes. He had no idea what I was talking about.

That led to our creating a project together called Control and Responsible Innovation in the Development of Artificial Intelligence. We received some of Elon Musk's money that he had given to the Future of Life Institute.

A series of three silo-busting workshops was convened, bringing together people who had created a variety of fields addressing different aspects of these concerns—everything from engineering ethics to resilience engineering—and of course including many of the leaders in the AI community. Most of the leaders from the differing fields had never met each other. Even though they were talking about roughly the same problems, they weren't necessarily talking about them in the same language.

That's where we're at right now. Luckily, we've moved along far enough that my workshops are no longer groundbreaking, because there are now a lot of AI conferences, like this one, that have a trans-disciplinary flavor.

The "value alignment" language is still the predominant language within the AI community. This year, for the first time the Association for the Advancement of Artificial Intelligence (AAAI), which is the leading AI professional association held a workshop on the subject in New Orleans before the yearly annual meeting. They asked researchers and philosophers and others to submit papers, and they had 160 papers of which they could only highlight 37 of them within the workshop itself. That gives you a little feeling for the explosion of what's happening in this subject area.

Let me finish up with one last thing, which is the effort on the other side of the coin—not just imbuing sensitivity within the systems themselves, but what kind of oversight we are hoping to put into place.

As mentioned earlier, I'm coming to you directly from a conference at the UN in Geneva called AI for Good. At that conference I introduced a new distinction between outwardly turning AI for good and inwardly turning AI for good.

Outwardly turning is where you are truly focusing upon explicit applications that can make a difference. One of my favorites, which I would tell you about in greater detail if I had more time, is a project in Africa where they are giving insurance to farmers by packaging it with the seeds and fertilizer they buy. The farmers register the insurance policy on their cellphones, and then the company monitors cloud patterns over Africa and uses machine learning algorithms to deduce whether there has been enough rainfall for the crops. If there is not enough rainfall, they immediately send a certificate, a coupon, through the phone so that the farmers can go back and get a free bag of seeds and/or fertilizer for the next season, solving a major problem that periodically afflicts poor farmers. Some 611,000 farmers are now involved in this program.
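The decision logic of such an index-insurance scheme is simple to sketch. Here is a minimal, purely illustrative Python version: the threshold, the rainfall estimator, and the messaging step are hypothetical placeholders, not details of the actual program.

```python
# Illustrative sketch of index-insurance settlement: an ML model would estimate
# seasonal rainfall from satellite cloud-cover data; if the estimate falls below
# a crop threshold, a voucher goes to the farmer's registered phone.
RAINFALL_THRESHOLD_MM = 300.0  # invented threshold for a viable growing season

def estimate_seasonal_rainfall_mm(cloud_observations: list) -> float:
    # Stand-in for the machine learning model that infers rainfall
    # from observed cloud patterns.
    return sum(cloud_observations)

def settle_policy(phone_number: str, cloud_observations: list) -> None:
    estimated = estimate_seasonal_rainfall_mm(cloud_observations)
    if estimated < RAINFALL_THRESHOLD_MM:
        # In the real program this step would send a coupon by phone for a
        # free bag of seeds and/or fertilizer for the next season.
        print(f"Voucher issued to {phone_number}: "
              f"estimated rainfall {estimated:.0f} mm is below threshold.")
    else:
        print(f"No payout for {phone_number}: estimated rainfall is adequate.")

settle_policy("+254700000000", [20.0, 15.5, 30.0])  # low rainfall: issues a voucher
```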

This is an example of outwardly turning AI for Good, as it addresses the use of the technology to solve a discrete problem. But I wanted to underscore the fact that mitigating harms caused by AI must be just as fundamental; otherwise the harms could quickly overwhelm the goods that can be realized. Mitigating harms and undesirable societal consequences is inwardly turning AI for Good.

Gary Marchant and I have been proposing a new model for the agile and comprehensive governance of AI, a model that we created originally for any emerging technology. We recognized that existing governance frameworks just do not work well in the context of emerging technologies. We should be less focused on law and regulations and bring other mechanisms into play that could be much more agile. I'm happy to talk with you further about that, perhaps during Q&A.

More recently, the World Economic Forum and others are adopting that framework as a way of getting people to think creatively about governance for a wide variety of areas from food security to governing the oceans. Gary and I felt that it was necessary to work through the problems of this kind of governance, and therefore proposed pilot projects, one in AI and robotics and the other in gene editing and synthetic biology.

When we first proposed this we were thinking in a national context, thinking that these ideas could help people creatively think about governance in any nation; and, if many countries started such initiatives, they could begin cooperating with each other. What we had proposed were coordinating committees that would function as multi-stakeholder forums, looking for gaps and finding ways of addressing what could go wrong.

More recently it became clear that if we put in place a U.S. pilot project, good luck getting Japan, the United Arab Emirates (UAE), Russia, or China to join in. Thus, it was important that this pilot begin as an international project. That's what I'm most focused on at the moment: whether we can convene a multi-stakeholder global governance congress for AI. I'm quite hopeful that we will be able to do this within a year from now.

But it's still not totally clear who may or may not host it. There are ongoing conversations with different countries and major players. The Institute of Electrical and Electronics Engineers (IEEE), the Partnership on AI, the United Nations, the World Economic Forum—they are all interested in governance of AI and will hopefully play a role in moving this project forward. We will see how that evolves.

The next stage in the ethical development of AI should be focusing on discrete applications for good that can be quickly deployed, focusing on how far we can get in implementing sensitivity to moral considerations so it can be factored into the choices and actions of artificial intelligence, and focusing on the comprehensive and agile oversight of developments in this field.

Hopefully I didn't take too much of your time, but thank you very much.
