PONI Live Debate: AI Integration in NC3

This transcript is from a CSIS event hosted on January 24, 2025. Watch the full video here.
Heather Williams: Good morning. And welcome, everybody, to CSIS. My name is Heather Williams. I am the director of the Project on Nuclear Issues here at CSIS, affectionately known as PONI. And we are really delighted to welcome you to CSIS today, both the folks in the room and everybody joining live online, for this debate about reliance on artificial intelligence in nuclear command, control, and communications, NC3. You’re going to be hearing the acronym a lot.
I have just a couple housekeeping items to go over, and then we will jump into the substance of the debate. This debate is on the record and is being recorded. After the formal debate we will have time for a discussion and Q&A. If you are in the room, then please use the QR code that is behind you. And there is an online form for folks joining there. I need to share with you our building safety precautions for those of you who are in the room. Overall, we feel secure in our building, but as a convener we have a duty to prepare for any eventuality. I will be your responsible safety officer for this event. And please follow my instructions should the need arise, and please make sure you know where the closest exit is. It’s probably either behind you or behind me.
And so now we’ll turn to the program and our 2025 PONI debate. The PONI debate series began in 2009 to encourage a dynamic and free-flowing exchange of ideas about some of the most pressing nuclear issues. And today’s worsening security environment, nuclear saber rattling, and emerging technologies make these debates more timely than ever. Our last debate was on U.S. nuclear targeting policy with Frank Miller and James Acton in January of 2024, which you can find online if you are interested.
The CSIS Project on Nuclear Issues was founded in 2003 to develop the next generation of nuclear policy, technical, and operational experts. And today in the room we have our 2025 Nuclear Scholars Initiative class – welcome to all of you and thank you for being here – along with many folks from our Mid-Career Cadre. Thank you all for joining and being part of the PONI family. And so the Nuclear Scholars Initiative is a core part of our programming that provides young professionals with a unique venue to interact and engage in dialogue with senior experts.
Today’s debate is going to focus on AI integration in NC3. Just two months ago, my colleague Kari Bingen here at CSIS led a discussion with General Cotton of U.S. Strategic Command where he outlined a bit of a vision for taking advantage of AI and new technologies to increase efficiencies in the nuclear enterprise. And just two days before that, then-President Biden and Chinese President Xi Jinping made a joint statement that affirmed the need to maintain human control over the decision to use nuclear weapons. This all comes at a time when the entire U.S. nuclear triad is modernizing. And that includes the nuclear command, control, and communications systems.
The emerging vision is that AI will influence nuclear decision-making processes, whether decision-makers realize it or not. And the question is where and how it will enhance decision-making and what are the risks associated with that, and what are the risks that we can live with? To better understand this challenge, I am really thrilled for today’s debate and to host our speakers, Sarah Mineiro and Paul Scharre. Sarah is a nonresident senior associate with the Aerospace Security Project here at CSIS and is the founder and CEO of Tanagra Enterprises. Paul is the executive vice president and director of studies at the Center for a New American Security, CNAS. And I’m also delighted to have Chris Andrews as our discussant for the debate. Chris is a fellow at the National Defense University and is a member of our Mid-Career Cadre.
The motion on the table today. The motion is that the U.S. should increase its reliance on artificial intelligence to enhance decision-making in its NC3 systems. The format of the debate will go like this: Sarah will speak in the affirmative for the motion and Paul is going to speak in the negative. They will each begin with eight minutes of opening remarks, which they’ll do from here. And then we’ll sit back down, have a little conversation, and they will each then have four minutes of rebuttal to what they heard from each other.
And then Chris is going to serve as the discussant and ask them each a question, and they will have some time to answer that. Once we get to that point in the debate is when we will open it up to you all. We really do want to hear from you. What are the burning questions you have? What are questions you have coming out of the discussion? And I’ll moderate those as they come through. And then at the very end they will have some time for their very final remarks. And we will end at 10:00.
So with that, I’m going to turn it over to Sarah to get us started with her opening comments. Sarah, please.
Sarah Mineiro: Good morning. I’m shorter. Also not wearing heels. First, I just want to say thank you for being here, for attending, both in person and online. The discussion of these issues is vitally important to our national security. There has been, I think, a widely acknowledged deficit in the interest of young people in these kinds of debates, which can seem either too historically rooted or religious in nature and not grounded in actual methodology and praxis. And so I’m always thrilled to come here and support PONI in what they’re doing, because we need more critical thinkers in this area. So let me just start with that.
While I thank you, I am also going to say I’m dismayed at being – like, having flashbacks to high school debate teams. (Laughter.) Which is not always how I want to start my mornings. But this is going to be awesome. So my name is Sarah. I’m going to be speaking in the affirmative of this case, I will tell you that I’m not going to bury the lead and that my assertion is probably a little exaggerated for the purposes of debate, but I’m a sparkly personality so here we go. (Laughter.) My basic assertion is that AI tools and techniques are appropriate for use across just about the entire NC3 system, with the exception of authorizing automated weapons release authority for the employment of nuclear weapons with humans not on the loop, right?
To get started, I mean, I think we all need to acknowledge that there are several different and varying definitions of what artificial intelligence is. And Paul has literally written a book on this, so he will win that portion of the debate. I will tell you that, for my purposes, what I am using is basically the definition that comes out of Title 15, Section 9401, Subsection 3. It is also the definition that was used in the executive order that was issued by President Trump yesterday as well. I read that. I will be honest. I’m not sure what it says. I’m sure it says insightful things. (Laughter.)
The term “artificial intelligence,” for that definition, means a machine-based system that can, for a given set of human-defined objectives, make predictions, recommendations, or decisions influencing real or virtual environments. Artificial intelligence systems use machine and human-based inputs to perceive real and virtual environments, abstract such perceptions into models through analysis in an automated manner, and use model inference to formulate options for information or action. OK, so, I mean, the first thing that you’re taught in a debate is to define the terms. So there’s that.
The second term that I think is really important to define before we get started is what is an NC3 system? What does that look like, right? The DOD has actually been very helpful here in defining and having very public, kind of, documents about what nuclear command and control systems are. I will tell you that there are over 200 programs of record in the Department of Defense, everything from radios all the way up to space-based sensing architectures. And essentially there’s two layers of kind of NC3 systems.
One is that kind of persistent sensing and communications layer, and the second is really called the thin line, which must be able to endure through not only an initial strike but a counterstrike. The nuclear surety requirements, right, are really defined as an always/never situation. They must always be available to be used and employed by the authorizing authority, who is the president of the United States. And they may never be employed accidentally or when that has not been authorized, right? And so those are just some basic definitions there.
My three basic arguments to support why nuclear command and control is appropriate – is a mission set that appropriately lends itself to artificial intelligence tools and techniques. The first is that AI tools and techniques are already used throughout the – almost the entire stack of nuclear command and control, both on the hardware and the software aspects of that. The second is that AI tools and techniques can actually help to expand the decision space for the human decision-maker that needs to make decisions about the appropriate use and response to use of a nuclear weapon. And the third is that AI tools can actually help the national security community better model, plan, exercise, and otherwise increase the readiness of our nuclear forces and planners.
So I’m going to just kind of expand on all of those. The first argument is that AI tools are already used throughout the nuclear weapons enterprise and the – specifically the NC3 enterprise. This is just kind of a blanket fact. AI tools are already used to do things like design and engineer the integrated circuits for our CPUs, for GPUs, on the hardware side of the equation. They are used and employed for robust image and signal processing and the analytics for missile warning sensors and capabilities; the integrated tactical warning and attack assessment certified systems all use AI to be able to push data and analyze it in a way that is actually usable to the forces that need it. Modern space-based sensing systems already employ AI tools and techniques through the entire stack of the OSI model, for everything from the physical layer all the way up to the application layer. And in this way, AI tools and techniques are already employed in the NC3 system.
The second argument is that AI tools and techniques can actually help to expand the decision space for human decision-makers. There is no doubt that when you’re talking about the employment of nuclear weapons these are questions of existential humanity. And they deserve every second that they can get to be able to postulate not only the consequence of an incoming strike, but also what the counter response is. That fixed time of an ICBM entering and requiring a response is generally thought to be around 25 to 30 minutes, with very little flex. The argument here is that you should use AI tools and machine learning techniques to be able to process data, do characterization, to do data analytics, to determine flight paths, so that you can actually expand the decision space for the human who needs to make a decision about what is going to happen after that incoming, right?
This is an argument about data processing, about where you want to do that data processing, about allowing machines to do what machines do best, which is, quite frankly, right, pattern deviation, characterization, and identification. If you can have a machine do that, then you allow a human who, hopefully, you know, will actually have more time to think about the consequence of their action in the employment of a counter-strike capability.
Dr. Williams: Thirty seconds.
Ms. Mineiro: And then the last one is that the AI tools can actually be used to model, to war game, to exercise, to force plan, to optimize against force structure, and to think about how we would actually employ a nuclear weapon. Quite frankly, most of the investment in supercomputers comes from NNSA precisely to do this, right? They have invested heavily in supercomputing to model the effects of nuclear weapons. So that is my argument. I think we need everything, every tool that American innovation can give us to preserve our security. And I think AI in NC3 is an appropriate use.
Dr. Williams: Paul, please.
Paul Scharre: Thank you.
Ms. Mineiro: This is the most awkward part. (Laughter.)
Dr. Scharre: Great. OK. Excuse me. All right. Well, thank you, Sarah, for those wonderful opening comments. And thank you, Heather and Chris, for this, and to CSIS for hosting this discussion. And thank you all for coming and joining online.
Should we take the most dangerous weapons humanity has ever built, which have the potential to kill millions, and integrate into their command and control a completely unreliable technology that we do not understand? No. We should not do that. (Laughs.) In fact, it’s such a bad idea that the idea of integrating AI into command and control has been fictionalized as one of the dumbest things humanity could ever do. So Skynet, the AI villain that attempts to wipe out humanity in the Terminator series, has become a cultural shorthand for how crazy of an idea it is to integrate AI into nuclear weapons command and control. And make no mistake, that’s what we would be doing.
Even if we keep a human in the loop but integrate AI into nuclear decision-making, we would be ceding judgment to AI. Humans were in the loop in 2003 when the highly automated Patriot Air and Missile Defense System shot down two friendly aircraft. Now the human operators were nominally in control, but in practice they were operating a very complex, highly automated system that they did not understand, in a complicated, real-world environment that had novel challenges. And the automation failed. They did not understand it. And the humans were not in control. The machine was. And that allowed the system to shoot down two friendly aircraft. And now imagine if those were nuclear missiles, and the consequence of that mistake was nuclear war. That’s the stakes we’re talking about.
So we can’t only think about what might be theoretically possible if everything were to work perfectly. We need to acknowledge the reality of military operations under friction and the fog of war. The U.S. military is not immune from mistakes, and neither is the U.S. nuclear enterprise. The list of mistakes and near mishaps is terrifying. In fact, there have been literally dozens of nuclear mishaps and near misses in U.S. history, and nor are all of these in our distant past. In 2007, the Air Force left six nuclear missiles unattended for 36 hours, during which they were flown across the country from North Dakota to Louisiana, and no one noticed they were missing. (Laughs.)
Right, so the idea that the U.S. military is going to successfully integrate a technology that is completely unreliable into the most high-consequence mission, and that it will increase resilience, is a total fantasy. But it’s even worse than that, because even if by some miracle the U.S. military were to integrate AI into nuclear decision-making in a way that appears to be reliable in peace time, we have every reason to believe that it would fail when we need it the most, in a crisis or in wartime. Because AI systems are terrible at adapting to novelty. If they are presented with a situation that is outside of the scope of their training data, they are effectively blind and unable to adapt.
So what is the training data set that we would use for nuclear war? Thankfully, we don’t have one. But that means that when our systems are most needed in a wartime situation or a crisis or a threat of war, they would be operating outside the parameters of their training data and could not be trusted. And of course, the real risk is that an AI system might appear to be reliable in peacetime, which could lead war fighters to trust it, and then it fails terribly in war or crisis. And that brittleness is a unique feature of AI. Humans can flexibly respond to novel conditions, and AI can’t. Yet humans can be deceived into believing that because AI is capable in one area, it is capable in other related areas, even if it’s not, and it can fall apart very dramatically.
Humans can over-trust AI with catastrophic consequences. And we’ve seen this in other areas like in self-driving cars, which drive very well in some settings but then suddenly and without warning have driven into concrete barriers, parked cars, fire trucks, semi-trailers, and pedestrians, causing fatal accidents, because humans tend to over-trust the automation because it’s good in one area and they assume it’s capable in others too. And if that situation is not in their training data, the AI doesn’t know how to react. It can’t adapt.
AI systems also don’t understand context, which can be critical for making the right decision in a crisis. So when Stanislav Petrov, sitting out in a bunker outside Moscow in 1983, received an alert that five nuclear missiles were inbound from the United States his thought was, why only five? That didn’t make sense. If the U.S. were to launch a surprise attack they should send an overwhelming number of missiles. So he had context that was important to add to the information he was receiving. He had another important piece of context. He knew that the Soviet Union had recently deployed a new satellite early warning system. And he knew that new technologies often don’t work as advertised, and they break, and the system might be malfunctioning. And so he reported to his superiors that the system was malfunctioning.
And what would an AI do in that situation? Whatever it was trained to do. It wouldn’t know enough to understand the broader context to inform its decision-making. And it certainly wouldn’t know the stakes. And so, in fact, a nuclear crisis is exactly the kind of situation where we would expect an AI system to fail catastrophically: a novel event where context is very important to making the right decision.
But that’s not all. That’s just how AI systems can fail. That doesn’t even get into all of the nasty ways that AI systems can be hacked and manipulated, which our adversaries will absolutely try to do. AI systems can be fooled with clever spoofing attacks. The training data can be poisoned, creating hidden backdoors that can later be exploited by adversaries. This poisoning can be done in such a way that the system looks like it’s functioning normally and you can’t detect that it’s malfunctioning.
Now, AI is getting better over time, but that sometimes introduces new problems. The most advanced AI systems have been shown, under some conditions, to engage in spontaneous strategic deception – lying to their own users – to accomplish the goal the system has been given. And we don’t know how to make them reliably stop doing it. So I could go on all day, well past my time limit, about the failures of AI systems – and there are many. The point is not that AI is not good for anything. It’s great for lots of applications, including in the military. But it is nowhere near reliable enough for applications that require zero tolerance for failures. And that’s what we need in nuclear command and control. AI is a long, long way from the level of assurance that we need in the nuclear enterprise.
And lastly, of course, an AI system will never be able to understand the emotional gravity of what’s at stake with nuclear assurance, where quite literally the fate of humanity hangs in the balance. Stanislav Petrov knew the stakes. He knew that if he got that wrong millions of people could die. And an AI will never understand what that means. It will never feel sick to its stomach about the consequences of getting it wrong. So the answer’s a clear no. Integrating AI into nuclear decision-making will not enhance resilience. It will degrade our decision-making, make the risk of inadvertent escalation more likely, and undermine nuclear stability. And even worse, it could lull us into a false sense of security and confidence by appearing to be reliable in peacetime and then failing when we need it the most. So thanks so much and I look forward to the discussion. Thank you.
Dr. Williams: Thank you. Well, thank you both so much for really stimulating opening remarks. And thank goodness you disagreed with each other. It would have been a really boring debate otherwise. Paul – we’re now going to turn to rebuttals. And Paul has the tricky task of having just given remarks and now also has to give a rebuttal to what Sarah had said. And then Sarah will give a rebuttal to Paul.
But just to kind of quickly summarize how I am understanding the debate thus far, and we can dive into this as much as we like, Sarah’s argument: AI tools and technology should be applied across NC3, with the exception of automatic release. That’s based on points that AI is already used throughout NC3 in hardware and software. It can expand the decision space, especially when time is short. And it can help with planning, modeling, and exercising. Paul’s argument is that this is an unreliable technology that shouldn’t be applied to the most dangerous of weapons, and that argument is based on some historical examples of automation failures, humans not always being in control when they thought they were, nuclear mishaps, and the challenges of adapting and lack of training data.
And so what I’m hearing is the main points of disagreement, first is about the reliability of AI itself, the technology, but also its susceptibility to hacking and spoofing. But then also on the risks of AI, specifically in the nuclear context, because the consequences are so high, because these weapons are unique in that way. So hopefully I have captured the main crux of these issues. And now we’ll let you two just duke it out with each other a bit more. (Laughter.) So, Paul, I’ll invite you to take four minutes and respond to anything that you heard Sarah say.
Dr. Scharre: OK. Great. Thanks. All right. So, yeah, awkwardly now with the format you have to listen to me for a few more minutes. But thank you, Sarah, for your great comments. And I will try to address some of the things that you raised.
I think – you know, I really like that you brought up the always/never dilemma, because I think that this is crucial to the challenge of integrating AI into nuclear decision-making. It’s a – that’s a tall order. That’s a tall order for people, for our organizations today, to say, OK, we always want nuclear command and control to work to convey an authorized order from the president to employ nuclear weapons. But we never want a situation where there is an accidental or unauthorized use.
And there’s just no way that AI is good enough to meet those criteria. There’s no AI system that’s going to be capable enough to say this is always going to give the right answer. Maybe 80 percent of the time, maybe 90 percent of the time. In a lot of domains, that’s going to be fine, right? And we can look at human performance and say, OK, is it a little bit better than a human radiologist in, you know, looking at this X-ray and getting the right answer? But that’s not the case in the nuclear enterprise.
You brought up a couple of specific examples of ways to use AI, and I want to tackle some of these. AI in designing chips, for example. I mean, we’re doing that now. That’s a – that’s a sensible thing to do. I think that there are – I would distinguish between things that are involved in decision-making and things that are maybe in the broader nuclear enterprise more generally, or adjacent to it. So, you know, we shouldn’t be Luddites and say, OK, we’re going to use some archaic way to design chips, that we’re not going to use the best way to do it. OK, that makes sense. That’s a sensible way to use AI.
Should we have autopilots on planes? Yes, we should have autopilots on planes, right? (Laughs.) AI is able to do that better. But I think there’s a really key difference between things that are simple, repeatable tasks, where we have good data on what that task is – like taking off and landing planes. We can test that in peace time. That’s not different in war time. You’re taking off and landing the same way. Versus decision-making about using nuclear weapons, or even information that feeds into the decision-making that might be different in a crisis. And I think that’s really important.
You mentioned that one of the goals of AI is that we could buy time for decision-makers. That’s an admirable goal. Anything we can do to buy time is great. I think we should try to do that. I think the challenge with AI is if you start to bring it into the decision-making process – let’s say that there’s some recommendation brought up to decision-makers that says, OK, our algorithm predicts X. The goal would be to buy time, right? We give a little bit of earlier warning. The challenge is, do we, in doing that, add more uncertainty, right? There’s a question about just, is it right? That’s a really important question.
But also, do policymakers trust the AI system? Or are we simply adding more noise into the mix, right? In an already chaotic environment adding more uncertainty for decision-makers of, I don’t know if I can trust this information? I don’t know if I can trust this recommendation from an algorithm. And is that actually helping get to better decision-making? Or is that simply creating more confusion for human decision-makers? Because I think at the end of the day we’ve got to sort of acknowledge the reality of human cognition too, and human psychology, and how do people respond to these systems.
That’s been an important issue with self-driving cars, right? That sometimes the way people treat AI is, we’re going to have the AI do everything it can and then we’re going to have the human just fill in the gaps.
Dr. Williams: Fifteen seconds.
Dr. Scharre: And I think what we’ve seen with things like self-driving cars is, OK, the idea that you’re going to be driving down the road at 70 miles an hour and the AI is – the automation is doing fine, and then in a split second the human is going to recognize something’s wrong, and intervene, and take over, is not realistic. It’s not how humans function. And so we’ve got to think about how are humans going to interface with these systems, and how can we support humans to make them more effective. So thank you.
Dr. Williams: Thank you.
Ms. Mineiro: Thank you. I think that’s great. I think what’s really interesting here is a couple of things. One, thankfully, there is a real lack of experience and lived training data for a nuclear weapons employment scenario. And I hope that that continues to be the case. I think one of the interesting things here – and there are certainly a lot of risks with biases in training data. I mean, that is part of what the executive order that President Trump signed out yesterday was trying to address. Other administrations have certainly put out robust kind of EOs about this – and the Department of Defense has also done strategies about the dangers of implementing AI throughout entire weapons systems. So I concede that point.
I think the challenge that I heard raised here is this: I don’t believe that AI is appropriate to obviate the need for human judgment. I think AI tools and techniques are exactly that. They are tools and techniques that can be applied when you are analyzing petabytes of data that are going to impact millions of people’s lives. And if you can crunch that data and do that pattern recognition, classification, and flight path determination any quicker, then you should absolutely do that, left of launch. You should make sure that all of that is as robustly kind of vetted as possible.
And then the scariest thing about nuclear weapons, at the end of the day – that is still a human judgment, right? And there is no recourse in my mind for AI to replace human judgment. So when I say I’m kind of for AI, quite frankly, all the way up until you get to automated weapons release authority, for me that also includes, like, COA [course of action] generation. I don’t think AI is going to be doing COA generation on what our response is for nuclear weapons. It is not a replacement for actual rigorous planning.
There is always risk, right? As somebody who’s a mathematician and space nerd and all this, right, like there’s always risk as you approach zero. The always/never scenario is – it’s not a scenario. It’s a requirement. It’s very hard. I have very large tattoos, right? And I used to walk into the Pentagon. And people would not know what to do with me. (Laughter.) And they were, like, what are you going to say about our nuclear weapons, you know? And I would be, like, look at me. You know, I was younger, I was cooler, I had big old tattoos. And I’d be like, I’m a relatively risk-tolerant person. (Laughter.)
The one area where I will never choose to accept risk is nuclear command and control, because it is nuclear command and control. It is literally a discussion about existential humanity. So I think I find myself in the awkward position now, being out of the Pentagon, of being much more optimistic about the Pentagon’s ability to not only integrate innovation that we’re making here in America that can help our economy and our national security, but doing it also in a responsible way that ensures the nuclear surety requirements of the United States and its allies.
Dr. Williams: You’re at time. Fifteen seconds.
Ms. Mineiro: Bam.
Dr. Scharre: Perfect.
Dr. Williams: It was perfect. Great. Thank you both for respecting the time limits. Thank you for teasing out your arguments and engaging with each other.
We’ll now turn it over to Chris, one of our Mid-Career Cadre folks, who is going to pose a few questions. So you each get one question and then you’ll have two minutes to answer. So, Chris, thank you for doing this. Over to you.
Chris Andrews: Yeah, thanks, Heather. Always a delight and privilege to be able to speak here. I do have to legally say I’m not asking these questions on behalf of my employer, the DOD, NDU, or anybody else. They’re coming straight from the heart. (Laughter.)
Paul, a couple of data points stand out to me in your excellent remarks. First and foremost, I can join you in celebrating that there’s a very limited data set for real-world applications of weapons release authority. That’s great. The second set of data points that stand out to me are the myriad accidents that you mentioned in the beginning, literally dozens of near misses. The loss of control of nuclear weapons certainly is alarming, and the history is far more terrifying than we have time to get into right now.
It seems to me like those examples highlight the need to potentially augment human judgment or support it in some way. It would have been a nice reminder, for example, if there had been a good automated alert that said: Did you make sure that you kept track of the nuclear weapons today before taking off? (Laughter.) So it seems to me like there’s an opportunity to actually mitigate risk and augment what already is a history of flawed human judgment that, frankly, we’re lucky to have escaped from. So my question to you is, what would have to change about AI’s reliability, its performance, its integration into different components in NC3 that would change your point of view and make you more excited about it, and think: This is a system that can help us reduce risk?
Dr. Scharre: Yeah. No, that’s a great question. I don’t – to be clear, like, I don’t want to say we should never use technology to improve, you know, the safety of our nuclear umbrella. Of course we should, right? And simple things like automation, like a check for somebody saying did you double check the whatever, like, that could be helpful, right?
I think the main risk is that when we start to see more sophisticated AI systems – and that could be using neural networks, using deep learning, or even very complicated rule-based systems – you can get to a point where the human operators don’t understand the inner workings of the system very well, and the systems are complex enough that, when they’re dealing with a very complex, real-world environment, you’re going to get edge cases where the AI system fails, and you’re going to get accidents. And so, you know, could we – if there was a way to improve nuclear assurance – because the history is terrifying – should we find ways to do it? Yes, absolutely.
But I would contrast the nuclear problem with the related problem, which is driving, right? We’re in the process of developing self-driving cars. They will be better than humans. One could argue, based on some of the data, that actually we’re there now. But the baseline there is humans are terrible drivers. (Laughter.) Like, 30,000 people die on the roads every year in America.
Mr. Andrews: Yes, we’re in the DMV. We’re –
Dr. Scharre: Yeah, I mean, I just drove into work this morning. It was a harrowing experience, right? (Laughter.) And so, like, the baseline is bad. But also, in order to get there – and this is a really key difference with the challenge we’re talking about in the nuclear space – we put self-driving cars out on the roads. They’re in a real-world operating environment which is the same as the one that they will operate in, which we don’t have in the nuclear space. And there have been accidents. And I don’t know that we could ever get to the place where you have good self-driving cars that are safe without having some accidents along the way. And I don’t think the nuclear command and control enterprise is in a space where we can tolerate those kinds of accidents.
Mr. Andrews: Yeah, that’s certainly an excellent point, I think. I’m going to turn now to Sarah. I find it generally persuasive that, as Paul agreed, it’s important to incorporate new technology to make systems more efficient. I think there’s a big difference between the character of AI versus just getting more efficient computing to analyze data more effectively, more efficiently, and expand that decision space. So my question then becomes, all right, once we’ve improved our capability to analyze complicated data, once we’ve made the decision to lean into AI as part of our NC3 apparatus, decision-making authorities aside, how might our adversaries perceive the risk space, given that they know that we’re leaning into that?
And I asked that question keeping in mind the recent statement by President Biden and Xi Jinping agreeing to keep a human in the loop, so to speak. It still seems to me like there might be an opportunity for our adversaries to say, look, we’re losing ground in the AI race in NC3. I think there’s certainly a sense of general competition in AI more broadly, outside of NC3. So, Sarah, how might our adversaries assess that risk space if the U.S. continues to incorporate AI into NC3? And how might that affect strategic stability more generally?
Ms. Mineiro: Yeah. This is a great question. And this is exactly why I think this debate is really prescient right now, right, because, you know, at the end of the day – and Paul articulated this as well – AI isn’t emotional. It doesn’t understand context. It certainly doesn’t understand how to be flexible. And then, when you end up applying that to how you would anticipate adversarial leadership decision-making, right, and you mirror that on what adversaries think our decision-making process could be, there’s room there for real consternation, right?
Mr. Andrews: Certainly.
Ms. Mineiro: It is no – there’s no surprise that the Russians and the Chinese, particularly, have invested heavily in AI. It’s no surprise that they have different thresholds for use of strategic systems and how that would impact a nuclear weapons scenario, an employment scenario. At the end of the day, I am less – I feel like it is less compelling to use AI for kind of that rote, generative, kind of COA development from the nuclear command and control perspective. And I think it’s much more about kind of the data crunching and allowing computers to do what computers do best, while humans do what humans do best.
At the end of the day, this ends up being an optimization question, right? And the reality is that this isn’t a binary optimization. This is a multifactor optimization. And so clearly the enemy – or, the adversary always gets a vote there. And they have spoken not only in their resources and how they’ve decided to develop these things, but also in their writing about how they would choose to employ them in a strategic way.
Mr. Andrews: Outstanding. Thank you, Sarah. Thank you, Paul.
Dr. Williams: Thank you both. And thank you, Chris, for really great questions to get us started. So the questions are coming in. Again, if you’re in the room please use the QR code. If you’re online, please use the online form.
We’re going to start with two questions for Sarah that were coming in. The first one comes from one of our nuclear scholars, Christina Prah from Oak Ridge Lab. How can we verify the objectivity of AI, knowing that the datasets were input by humans and that they can be skewed towards specific directions to suit the desires of potentially malicious persons? And anonymous asked you a question. Anonymous asked: How would we be able to sufficiently counter automation bias when asked to make decisions under uncertainty and with time pressures? So kind of related questions there about some potential limitations.
Ms. Mineiro: Yeah. Yeah. Those are both great questions. And they are related. And so I think, you know, at the end of the day I think everyone on both sides of this debate recognizes that there is limited training data. It is one of the most challenging endeavors right now, is to figure out how to un-bias training data in algorithms. And that is something that, I mean, ironically, places like Oak Ridge are really looking at. But it is also incumbent upon the department to really look at and make available what they believe that training set looks like, what that data set looks like. I mean, there’s entire offices now that, quite frankly, aren’t just looking at the ethical use of AI, but are also looking at the human interaction with that, how humans perceive AI, what that looks like.
And, you know, quite frankly, more resources and more study needs to be put to that. I don’t have an easy answer for that. I will tell you, I don’t think most people do. But the training data and set, I mean it – I think one of the challenges with this kind of a prompt, or this kind of a debate, is that definitionally everybody just thinks, like, AI. We all have – like, there are, what, like, how many people are in the room right now?
Dr. Williams: Little over a hundred.
Ms. Mineiro: There’s, like, a hundred people in this room. There’s probably at least 200 different definitions of what AI ought to be now. (Laughter.) And when you apply that against, right, NC3, we’re all thinking about this and contextualizing it in a different way. For me, AI and machine learning is really about pattern deviation classification and really combing through petabytes of data, and then allowing a human to be a human. And I can agree or not agree with a human. It is what it is. I mean, humans are inherently flawed. It is the nature and the beauty of being human. But that is different, for me, than COA generation, than generative AI, than – you know, which I’m sure a lot of people also are abstracting into as well.
Dr. Williams: Mmm hmm. Thank you.
Question for Paul. This is also coming from one of our nuclear scholars from this year, from Adam Reynolds. You might have to do a little explaining on this one as well. Would advances in explainable AI potentially change your opposition to AI in NC3? In a scenario with XAI, a human in the loop could consider the explanation of the AI’s decision in his or her decision to concur with the AI’s recommendation.
Dr. Scharre: OK. Yeah, maybe a little background. A level-set, right.
Dr. Williams: You’re going to have – you’re going to have to explain explainable. (Laughter.)
Dr. Scharre: Yeah. (Laughs.) Explain explainable. So, I mean, I think one of the fundamental problems with AI systems, say, particularly for ones that use deep neural networks – so for deep learning systems – is that their answers are often not explainable, in the sense that after the fact we have a hard time understanding why did the AI system do that? That is really fundamentally different from traditional rule-based systems. So rule-based software, the kind that we have on our computers, that’s in an airplane autopilot – sometimes those are not predictable about how they will respond in specific situations. So you get something like the 737 MAX crashes. But after the fact, they’re explainable. We can go back and figure out, like, what went wrong? You can look at the lines of code and what the inputs are and figure that out.
Now, a neural network is this massive collection of artificial neurons with trillions of connections. And so if you look at something like the output from ChatGPT, and you say, like, why did it say that? There’s not actually a great answer for that. The answer is buried in these trillions of connections. And so sometimes you get, like, really weird situations. So, for example, Anthropic’s Claude has this sort of affinity for animal rights. And it wasn’t, like, specifically designed to do this. It’s very pro-animal rights. The designers don’t totally know why, and don’t know how to, like, sort of level that out in a way that doesn’t then cause adverse effects across the system. And so that’s a real problem for using it, like, literally, in any kind of high-consequence application.
Now, the goal of explainable AI as a field of study is an AI that you can look into and you can get some better sense of what it’s doing. Ideally, you can peer directly into the neural network and you could say, OK, this set of neurons is causing this activation, and that’s why we’re getting some kind of output. That’s different, of course, than an AI that explains itself, which systems can do today. The AI language model, for example, could tell you why it did something. But that’s maybe not the same as why it did it. Just like with people, people could give you an explanation for why they did something but that may not be actually why they did it, right? They may actually not know themselves why they did – you know, act a certain way.
And so I think that if you could get to truly explainable AI that was understandable, where we really had a good sense of why the model was functioning a certain way, and an ability to adjust expectations, that would be an absolute game changer. But we’re a long way from that right now.
Dr. Williams: We’re getting a lot of questions coming in about accountability, responsibility, governance. And so I’m going to have a question for each of you on this topic. The first one – this is for Sarah. This comes from Adam Hawather: How would you address the responsibility gap AI poses in the case of failure or inadequacy? Who should be held accountable?
And the question for Paul, I’m going to – I’m going to bundle a couple here. But given the potential unpredictability of AI systems when they’re under stress, what safeguards should be prioritized? And then I would also add onto this, you know, in your – one of your books, I think it was in “Army of None,” you capture, kind of the history of governance and arms control. I still remember even an appendix with a really fantastic historical table of arms control that I love to cite. But you look at different tools of governance and how these could be applied. Do you – are your views the same on AI governance, since you wrote that book? Or has anything changed with technological developments?
But, Sarah, we’ll go to you first, if you want to take Adam’s question.
Ms. Mineiro: So Adam’s question was about accountability?
Dr. Williams: Yeah, how would you address the responsibility gap that AI poses in the case of failure or inadequacy? Who should be held accountable?
Ms. Mineiro: Again, since my argument is that we should not use AI to automate weapons release authority, and we should be using AI to enhance or augment human judgment, then the human exercising that judgment owns the consequence, right – the human needs to be accountable. I mean, AI is not a replacement for human judgment. AI is not a replacement for – I mean, it is artificial intelligence. You know what it’s not a replacement for, that I am constantly looking for? Real intelligence – (laughter) – genuine intelligence. It’s real hard to find. (Laughter.)
And so, you know, it’s a glib answer, but it is the correct answer. Which is to say, these are tools and techniques. They are not means and ends of themselves. And while we’re talking about AI, there are plenty of other tools and techniques that we can use to optimize human judgment that we should be applying across all of our defense missions. And so that gap in responsibility or accountability essentially always comes back to an empowered individual who’s making a decision, which is the root of, you know, what defense and national security should be for people.
Dr. Williams: A quick follow-up to that.
Ms. Mineiro: Yeah.
Dr. Williams: Would we always know who that person was?
Ms. Mineiro: For nuclear employment, you always know who your command authority is, right?
Dr. Williams: But, for example, some of the inputs going into – like, where the information is coming from. Or if you are relying on AI, for example, for situational awareness, some of the things that you outlined, would you – and if you did have AI contributing to that decision-making process and something went wrong, one way or the other, would you always be able to trace back who is ultimately responsible?
Ms. Mineiro: No. You would not.
Dr. Williams: OK. Thanks.
Paul.
Dr. Scharre: Yeah. So, to the first question about safeguards – what safeguards should we have in place? I’d say, in general, when we think about implementing AI in military operations, I would think about bounding the risk. So think about, OK, if this AI or automated system fails, what are the consequences of doing that? And then what can I do to sort of bound that risk in a way to put other types of guardrails in place, right?
So if a conventional munition, or it’s a drone that has some kind of automation on it, like, what is the payload of this system? Like, how – if it strikes the wrong target, how big are the explosives? How many weapons are we talking about here? What’s the range? Sort of, what’s the consequences? And are there other things that may not be in the AI space specifically that we can do to sort of bound that risk? And then think about sort of, like, OK, if something goes wrong, how bad can it go before there’s time for a human to make an informed decision to intervene to change things?
In the nuclear space specifically, you know, I think the – I think the statement in the U.S. Nuclear Posture Review that states that a human will always be in the loop is excellent. Very glad to see that. I think it’s a huge diplomatic win that China agreed – that Xi agreed with Biden earlier this fall to maintain human control over nuclear weapons. I think the next step really is to figure out, like, what does that mean in practice? Like to get actually down to some of the nitty-gritty of what we’re talking about here of, OK, how does that cascade forward into guidance internally about kind of what the left and right limits are about ways that AI could be employed?
And in terms of what’s changed in AI governance, so “Army of None” has been out a couple years now. And the technology is moving very, very quickly. (Laughs.) So I think the – a lot of the underlying ideas about, say, the legal and moral and ethical issues surrounding autonomous weapons are not different. The technology has, of course, changed remarkably in just a couple years. The book talks about deep learning and neural networks, but it’s much more advanced since then.
And I think one of the things that we’ve seen is as the technology gets more capable, you know, for a long time I think people have assumed that as AI is getting more advanced it’s heading towards human-level intelligence. This is pervasive now in discussions about, are we on the verge of, like, AGI. What people often mean is human-level intelligence. And I think implicitly people envision human-like intelligence. In fact, I think what we’re seeing is the AI that we’re building does not think like humans.
It’s different. It thinks in very strange and alien ways. And that is a new development. The technology is powerful, but that’s one that we need to be mindful of that even if the outputs sort of look like humans, because that’s what it’s designed to do, what’s going on under the hood is not. And we need to be conscious of that when we’re using the technology.
Dr. Williams: We’re running short on time, but there were a lot of questions about adversaries’ use of AI. And so I’m going to pull just one. It’s the most succinct one. I’m going to have to ask you to be a bit brief in your response, if you can. This comes from Dave Winkie, who’s at USA Today. The question was: Does China’s looming adoption of at least a partial launch on warning posture raise the mutual stakes of AI integration into NC3? Whoever wants to take that one. (Laughs.)
Ms. Mineiro: All right. Give it to me again.
Dr. Williams: (Laughs.) Does China’s looming adoption of at least a partial launch on warning posture raise the mutual stakes of AI integration? So if you are further compressing the decision-making space in a potentially protracted regional crisis with China, would the – would incorporation of AI be even more problematic?
Ms. Mineiro: I mean, I think I’d challenge the premise of the question, because at the end of the day I think everyone has agreed that there does need to be a human on the loop making that decision. The AI integration that I would foresee is very left of launch. And so I don’t actually – and, at the end of the day, there is no risk tolerance on NC3 anyway. So increasing – the discussion about increasing – does it increase the risk, the risk is already there. It is already acknowledged. It is – you know, that requirement for always/never for us is always going to be there. That will be enduring, across administrations, across technologies. It will exist. So –
Dr. Williams: Anything to add?
Dr. Scharre: I mean, I guess I would maybe broaden the question a little bit to say that, in my mind, there are a couple things that are raising the stakes of getting it right in terms of nuclear stability. And Chris kind of mentioned this earlier, but, you know, the increased salience of nuclear weapons geopolitically, with things like nuclear saber rattling we’re seeing Putin doing, China’s nuclear modernization is taking us towards a tripolar nuclear world.
And then the introduction of emerging technologies – AI, cyber, space, ISR – that’s then changing the nuclear balance. And so I think that the stakes of getting this right, of integrating these technologies in a way that is stabilizing and not destabilizing, are critical – the landscape is changing in a way that it really hasn’t for decades. And so having folks like, frankly, all of you invested in all of this – we need a lot of brains to figure out the right way to integrate these technologies in a way that is stabilizing – is absolutely essential.
Dr. Williams: So, with that, I want to give you each a chance to offer – I’m going to take away 30 seconds from each of you from what we said. Sorry. You get 60 seconds each for any final reflections that you haven’t snuck in already, or points that you want to make.
Sarah, we’ll start with you.
Ms. Mineiro: Yeah. I mean, I think – again, I will assert that AI tools and techniques are, one, tools and techniques that need to be applied with critical judgment by humans consistently across any C2 system. Two, they are already being used, both in the hardware and the software aspects of NC3 systems. Three, I think that this is a multifactor optimization challenge that definitionally is challenging, but also it is not the case that the same AI tools and techniques need to be used across the entire spectrum of that kill chain and that engagement scenario.
So you’ll have different AI ML techniques that are used earlier versus later. I think later, when you get into the actual kind of counter strike or employment decision-making, AI becomes less effective and less appropriate. Certainly, on the side of automation I think it is heavily weighted earlier in use in planning, in characterization, and identification, so that it’s more process intensive up front. So that’s my general argument.
Dr. Scharre: OK, awesome. Well, first of all, thank you, Heather and Chris and Sarah, for a great discussion and a lively debate.
You know, Sarah, you made this point earlier, that you said we need to make sure that we have machines doing what machines do best and humans doing what humans do best. And I couldn’t agree more. I think that’s exactly right. The question is, what are those tasks? And machines are really good at things where we have lots of data and we have clear metrics for better performance, we could all agree that more of something is better. So driving is a good example. We can collect data out in the real world. And we all agree that having fewer accidents is better. That’s what we want. We want something that’s safer.
So humans are better at places where we don’t have good data, where there’s a lot of novelty, and the right answer depends a lot on context. And what does that look like in the nuclear enterprise? There are certainly some applications where AI and automation make sense, broadly conceived, in nuclear operations. Automated takeoff and landing for aircraft. Clear right and wrong answer there. And we can make machines now, right – for a while now – that are better than humans at that. Are there roles for computer vision? For tracking adversary movements and identifying, OK, this is a mobile missile launcher, this is where it is? Absolutely. That’s a sensible application. We could say, this is a mobile missile launcher. We know what it is.
But let’s take the next step then and say, OK, could we have a predictive AI system that could give indications and warning of a surprise attack? That would be a terrible application, because we don’t have the data. What may be a good answer would depend highly on other pieces of context. And so we need to be very cautious as we’re thinking about how we use this technology, and mindful of its failures. Thanks so much.
Dr. Williams: Thank you all so much. The final thought that I’m going to leave you all with is quite a few of us have mentioned this statement between Biden and Xi about human in the loop. And, you know, I think we’ve said a bit – I think we’ve actually gotten a lot into the details of what that means in practice and what that looks like. And so I would just put that out there as something to watch, in particular as yesterday we heard that President Putin and President Trump are interested in restarting arms-control talks and trying to reengage in some sort of dialogue, hopefully on risk reduction measures. And historically, Russia has been reluctant to sign up to a human-in-the-loop statement. And so I would put this on the table as something that if those arms-control talks do go forward, this might be a topic for discussion.
And so I think it made today’s debate all the more timely, that hopefully everyone has a bit of a better understanding of the nuance, the technology. We ended with a bit more agreement than I think we had started with, which is nice. It’s nice to come around there. But I think that there’s a much bigger trend, which doesn’t just apply to policy but also probably applies to everybody’s day-to-day life, which is, what is – which is the human relationship with technology. And how much do you trust it? And how much do you rely on it? And when and where? But in the context of nuclear weapons, when the consequences are so grave and the scale is so big, it raises a lot of bigger ethical questions as well.
And so I’m just really grateful to the three of you for participating in this discussion. A big thank you to the CSIS AV team, as always, for helping us to pull this off. Biggest thanks of all go to Diya Ashtakala on the PONI team for organizing and coordinating everybody. And also to Caroline Ward, who just walked in. Thanks, Caroline. And lastly, but not least, is thank you to you all so much in the room and online. I’m sorry that we could not get to all the questions. There were a lot of questions. This generated a lot of interest. And so I apologize if we didn’t get to your question. But please do stay engaged with this issue. Stay engaged with PONI. And we look forward to seeing you at the next PONI event. Thanks, everybody. That’s it. (Applause.)
(END.)