UK AI Summit Debrief Part 1: The Framing of AI Risk
The November 2 conversation between British prime minister Rishi Sunak and Elon Musk, ostensibly a distinct event from the UK AI Safety Summit, served as a fitting conclusion to it. Like the summit at Bletchley Park, the Musk-Sunak meeting elevated long-term artificial intelligence (AI) risks to the forefront of the AI safety discussion while also generating its fair share of controversy. In the end, perhaps unlike the Musk-Sunak conversation, the summit was a substantive contribution to the development of global AI governance.
The UK government was precise in defining the scope of the summit as focusing on safety of foundation models—broad AI tools that can adapt to a variety of applications—at the research frontier, with a broad group of countries from each region and income group. UK officials referenced the need for that specific conversation in the context of existing international AI dialogues, such as the G7 Hiroshima Process, the Organization for Economic Cooperation and Development (OECD) AI Principles, and the Global Partnership on AI (GPAI), which either primarily focus on other aspects of AI governance or are limited to advanced economies. The countries included at the summit that are not represented in the G7, OECD, or GPAI were China, Indonesia, Kenya, Saudi Arabia, Nigeria, the Philippines, Rwanda, Ukraine, and the United Arab Emirates.
The four primary risk categories discussed were (1) AI misuse, primarily among non-state actors, (2) unexpected capabilities at the AI frontier, (3) “rogue AI” and superalignment, and (4) AI safety and societal diffusion.
This commentary summarizes the summit discussion around each risk category and sets it in a broader policy context of other political, technological, and diplomatic developments.
AI Misuse, Primarily Among Non-state Actors
Frontier AI misuse, set within the context of global risk and governance, pertains to the inappropriate or dangerous application of advanced AI technologies that pose significant threats to global security and stability. This encompasses the potential for cyberattacks, manipulation of information, and other harmful uses that can undermine international safety protocols, biosecurity, and cybersecurity.
Biosecurity in particular stands out as being an area of common focus in international governance. The Biden administration’s Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence, announced in the days preceding the summit, is a broad, sprawling document that reserves specificity for a small number of issues. Biosecurity is one of those exceptions. Models that are trained or deployed primarily on biological sequence data have a lower computing power threshold to trigger mandatory reporting than all other categories of AI models. Researchers at RAND Corporation are in the midst of a study on the real-world impact of large language models on bioweapons engineering, comparing model capabilities with information already available on the internet through conventional search.
Of the countries represented at Bletchley Park, all are parties to the Biological Weapons Convention, which “prohibits the development, production, acquisition, transfer, stockpiling and use of biological and toxin weapons.” Consensus on safety measures that preclude the development of dangerous biological agents should be reasonably easy to reach.
In contrast, the discussion around cybersecurity seems to have been limited to the threat from non-state actors. That focus contrasts markedly with the policy discussion on AI cybersecurity in Washington. China is a major concern for many AI cybersecurity stakeholders in the United States, from protecting AI labs’ intellectual property to AI-driven industrial espionage campaigns. Relatedly, there are recent indications that China’s government is leveraging generative AI to reinforce its disinformation campaigns targeted at the United States. Having made a major effort to transcend strategic competition by inviting China to the dialogue, the UK could hardly allow the conversation to descend into an argument between rivals. Regardless, the congeniality of the summit may soon give way to the reality that, on this particular issue, the interests of the international community diverge dramatically.
Unexpected Capabilities at the AI Frontier
The technological frontier is defined by rapid and often unforeseen developments in AI technologies, where the abilities of AI systems exceed expectations and previously established predictions. These advances are characterized by their potential to bring significant benefits in various fields, such as health, education, and environmental science, while simultaneously posing considerable risks due to their unforeseen nature and potential for misuse. The unpredictability stems from the rapid scaling of AI models, their ability to connect with other systems, and the sheer number of possible permutations of their applications, making it challenging to anticipate all potential outcomes and implications before their deployment.
Unpredictable development is a difficult thing to measure. Consider the prospect of an AI superintelligence, which Oxford philosopher Nick Bostrom defines as possessing “an intellect that is much smarter than the best human brains in practically every field, including scientific creativity, general wisdom and social skills.” There is no consensus in the technical AI community on how many years it will take to develop superintelligence. Some believe the current pace of development in the AI sector means superintelligence could arrive in the 2020s. Others believe large language models are fundamentally flawed tools that are incapable of genuinely surpassing human intelligence.
While that wide range of possible outcomes induces some concern, the delegates at Bletchley Park appear to be relatively sanguine about the possibility of managing that risk through testing and evaluation and greater visibility for public stakeholders in risk management. There is also a general skepticism of open access models, which are associated with greater transparency and research diffusion but are impossible to withdraw once a dangerous capability is discovered. That skepticism seems almost entirely aimed at Meta’s Llama 2, which, ironically, is being demonized for Meta’s insistence on giving away its intellectual property for free in the name of AI diffusion.
“Rogue AI” and Superalignment
There is an ongoing debate about the theoretical risk posed from “rogue AI,” represented by a scenario in which highly advanced AI systems operate beyond human anticipation or intended oversight, making decisions or taking actions that their developers did not foresee or intend. This concern arises from the rapid advancement of AI capabilities, where future models may significantly improve in cognitive abilities, decisionmaking, and interaction with the physical world, and—perhaps as a direct result of that evolution—autonomously develop goals or sub-goals that are misaligned with human instruction.
Losing control over frontier AI is a prominent misalignment scenario. In a July 2023 blog post, OpenAI’s Jan Leike and Ilya Sutskever reckon with a two-pronged challenge: not only do we not have a solution “for steering or controlling a potentially superintelligent AI, and preventing it from going rogue,” but the current dominant paradigm in AI safety—reinforcement learning from human feedback—is unlikely to work as models outpace human capabilities. However, Leike and Sutskever are also optimistic about the possibility of new AI research as a remedy, with a target of 2027 for developing technical alignment solutions to prevent or mitigate the risk from rogue AI.
Delegates at Bletchley Park resolved to define a set of “decisions that should not be handed to an AI system.” After a year in which AI doomsayers played a major role in the public discourse, current systems are instead viewed as “relatively easily controlled,” and “do not present an existential risk.”
AI Safety and Societal Diffusion
Unlike rogue AI, which the summit’s readout acknowledges may not necessarily materialize, AI diffusion throughout the economy and society is inevitable. This integration brings substantial risks, including perpetuating or amplifying biases and discrimination due to flawed or limited data sets. AI could displace workers and exacerbate inequality within societies and between countries. Economic concentration is another risk, where AI advancements could centralize power and wealth in the hands of a few companies or individuals who control these technologies (most of which, for now, seem to be domiciled in the United States, United Kingdom, or China). Moreover, the integration of AI could shred what remains of the public information ecosystem, if AI is deployed to create and spread disinformation, manipulate public opinion, and disrupt democratic processes.
In some ways, this risk category was the source of the greatest point of contention during and following the summit: whether a focus on frontier model safety actively undermines attention to the proximate risks of bias, discrimination, and dehumanized automation. In a speech at the U.S. embassy on the first day of the summit, Vice President Kamala Harris channeled this criticism: “Consider, for example: When a senior is kicked off his healthcare plan because of a faulty AI algorithm, is that not existential for him?” The vice president arguably presented a more diplomatic version of the criticism leveled by a large community of skeptics against the summit’s vision for AI safety—criticism which, in some corners, has been vociferous.
Many voices in civil society argue that the narrow scope of the summit was not a commendable demonstration of focus but a co-optation of AI safety by powerful corporate interests. From civil rights to labor, many stakeholders at the forefront of proximate risk debates worry that policymakers are abandoning their causes. It did not help perceptions that most of the civil society organizations in town for the summit, having not been included on the guest list at Bletchley Park, gathered at a separate venue in the British Library.
Policymakers appear to be underestimating the skepticism of key constituents who need to feel empowered, not threatened, by AI. In some cases, that skepticism borders on anger—authors and artists, for example, almost universally use the word “theft” to describe this generation of large language models’ apparent inclusion of their works in training. Advocates are demanding coequal influence alongside technical personnel to design and deploy AI safely. The idea of AI perpetuating bias, discrimination, and inequality is so unpalatable for these groups that their advocacy often seems to reject any role for AI in resolving some of those risks. Such skepticism is manifest, for example, in the debate over a possible role for AI in law enforcement. There is deep concern among civil society that AI trained on biased historical data—such as sentencing data that is correlated with race, gender, and class—cannot be anything but a source of harm.
Michael Frank is a senior fellow in the Wadhwani Center for AI and Advanced Technologies at the Center for Strategic and International Studies in Washington, D.C.