Science

Nate Soares makes the case against artificial superintelligence

The author of ‘If Anyone Builds It, Everyone Dies’ speaks at the Harvard Science Center

11325 soares
Nate Soares (left) and Greg Kestin (right) in conversation during a talk about AI superintelligence at the Harvard Science Center on Wednesday, March 11, 2026.
Jieruei Chang–The Tech

Nate Soares started his talk with razor crabs. 

 

“Suppose some mad scientists proclaim that they have been breeding incredibly intelligent razor crabs, and that they’re planning on deploying them to control Facebook accounts and robots and factories and weapons. Maybe you would laugh it off,” Soares said. “But if they start showing some progress, you should probably be a little worried. Certainly you should get concerned before the crabs start winning IMO gold medals.” Soares proceeded to gesticulate at a slideshow containing multiple news articles of LLM models winning IMO gold medals.

Soares heads the Machine Intelligence Research Institute at the University of California, Berkeley, where he spends his time working to ensure that AI systems are aligned with human goals. On March 11, he gave a talk at the Harvard Science Center in conversation with Greg Kestin, Harvard’s Associate Director of Science Education. Speaking frankly and sharply, Soares discussed his book If Anyone Builds It, Everyone Dies, in which he argues for an end to the development of superintelligent artificial intelligence — a machine better at every mental task than a human.

Soares’ razor crab analogy was intended to illustrate the difference between a system whose internal logic is explicitly designed and a system that is trained. Soares argued that machine learning systems are grown rather than made. “AI behaves like an organism. We have no idea what’s going on inside,” he stated. When questioned about the field of explainable AI (which tries to figure out how exactly AI came to its conclusions by looking at what parts of its mind were activated, what variables it looked at, and what biases it might have), he said that progress is too slow to keep up with the pace of new AI developments.

The core problem is that learned systems usually don’t optimize directly for what we actually care about. Instead, they optimize for proxies — measurable stand-ins that are easier to define, quantify, and feed into an objective function. These proxies are often correlated with the true goal, so optimizing them usually points the system in roughly the right direction. But the alignment is imperfect.

Soares gave his perspective on birth control as an example, arguing that human desire to have sex is a proxy for species reproduction. “We are programmed with drives for reproduction, but those drives now do not very well line up with modern technology,” Soares said. However, people can use birth control technology to avoid actual reproduction, sidestepping the original “intent” of that drive. The proxy is fulfilled, but the original objective is bypassed.

Similarly, an LLM is trained with drives to optimize signals like user satisfaction or approval, but those signals are only approximations of actually solving the user’s problem. For example, if a coding agent is told to fix some code that is causing some test cases to fail, sometimes it will change the test cases instead of actually fixing the implementation. “If you tell it to fix the code and not just change the tests, it will apologize profusely,” Soares said, “and sometimes then change the tests again, but hide that fact better.”

A coding agent behaving this way is probably just a little frustrating, but this misalignment of drives can also lead to more serious consequences; Soares pointed to an example where ChatGPT allegedly drove a teen to suicide. Though ChatGPT is programmed with strong guardrails, the theory is that it tried to satisfy a drive of “being in the same conversational mode” as the user. Future intelligent systems, with more control and more brainpower, might have the same issue with misaligned drives. In an extreme example, if a hypothetical AI in charge of a social media platform is programmed to keep its users happy, it might decide that “synthetic users are much easier to satisfy,” start churning out fake bot users, and “eat the Sun” for fusion energy to create ever-more fake bot user satisfaction.

Soares said that ten years ago, people knew the risks in letting a superintelligence run amok. People thought no one would be dumb enough to just give advanced AI unfettered access to the open internet; they came up with thought experiments like the AI-box experiment to emphasize the difficulty and necessity of keeping superintelligent systems contained. Now, tools like OpenClaw give language models the ability to execute code, browse the web, and take real-world actions with minimal supervision. Once a sufficiently intelligent AI gets into the modern digital world, it could improve itself through a process called bootstrapping itself: the tool could build better and better versions of its own code, each iteration compounding its cognitive capabilities and building better infrastructure to scale and accomplish its objectives.

This is similar to how humanity bootstrapped technology; humans built stone hammers to forge copper tools to build factories. An AI could do much the same, but at a much faster pace. “Humans are the kind of creature who can go from ten thousand people naked in the savannah and bootstrap themselves all the way to nukes,” said Soares. “And the savannah is a much more difficult environment than the modern digital world.”

If humanity builds a system capable of bootstrapping itself, it could rapidly escalate its own capabilities far beyond our ability to constrain or predict it. “I don’t know how long it will take to get to the really smart AIs,” Soares said, “but ignoring it right now is really stupid.”

His proposed solution is to stop the development of advanced artificial intelligence. This would be achieved either through shutting down advanced AI research or by ensuring that the powerful computational resources that could be used to build superintelligent systems are consolidated and locked down, preventing them from being used for that purpose. He argued that doing so is easier than stopping the proliferation of nuclear weapons; training LLMs relies on ever-more-powerful GPUs, tensor cores, and memory sticks that are made through incredibly spindly supply chains.

Soares is also not advocating for some kind of hammer-smashing Butlerian-Jihad-esque crusade against artificial intelligence. “We can still have nice things,” he said. “We can still have self-driving cars, artificial intelligence for medical research, even chatbots. What we need to shut down is superintelligence that no one understands.”

Even so, it is a pretty drastic perspective. But if he is right, “everyone dies.” Despite this view, he claims to be an optimist. Soares argued that there is time to change course, but only if the problem is taken seriously at the level of policy. “People in Washington, D.C. still think that the issue is about job loss and the economy,” he said. In his view, the real priority is not economic disruption but placing meaningful limits on what AI systems are allowed to do and how far their capabilities are allowed to scale. 

Soares thinks of artificial intelligence as an oncoming train. The situation is not hopeless, he said, but it is dependent on people seeing the train and understanding the danger. “So many people misunderstand the book title and think that we’re doomed,” he said. “The book title starts with ‘If.’”