
Richard Sutton talks vision for superintelligence in Dertouzos Lecture on May 13

The 2024 Turing Award winner and “father of reinforcement learning” is an outspoken critic of LLMs

Professor Richard Sutton delivers the Dertouzos Distinguished Lecture in the Kirsch Auditorium (32-123) on Wednesday, May 13, 2026.
Samuel Yuan–The Tech

On May 13, a packed Kirsch Auditorium (room 32-123) welcomed Prof. Richard Sutton for the Dertouzos Distinguished Lecture. Sutton, widely regarded as the “father of reinforcement learning,” spent the hour speaking about his vision for superintelligence and the architecture that he believes can make it possible.

A contrarian in his field, Sutton has recently gained notoriety for criticizing “large-scale pretraining” and large language models (LLMs). He has asserted that while these methods have enabled breakthrough models like ChatGPT, they are a “dead end” for superintelligence. Rather, he contends that agents capable of learning “continually” and “from experience” at runtime (when they are deployed) hold the key to true AI. His lecture explored how that kind of continual, experience-driven learning might be achieved.

Sutton is currently a professor of computer science at the University of Alberta. He won the ACM Turing Award in 2024, and his publications have more than 180,000 citations. 

He is also the author of “The Bitter Lesson,” an influential essay on historical trends in AI that has informed how many researchers think about their work. In the essay, Sutton argues that models that scale with computing power ultimately perform much better than models that use knowledge programmed in by researchers.

Opening remarks for the lecture were delivered by MIT Computer Science & Artificial Intelligence Laboratory (CSAIL) Director and Professor of Computer Science Daniela Rus, who praised Sutton’s work and noted his ability to challenge the status quo.

“Beyond the technical contributions, Richard has been a consistent and provocative voice on the big questions in AI,” Rus said. “He’s really one of those rare scientists whose work is both foundational and still actively agenda setting.”

“The Bitter Lesson”

Sutton began his talk by framing his quest: an AI agent architecture (a system for how an AI can perceive, reason, and act) that respects “the bitter lesson,” namely one that minimizes the amount of human knowledge built in and instead scales intelligence from experience.

“My quest is to design an AI agent’s mind that is general, domain-independent, contains nothing specific to the world, and learns from experience,” Sutton explained. “An intelligent agent has to be able to tell herself that it’s doing well.”

His aversion to “domain knowledge” comes directly from the historical trends in models for chess and vision that he documented in “The Bitter Lesson.” For instance, he noted that vision models that learned their own image processing strategies performed better than models where humans programmed skills like edge detection.

“In the short term, it is personally satisfying to the person building in his knowledge to his agent,” Sutton said. “But in the long run, this building-in approach plateaus and even inhibits further progress. Progress eventually arises by the opposing approach based on scaling, computation, by searching, learning.”

LLMs are both an example and counterexample to his essay: while they scale on massive computation, they also rely on human domain knowledge from their training datasets.

“We just have to see whether [LLMs] will eventually be superseded by an opposing approach,” he said.

In sketching a skeptical picture of LLMs, Sutton joins other AI pioneers who have been critical of scaling LLMs as the be-all and end-all of reaching superintelligence. Among them is Yann LeCun, who has recently raised over a billion dollars for a research lab that ditches language models for other methods.

Abstract and learn

Sutton then discussed his “big-world perspective” and why it makes learning during deployment, rather than at training time, essential for any truly intelligent agent. He explained that even setting aside the physical world, truly modeling other people or agents is intractable because their minds are just as complex as your own.

“You build a value function, a policy, and a little transition model of the world. You try to even represent the states of the world. [But] it’s too much, it’s too big,” Sutton said.

The complexity of the world means that the model will need to have hundreds of moving parts; therefore, models must be able to discover abstractions on the fly rather than having them engineered in advance.

“It’s not enough to just have to learn. You have to be able to get more complex [at runtime],” Sutton said. “Sure, we can build domain knowledge in, but you should also be proud of what your agent can do in itself at runtime. This is an essential part and a somewhat stronger statement.”

With that in mind, Sutton introduced the OaK architecture, short for “options and knowledge,” as his proposed solution.

“In the OaK architecture, the agent will have many options, and it’ll learn knowledge. The knowledge is what’s in the transition model; knowledge is a belief about options,” Sutton said. “We’re going to learn this high-level transition model of the world. Each feature produces a sub-problem. Each sub-problem produces an option. Each option produces a part of the transition model.”
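The chain Sutton describes (feature, sub-problem, option, piece of the transition model) can be illustrated with a toy sketch. This is not Sutton's implementation: the five-state corridor, the `Option` class, and every function name here are hypothetical, and the option policies are hand-written where OaK would learn them from experience. The sketch only shows how each feature can yield an option, and how each option can contribute an entry to a high-level model that predicts where the option ends and how long it takes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

N_STATES = 5  # hypothetical corridor world with states 0..4


def step(state: int, action: int) -> int:
    """Move left (-1) or right (+1), staying inside the corridor."""
    return max(0, min(N_STATES - 1, state + action))


@dataclass
class Option:
    """A temporally extended behavior: a policy plus a termination condition."""
    name: str
    policy: Callable[[int], int]
    terminates: Callable[[int], bool]


def option_for_feature(target: int) -> Option:
    """Sub-problem: make the feature 'agent is at `target`' true.

    Assumption: the policy is hand-coded here; a learning agent would
    have to discover it.
    """
    return Option(
        name=f"reach_{target}",
        policy=lambda s: 1 if s < target else -1,
        terminates=lambda s: s == target,
    )


def run_option(option: Option, start: int) -> Tuple[int, int]:
    """Execute an option until it terminates; return (final state, steps)."""
    state, steps = start, 0
    while not option.terminates(state):
        state = step(state, option.policy(state))
        steps += 1
    return state, steps


def build_model(options: List[Option]) -> Dict[Tuple[int, str], Tuple[int, int]]:
    """High-level transition model: (state, option) -> (outcome state, duration)."""
    return {
        (s, o.name): run_option(o, s)
        for s in range(N_STATES)
        for o in options
    }


features = [0, 4]                                    # each feature poses a sub-problem
options = [option_for_feature(t) for t in features]  # each sub-problem yields an option
model = build_model(options)                         # each option adds model entries
print(model[(1, "reach_4")])  # (4, 3)
```

With such a model, an agent could plan over whole options ("reach the far end") instead of single steps, which is the sense in which the transition model is "high-level."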

A central goal of this architecture is to enable “open-ended abstraction” by letting the agent generate its own sub-problems in pursuit of reward.

To illustrate the idea, Sutton played a video of a baby exploring a set of toys, moving from one to the next.

“What can I do with this toy, with this piece of string? Can I get that toy’s sound to occur again?” Sutton said, narrating the baby’s actions. “We have to create our own sub-problems. In no way are all possible sub-problems built in.”

He explained that this kind of play, which is found in all types of animals, is not random; instead, it represents a core part of what intelligence is, and it is something that a general AI agent would need to replicate.

“I think this is really an obvious insight. It’s an insight into what our mental life is like. We set some goals. We set problems for ourselves,” Sutton said. “We learned to play soccer, or we learned to ride a bike, we learned to use a foreign language — we set some problems for ourselves, and then we worked on achieving them.”

The road ahead

To make the OaK architecture implementable, researchers will need to solve two technical challenges: continual learning (how neural networks can learn new things without forgetting old ones) and generating new state features from existing ones.

“OaK will require reliable continual learning,” Sutton said. “We needed to do this for 40 years, literally, 40 years. [And] if you do conventional deep learning it can just fail catastrophically, either [due to forgetting] or loss of plasticity.”

But he added that solutions, such as continual backprop and meta-learning techniques, are beginning to take shape. Still, Sutton claimed that the field isn’t there yet.

“I just want to acknowledge we don’t have it yet, and that’s the principal reason why I can’t show you large scale examples of the OaK architecture,” he said.

Despite these gaps, Sutton is optimistic that a simple, elegant architecture for a generally intelligent agent is within reach.

“We have a vision. It’s possible,” he said. “I think it’s possible to imagine achieving AI so that the core architecture [could be] something like five pages of code, [and] not like Windows that takes many books to write it down.”

Sutton added that he felt “The Bitter Lesson” and this new agent architecture would ultimately be vindicated in future textbooks.

“Years from now, we will understand AI, and [part of] the textbooks on AI will [say], ‘Sure, we should build in some good domain knowledge,’” Sutton predicted. “But the [rest will say], you want to grow knowledge, complexity, and conceptual structures in an independent and open-ended way, like we see in the OaK architecture.”

A recording of the talk at MIT has not been made public, but a recording of a similar talk Sutton delivered on the OaK architecture is available online from the Alberta Machine Intelligence Institute.