Strategic Planning

A chess player’s moves on AI safety

by Nina Pasquini

From The March-April 2024 Issue

Naomi Bashkansky ’25 could spend hours at a chessboard pondering the perfect move. But as a child playing in competitive tournaments with time limits, such painstaking deliberation wasn’t possible—often to her frustration. During one match when she was 12, she spent 20 minutes debating whether to move her bishop one or two squares forward from its starting position, agonizing over the potential consequences of her decision 15 or 20 moves down the line. “In this case, the perfect move was not worth it. A good move would have sufficed,” she remembers. “So that’s not a great example of 20 minutes well spent.”

Though time limits could be an impediment, there were also instances when they were helpful. Opponents were incentivized to make simpler, more obvious moves faster, and to spend longer on trickier ones—providing insight into whether Bashkansky should be looking out for unexpected tactics.

There are no such hints as Bashkansky (now a junior studying computer science and a member of the AI Safety Student Team, or AISST, a group of Harvard students that conducts research to reduce risks from advanced artificial intelligence) plays against Komodo 25, a chess engine rated 3500 on the Elo scale that relies on AI techniques such as search algorithms and strategic heuristics. (The highest Elo rating ever achieved by a human, 2882, belongs to the world champion Magnus Carlsen.) Whether it is playing a “very boring positional move or a genius tactic that I won’t see until three moves later,” she explains, Komodo always responds immediately.

Bashkansky first picked up chess after watching her brother (five years older) play: it “immediately clicked.” She became the state of Washington’s kindergarten chess champion, then went on to compete in international competitions, eventually becoming a Woman International Master—the second-highest title awarded exclusively to women—at 14. She enjoyed analyzing positions and calculating moves—but she also just liked winning. After she was awarded the kindergarten chess championship, a local reporter asked her what about chess attracted her. Bashkansky said she liked “to win trophies and dollars.”

Bashkansky starts the game against Komodo as she usually does: she moves her queen’s pawn two squares forward, “trying to grab a lot of space in the center.” (The potential for aggression in chess also attracted her to the game: “I really liked going for the kill—pushing my pawns forward, winning pieces, tackling my opponent’s king,” she remembers with a laugh. “I liked when my opponent felt squeezed…to see the hope slowly dying inside them.”) As the game develops, Bashkansky begins to look out for the pawn she’s moved to C4. “It can get a little weak,” she explains, pointing out how Komodo’s bishop and knight could threaten it down the line. “I wonder how, if there’s ever a moment to trip me up to win this pawn, [Komodo] will do it,” she says. A human opponent’s attack would be easier to spot since they would likely spend “a few minutes every move to figure out how the pawn can be captured.”

As she deliberates moves, Bashkansky scratches her chin, runs her hands through her hair, bites her lip. Eventually, Komodo moves its queen diagonally across the board, threatening Bashkansky’s own queen—a move she did not expect. She pauses. “Oh,” she says. “Black has figured out how to win my pawn”—the C4 pawn she’s been looking out for since the beginning.

As a child, Bashkansky often competed at long tables in the large, open spaces of hotel conference rooms, where sometimes hundreds of others also played against each other. Today, she sits cross-legged on a computer chair in the conference room of the AI student team’s office—a sunny, expansive space with a large window that looks out onto the corner of Church and Brattle Streets. Along one of the walls runs a floor-to-ceiling whiteboard covered in colorful dry-erase scribblings: the p’s and q’s of logical conjunctions, long sequences of numbers (“maybe someone trying to do some empirical tests on a model, seeing if an AI can figure out if it should continue this pattern,” she explains), and KL-divergence equations (which measure differences between probability distributions, such as inputs and outputs of an AI model).

There, she and some 60 other undergraduate and graduate students conduct research and reading groups focused on the risks posed by increasingly advanced artificial intelligence models—the kind of technology that has made possible both advanced chess engines and large language models like ChatGPT.

Beyond the threat that AI poses to her favorite game, Bashkansky and fellow AISST members are motivated by a belief that AI safety is among the most crucial research questions today. Bashkansky is particularly interested in AI interpretability: the ability to understand the decisions and outputs of an AI system. (Because AI models learn patterns and relationships on their own during a process called training, humans struggle to understand their inner workings.) “These AI models are not so scary right now. But in 5, 10, 15 years, they could become more agentic,” Bashkansky says, meaning they could develop the capacity to make decisions and act autonomously. “And then, it becomes important to ask, for example, how do we make sure that AI is not deceiving humans?”

AISST was founded in the spring of 2022, with funding for its office and operations provided by a philanthropic donor. (The group is not yet an officially recognized student organization, though its leadership hopes to achieve that status soon.) The members address AI safety through policy research (how governments and companies can regulate AI usage and development) and technical research (the mechanics of machine learning itself). As director of technical programs, Bashkansky focuses on the latter. She co-runs a semester-long introductory reading group on technical AI safety research. She also leads a weekend machine-learning skills program that gives members from less technical backgrounds the skills they need to undertake research.

Fellow AISST member Gabriel Wu ’25 remembers a time last year when he and Bashkansky were discussing their future careers. Wu confessed that he couldn’t see himself breaking from his current trajectory—likely toward a career in academia, rather than more hands-on AI safety work. “And Naomi [Bashkansky] said, ‘Even if it would be more comfortable to go into academia, or do some random thing that interests you, it’s your duty as one of the few people in the world who has the power to make change to do so,’” he remembers. As undergraduates with an affinity for computer science at one of the world’s top universities, who are beginning their careers amid the rapid development of machine learning models, she argued, they have a unique ability to influence the development of AI. “And that was just such a Naomi thing to say.” Wu now serves as AISST’s director and plans a career in AI safety.

Because AI safety is such a nascent field, it can be hard for classes to keep pace with real-time developments, Bashkansky and Wu say, and AISST can provide a setting for up-to-date research and study. “What AI safety research looked like five years ago is very different from what it looks like now,” Bashkansky says. “There are no classes here at Harvard that address this issue in the way that we think of it,” Wu adds. (Though some courses, such as those offered as part of Harvard’s Embedded EthiCS program, cover ethical questions related to AI, they don’t always engage with the most advanced research and policy proposals related to the topic, some students say.) AISST also provides a context where students can collaborate outside the confines of academic disciplines, enabling technical researchers like Bashkansky to work with more policy-minded students.

Bashkansky has also found other avenues outside of class to study AI interpretability. Last semester, she worked on a project with McKay professor of computer science Martin Wattenberg and graduate student Kenneth Li that explored the extent to which chatbots follow instructions over time. The trio were ultimately able to graph how chatbots gradually begin to ignore instructions—without indicating to users that they are doing so. “Seeing quantitative evidence is fascinating because we all have intuitions about AIs from using them. But at the end of the day, those are just anecdotes,” Wattenberg says. “And it’s wonderful to actually be able to quantify things and hopefully look for interventions that improve those systems.”

With research and classes filling her days, Bashkansky no longer plays chess very often. “Some people have asked if I am going to get back into chess and try to finish up and get my grandmaster title,” she says. Given the urgency she feels about AI safety, she says, “I’m probably not going to have time to do this.” But she doesn’t rule out the possibility completely: “Maybe if we are successfully able to align AI to human values, and then we live in some lovely utopia where no one has to work and everyone’s happy,” she says, laughing, “okay, maybe then I’ll get back into it.”

Published in the print edition of the March-April 2024 issue, in the University People section.