Language as a litmus test for AI

And the problem of bias

Illustration by OstapenkoOlena/iStock by Getty

Language, which clearly played an important role in human evolution, has long been considered a hallmark of human intelligence. When Barbara Grosz started working on problems in artificial intelligence (AI) in the 1970s, it was the litmus test for defining machine intelligence. The idea that language could serve as a touchstone for identifying intelligent computers dates to 1950, when Alan Turing, the British scientist who cracked Nazi Germany’s encrypted military communications, suggested that the ability to carry on a conversation in a manner indistinguishable from a human could be used as a proxy for intelligence. Turing raised the idea as a philosophical question, because intelligence is difficult to define, but his proposal was soon memorialized as the Turing test. Whether it is a reasonable test of intelligence is debatable. Regardless, Grosz says that even the most advanced, language-capable AI systems now available (Siri, Alexa, and Google) fail to pass it.

Grosz, the Higgins professor of natural sciences, has witnessed a transformation of her field. For decades, computers lacked the power, speed, and storage capacity to drive neural networks, which are modeled on the wiring of the human brain and learn by processing vast quantities of data. Grosz’s early language work therefore involved developing formal models and algorithms to create a computational model of discourse: telling the computer, in effect, how to interpret and create speech and text. Her research has led to the development of frameworks for handling the unpredictable nature of human communication, for modeling one-on-one human-computer interactions, and for advancing the integration of AI systems into human teams.

The currently ascendant AI approach, based on neural networks that learn, relies instead on computers’ ability to sample vast quantities of data. In the case of language, for example, a neural network can sample a corpus, extending even to everything ever written that has been posted online, to learn the “meaning” of words and their relationship to each other. A dictionary created using this approach, explains assistant professor of computer science Alexander “Sasha” Rush, contains mathematical representations of words, rather than language-based definitions. Each word is a vector: a relational definition that places the word in relation to other words. Thus the vectors describing the relationship between the words “man” and “woman” would be mathematically analogous to those describing the relationship between words such as “king” and “queen.”
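The “man”/“woman” and “king”/“queen” analogy can be made concrete with a small sketch. The three-dimensional vectors below are invented purely for illustration (real word vectors are learned from text and have hundreds of dimensions), and the names `TOY_VECTORS`, `cosine`, and `analogy` are hypothetical, not drawn from Rush’s work.

```python
import math

# Toy vectors, hand-made for illustration only; they are NOT learned from data.
TOY_VECTORS = {
    "man":   [1.0, 0.0, 0.2],
    "woman": [1.0, 1.0, 0.2],
    "king":  [1.0, 0.0, 0.9],
    "queen": [1.0, 1.0, 0.9],
    "apple": [0.0, 0.3, 0.1],
}


def cosine(a, b):
    """Cosine similarity: 1.0 means the two vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def analogy(a, b, c, vectors):
    """Answer "a is to b as c is to ?" by finding the word nearest b - a + c."""
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    # Exclude the three input words so they cannot be returned as the answer.
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


print(analogy("man", "woman", "king", TOY_VECTORS))
```

With these made-up numbers, the offset from “man” to “woman” is the same as the offset from “king” to “queen,” so the nearest remaining word is “queen.” Learned vectors reproduce such relationships only approximately.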

This approach to teaching language to computers has tremendous potential for translation services, for developing miniaturized chips that would allow voice control of all sorts of devices, and even for creating AIs that could write a story about a sporting event based purely on data. But it also captures all the human biases associated with culturally freighted words like “man” and “woman.” The resulting mathematical representations can embody assumptions about gender, power dynamics, and inequality (in the associations they attach to a word such as “CEO,” for example), and that can lead neural-network-based AI systems to produce biased results.
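One simple way to see how such an association might show up in the numbers is to compare a word’s similarity to “man” with its similarity to “woman.” The sketch below is only an illustration: the vectors are invented, the skew in the “ceo” vector is planted by hand to stand in for a bias absorbed from text, and `gender_lean` is a hypothetical helper rather than an established metric.

```python
import math

# Toy vectors, hand-made for illustration; the skew in "ceo" is planted on
# purpose to mimic an association a model might absorb from its corpus.
TOY_VECTORS = {
    "man":   [1.0, 0.0, 0.2],
    "woman": [1.0, 1.0, 0.2],
    "ceo":   [0.8, 0.2, 0.9],
}


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def gender_lean(word, vectors):
    """Positive if the word sits closer to "man", negative if closer to "woman"."""
    v = vectors[word]
    return cosine(v, vectors["man"]) - cosine(v, vectors["woman"])


lean = gender_lean("ceo", TOY_VECTORS)
print("closer to 'man'" if lean > 0 else "closer to 'woman'")
```

A system that ranks or generates text using vectors like these would inherit the lean without anyone having programmed it in, which is the concern raised above.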

Rush considers his work—developing language capabilities for microscopic computer chips—to be purely engineering, and his translation work to be functional, not literary, even though the goal of developing an AI that can pass the Turing test is undoubtedly being advanced by work like his. But significant obstacles remain.

How can a computer be taught to recognize inflection, or the rising tone of words that form a question, or an interruption to discipline kids (“Hey, stop that!”), of the sort that humans understand immediately? These are the kinds of theoretical problems Grosz has been grappling with for years. And although she is agnostic about which approach will ultimately succeed in building systems able to participate in everyday human dialogue, probably decades hence, she does allow that it might well have to be a hybrid of neural-network learning and human-developed models and rules for understanding language in all its complexity.

Read more articles by Jonathan Shaw
