Embracing AI

The dawn of the virtual teaching fellow

“Thank you, weirdly informative robot,” wrote a student taking Harvard’s introductory computer science course this summer, after receiving help from an AI-powered avatar. The development of generative artificial intelligence, which can create and synthesize new material based on publicly available sources, has sparked widespread concern among educators that students will use it to complete assignments and write essays, subverting the process of teaching and learning. And many are responding by instituting new rules restricting the use of AI in classwork. But in “CS50: Introduction to Computer Science,” with its global reach, course leaders are instead embracing AI. This summer, when the class of 70 students is much smaller than in the fall (when enrollment swells nearly tenfold), McKay professor of the practice of computer science David Malan and his teaching fellows have been testing an AI developed specifically for the course. Their goal is to prototype an intelligent subject matter expert who could help students at any time of day or night, answering questions and providing feedback far beyond normal teaching hours. Ultimately, says Malan, they aim to create an intelligent aid so adept that the result is “the approximation of a one-to-one teacher to student ratio.”

Tools like ChatGPT—the source of much anxiety about how to distinguish original student work from that generated using AI—are built on platforms like the one developed by OpenAI, a non-profit artificial intelligence research laboratory (with a for-profit subsidiary) based in the United States. The AI works by iteratively choosing the next word of any response it gives, through a probabilistic evaluation of existing material drawn from publicly available online sources, with a little bit of randomness thrown in. That’s an oversimplification, but regardless, Malan says, ChatGPT and other AI tools are already too good at writing code (and essays) to be useful for teaching beginning computer science students: they could just hand them the answers. The AI that he and his team have built uses the same Application Programming Interfaces (APIs) as ChatGPT—an API lets separate programs exchange requests and data through a defined interface—but with “pedagogical guardrails” in place, so that it helps students learn how to write their own code.
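That next-word loop can be illustrated with a toy sketch. The bigram model below is a deliberate oversimplification (real systems use neural networks trained on vast corpora, not word-pair counts), but it captures the basic idea Malan describes: weigh the words that have previously followed the current word, then pick one at random in proportion to its frequency. All names and the tiny corpus are illustrative.

```python
import random
from collections import defaultdict

def build_bigrams(text):
    """Count how often each word follows each other word in the corpus."""
    counts = defaultdict(lambda: defaultdict(int))
    words = text.split()
    for a, b in zip(words, words[1:]):
        counts[a][b] += 1
    return counts

def next_word(counts, word, rng):
    """Pick a follower with probability proportional to its frequency."""
    followers = counts[word]
    choices, weights = zip(*followers.items())
    return rng.choices(choices, weights=weights)[0]

def generate(counts, start, n, seed=0):
    """Iteratively choose up to n next words, with a little randomness."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(n):
        if not counts[out[-1]]:   # no known follower: stop
            break
        out.append(next_word(counts, out[-1], rng))
    return " ".join(out)
```

Trained on the sentence "the duck helps the student and the duck quacks", `generate(counts, "the", 5)` produces short plausible word chains drawn only from that corpus, which is the same mechanism, at vastly larger scale, behind a chatbot's fluent answers.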

The AI in CS50 appears on-screen as a rubber duck. “For years we have given out actual rubber ducks to students,” Malan explains, “because in the world of programming, there’s this concept known as ‘rubber duck debugging.’ The idea is that if you’re working alone,” with no colleague, roommate or teaching fellow (TF) to check your logic, “you’re encouraged to talk to this inanimate object—this rubber duck”—because expressing your thoughts aloud can help uncover “illogical constructs” in the code. For the past few years, CS50 has used the digital duck in a limited way within its course messaging system. “Now we are bringing the duck to life” virtually at least, Malan notes with a grin.

David Malan
Photograph by Leroy Zhang/CS50

The CS50 team plans to endow the AI with at least seven different capabilities, some of which have already been implemented. The AI can explain highlighted lines of code in plain English, just the way ChatGPT might. “These explanations are automatically generated,” notes Malan, and, line by line, tell students exactly what the code is doing. The AI duck can also advise students on how to improve their code, explain arcane error messages (which are written to be read by advanced programmers), and help students find bugs in their code via rhetorical questions of the kind that a human TF might pose (“you might want to take a look at lines 11 and 12”). Eventually, CS50’s AI will be able to assess the design of student programs, provide feedback, and help measure student understanding by administering oral exams—which can then be evaluated by the human course staff reviewing transcripts of the interaction.
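The explain-highlighted-lines feature presumably works by wrapping the student's selection in a carefully worded prompt before sending it to the model. CS50's actual prompts are not public; the sketch below is a hypothetical illustration of the general shape, with the guardrail wording and function name invented for this example.

```python
# Illustrative guardrail text; CS50's real system prompt is not public.
GUARDRAIL = (
    "You are a CS50 teaching assistant. Explain code in plain English, "
    "line by line, but do not write solution code for the student."
)

def build_explain_prompt(highlighted_code, start_line):
    """Assemble chat-completion messages: a guardrail system prompt
    plus the student's highlighted lines, numbered so the model can
    refer to them the way a human TF would."""
    numbered = "\n".join(
        f"{start_line + i}: {line}"
        for i, line in enumerate(highlighted_code.splitlines())
    )
    return [
        {"role": "system", "content": GUARDRAIL},
        {"role": "user",
         "content": "Explain, line by line, what this code does:\n" + numbered},
    ]
```

The resulting message list is what would be passed to a chat-completion API call; the system message is the "pedagogical guardrail" that steers the model toward explanation rather than answers.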

CS50 TFs typically spend three to six hours per week providing qualitative feedback on students’ homework, and this is “hands down the most time-consuming aspect of being a TF.” Empirical measurement has revealed that “students typically spend zero to 14 seconds reading that same feedback,” Malan says, “so it has never been a good balance of resources—of supply and demand.” The hope is that AI, by providing personalized feedback, will help “reclaim some of that human time so that the TFs can spend more time in sections and office hours actually working with students.”

“One of the most impactful experiments we’ve been running this summer,” Malan continues, has been the use of a third-party tool called Ed (edstem.org), an online, software-driven question-and-answer forum. “A few months ago, they wonderfully added support for the ability to write code that can talk to outside servers like ours, so that when a student posts a question on this software (for which Harvard has a site license), we can relay the question over the Internet to our server, use those OpenAI APIs to try to answer the question, and then have the duck respond to the students within the same environment.”
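The relay Malan describes—Ed forwards a student's post to CS50's server, which queries the OpenAI APIs and sends the duck's reply back into the same thread—can be sketched as a small webhook handler. The payload fields below are invented for illustration; Ed's real webhook format and CS50's server code are not public.

```python
import json

def handle_ed_webhook(raw_body, answer_fn):
    """Parse an incoming forum post (JSON), relay the question to an
    answering backend, and return the duck's reply as JSON.

    `answer_fn` stands in for the call to the OpenAI APIs; the
    "question"/"answer" field names are hypothetical."""
    post = json.loads(raw_body)
    question = post["question"]
    reply = answer_fn(question)
    return json.dumps({"author": "duck", "answer": reply})
```

In production this logic would sit behind an HTTP endpoint on CS50's server, with `answer_fn` making the guardrailed chat-completion request before the duck's response is posted back to Ed.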

The technology is also paired with new language outlining course expectations. Students in the summer program were told that the use of ChatGPT and other AIs is not allowed—but that using CS50’s own AI-based software developed specifically for the course is reasonable. Malan expects to roll out similar language in the course this fall. “That’s the balance…we’re trying to strike,” he says. The software presents “amazingly impactful positive opportunities,” but not “out of the box right now. So, we’re trying to get the best of both worlds.”

For some time, Malan has partnered with Harvard’s Embedded EthiCS program—and discussions about academic dishonesty are also woven into the curriculum so that students understand the course expectations and their purpose. Student participation in such conversations has become increasingly important as detecting work generated by ChatGPT and other AIs has become more difficult. For instance, a detection tool released in February by OpenAI, the creators of ChatGPT, was designed to distinguish between AI-written and human-written text; in July, just five months after its introduction, the tool was quietly retired due to its “low rate of accuracy.” Says Malan, “As AI gets better, it probably will become indistinguishable from what a human might have written.” Soon, AI that is akin to having a 24/7 personal educational assistant who can answer any question will be widely available. “That’s wonderfully empowering if it’s used ethically and in a way that’s consistent with what we’re all here to do, which is presumably to learn.”

Read more articles by Jonathan Shaw
