Faculty & Community | 4.6.2022

Authoritarian Regimes’ AI Innovation Advantage

Unfettered access to personal data may give Chinese companies an edge in artificial intelligence.

by Daniel Oberhaus

From The May-June 2022 Issue

Illustration of an eye superimposed on iconography from the national flag of China

Illustration by Taylor Callery

For the past decade, China has led the world in advanced facial-recognition systems. Chinese companies dominate the rankings of the National Institute of Standards and Technology’s Face Recognition Vendor Test, considered the accepted standard for judging the accuracy of these systems, and Chinese research papers on the subject are cited almost twice as often as American ones. Many experts recognize the importance of facial-recognition and other artificial-intelligence applications for promoting future economic growth through productivity gains, which makes understanding how China came to dominate this field a competitive concern. And after years of research, Harvard assistant professor of economics David Yang believes he’s discovered an explanation for the Chinese companies’ advantage.

In a recent National Bureau of Economic Research working paper, Yang and his colleagues found that authoritarian states like China may have an inherent and decisive advantage over liberal democracies in facial-recognition innovation. Their secret? The flow of massive amounts of surveillance data to private AI companies that develop facial-recognition software for local police departments. Like all AI systems, facial-recognition systems depend on substantial quantities of training data to learn how to recognize faces: the more data they process, the more reliable they become. The researchers’ findings suggest Chinese AI firms use surveillance data to develop sophisticated facial-recognition algorithms that can later be repurposed for commercial applications like more effective targeted advertisements and tracking of customer behavior in stores, which creates a feedback loop where greater mass surveillance drives economic growth.

It’s a startling conclusion that will likely have significant repercussions for the commercialization of AI systems in the United States and Europe. Yang’s findings suggest that policymakers in the West need to start considering the economic tradeoffs that come with strong protections for personal data rather than focusing solely on the value of privacy. In fact, the research suggests that overly strong data protections may turn out to be inadvertent anti-industrial policies that starve fledgling AI companies of the data they need to innovate. Unless these tensions are addressed soon, the United States may find itself far behind in the race to build one of the most important technologies of the twenty-first century.

At the core of China’s burgeoning facial-recognition industry is an Orwellian mass surveillance system unparalleled in scope and scale. More than half of the world’s roughly one billion surveillance cameras are within China’s borders, and nine of the world’s 10 most surveilled cities on a cameras-per-capita basis are Chinese. If Chinese companies are able to tap into this video surveillance network through government contracts, they should be able to develop superior algorithms and parlay these systems into commercial applications.

“If you want to do any sort of customer identification or personalized advertisements, then being able to identify who comes into a shop for the first time and who is a repeat customer is super important information,” says Yang. “From an algorithm perspective, these commercial applications sound very similar to what a police department might want to do on its own. So once a company uses government data to improve its facial-recognition system’s accuracy rate, it’s natural that it would also use it to help retailers make predictions about their customers.”

It’s an intuitive theory, but until Yang and his colleagues collected the data there was scant evidence to support the hypothesis. The researchers started by analyzing publicly available listings of Chinese facial-recognition companies and their major products, which the government requires to be reported to its Ministry of Industry and Information Technology. They identified approximately 1,000 companies that had received a contract to supply local police departments with facial-recognition software, which presumably came with access to local surveillance video data.

The researchers’ next step was to determine whether the volume of video data received by these firms correlated with their output of commercial facial-recognition products. They sorted the contracts by the number of surveillance cameras in the police district and labeled contracts with above-average numbers of cameras “data-rich” and those with below-average numbers of cameras “data-scarce.”
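The labeling step described above can be sketched in a few lines of Python. This is only an illustration, not the researchers’ actual code: the function name, the contract identifiers, and the camera counts are all hypothetical, and the choice to treat a contract sitting exactly at the average as “data-scarce” is an assumption the article does not address.

```python
from statistics import mean


def label_contracts(cameras_by_contract):
    """Label each contract relative to the average camera count.

    Contracts above the average are "data-rich"; all others are
    "data-scarce" (treating an exact tie as data-scarce is an assumption).
    """
    average = mean(cameras_by_contract.values())
    return {
        contract: "data-rich" if cameras > average else "data-scarce"
        for contract, cameras in cameras_by_contract.items()
    }


# Hypothetical camera counts for three police-district contracts.
example = {"contract_a": 12000, "contract_b": 3000, "contract_c": 4500}
labels = label_contracts(example)
print(labels)
```

With these made-up numbers the average is 6,500 cameras, so only the first contract counts as data-rich.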

If access to data-rich government contracts didn’t influence the development of commercial facial-recognition software, the researchers would have expected to see a drop in commercial products as firms reallocated engineering talent to service the government contract. But they found the opposite. Firms that secured a data-rich facial-recognition contract with police departments almost immediately produced more facial-recognition software for government and commercial applications than those with a data-scarce contract. This suggests that government data boost AI innovation because algorithms trained on them can serve both government and commercial applications.

“Our speculation is that this happens because of algorithm spillover,” says Yang. “The companies don’t need to decide between allocating engineering resources to government or commercial applications because the same algorithm can be applied in multiple places at the same time.”

The findings have important implications for how the United States and other Western countries think about the tradeoff between data privacy and economic policies. While Yang is hesitant to draw any normative conclusions from this work, he says that it does show how one-sided the debate on AI and data ethics has become in the West. By focusing only on the privacy implications of facial-recognition technology, policymakers have ignored the economic effects of denying data to AI firms that need it to develop their technology. As a result, facial-recognition firms have flourished in authoritarian countries like China that lack robust data-privacy protections, and languished in the United States and Europe.

Yang says that unlocking similar AI innovation in the West may involve reframing access to government data as key public infrastructure, comparable to roads or bridges, without sacrificing basic privacy protections and civil liberties.

“A lot of privacy-protection policies could have anti-industrial consequences because it hurts industries that rely on that data,” he says. “I’m not suggesting whether these policies are or are not justified, but there is an asymmetry in the discussion that only focuses on the value of privacy. Once the economic value is taken into consideration, we can start to think about whether the civil-liberty costs outweigh the economic potential of these innovations.”

Published in the May-June 2022 print issue in the Right Now section.