Right Now | Lasting knowledge
A New Light on DNA Storage
Digital-data production is expanding so fast that, within two decades, storing it in flash-drive memory chips could consume 10 to 100 times the anticipated supply of microchip-grade silicon. With new ways of storing information desperately needed, Winthrop professor of genetics George Church is turning to one of the oldest means of doing so: the DNA molecule, which has been replicating and mutating on Earth for three and a half billion years. By 2025, accumulated global data is expected to reach 175 billion trillion bytes—all of which could, in principle, be contained in less than 180 pounds of DNA, housed within a 15-gallon drum.
DNA stores information in a “modern” way, explains Church, who heads the synthetic biology group at Harvard’s Wyss Institute for Biologically Inspired Engineering (see “Engineering Life,” January-February 2020, page 37, for more about synthetic biology). Digital information storage, he explains, “is based on just two numbers, 0s and 1s, and DNA is analogous.” Its code has just four letters: A (adenine), T (thymine), C (cytosine) and G (guanine)—the bases or nucleotides comprising the rungs of DNA’s double helix-shaped ladder, which can be arranged in whatever order scientists choose.
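To make the analogy concrete, here is a minimal sketch of how binary data could be written in a four-letter alphabet. The two-bits-per-base mapping and the function names `encode` and `decode` are assumptions chosen for illustration; the article does not describe the encoding scheme Church's group actually used.

```python
# Hypothetical mapping for illustration only: each base carries two bits.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a string of bases, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Invert encode(): read two bits per base and regroup them into bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    strand = encode(b"Mario")
    print(strand)
    print(decode(strand))
```

Under this simple mapping every byte becomes exactly four bases, so a strand of a few thousand bases would hold on the order of a kilobyte.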
In an October 2020 paper in Nature Communications, Church and colleagues described an advance that brings DNA information storage closer to commercial feasibility. They showed for the first time that DNA could be synthesized, and information thereby encoded, in an enzyme-facilitated process controlled by light. The team also demonstrated another first: the use of enzymes to achieve parallel synthesis of multiple DNA strands. They pulled two measures of music from a Super Mario Brothers video game, digitized them, converted the digital data to a DNA code, and synthesized the corresponding DNA. The DNA was then sequenced to decipher its code, and the result was redigitized and converted back to a musical format.
The point was to test a new approach to synthesis that combines enzymes and light. The enzymatic part was demonstrated in 2019, when Church’s group showed that terminal deoxynucleotidyl transferase (TdT), an enzyme found in immune cells, could be used for DNA synthesis and information storage. TdT works well, says Wyss staff scientist Daniel Wiegand, “because its only job is to find the end of a DNA chain and add a base to it.”
This enzymatic approach offers several advantages over standard chemical methods of manipulating DNA. Whereas chemical synthesis normally takes about three minutes to add a single base, Church explains, enzymatic synthesis can do that in a fraction of a second, without generating large volumes of toxic waste. Chemical synthesis, moreover, can produce only small DNA molecules, composed of 300 or fewer base pairs. Enzymatic synthesis can create much larger molecules, with thousands of base pairs, allowing orders of magnitude more information to be encapsulated.
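A rough back-of-the-envelope calculation shows what the speed difference means for a whole strand. The three-minutes-per-base figure comes from Church's description; the half-second enzymatic figure is an assumed stand-in for "a fraction of a second," not a measured value, and the constant and function names are illustrative.

```python
CHEMICAL_SECONDS_PER_BASE = 3 * 60  # "about three minutes" per base, per the article
ASSUMED_ENZYMATIC_SECONDS_PER_BASE = 0.5  # assumption standing in for "a fraction of a second"


def synthesis_seconds(n_bases: int, seconds_per_base: float) -> float:
    """Total time to add n_bases one after another at a fixed per-base rate."""
    return n_bases * seconds_per_base


if __name__ == "__main__":
    n = 300  # roughly the upper size limit the article gives for chemical synthesis
    chemical = synthesis_seconds(n, CHEMICAL_SECONDS_PER_BASE)
    enzymatic = synthesis_seconds(n, ASSUMED_ENZYMATIC_SECONDS_PER_BASE)
    print(f"chemical:  {chemical / 3600:.1f} hours")
    print(f"enzymatic: {enzymatic / 60:.1f} minutes (under the assumed rate)")
```

At these rates a 300-base strand would take about 15 hours chemically, versus a few minutes under the assumed enzymatic rate.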
The team’s key 2020 innovation was incorporating a technique called photolithography (used for decades to etch semiconductor chips) into DNA synthesis. A 1.2-square-millimeter surface with 12 distinct spots was arranged into a three-by-four grid. Then an individual strand of DNA, representing a discrete portion of the music, was synthesized in each spot, one base at a time.
Doing that required flooding the entire surface with a solution containing TdT, cobalt, and one of the four DNA bases—A, for instance. Using a device built by research engineer Howon Lee (the paper’s lead author), ultraviolet light was directed only to those spots where an A nucleotide was supposed to be added next. The light liberated cobalt ions that, in turn, activated the enzyme to insert A at the end of the DNA chain. The fluid was then flushed out and replaced with a solution that included G, for example, and light was directed to spots where G was needed. Many such cycles—with liquids flowing in and out and UV light turning off and on—can occur within a second, and in this way 12 different DNA molecules were gradually built up.
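The flow-and-illuminate cycle described above can be sketched as a small simulation. It is a simplified model, not the team's control software: it assumes each illuminated spot gains exactly one base per flow and that the bases are flowed in a fixed repeating order, and the names `synthesize` and `flow_order` are hypothetical.

```python
def synthesize(targets, flow_order="ACGT"):
    """Simulate light-directed parallel synthesis on a grid of spots.

    targets: one desired sequence per spot.
    Returns the finished strands and the number of flow cycles used.
    Assumption: an illuminated spot adds exactly one base per flow.
    """
    for target in targets:
        if any(base not in flow_order for base in target):
            raise ValueError(f"target {target!r} contains a base that is never flowed")

    strands = [""] * len(targets)
    flows = 0
    while any(len(s) < len(t) for s, t in zip(strands, targets)):
        base = flow_order[flows % len(flow_order)]  # base in the current solution
        # Light mask: illuminate only the spots whose next needed base is this one.
        mask = [len(s) < len(t) and t[len(s)] == base for s, t in zip(strands, targets)]
        strands = [s + base if lit else s for s, lit in zip(strands, mask)]
        flows += 1  # flush, then move on to the next base
    return strands, flows


if __name__ == "__main__":
    finished, cycles = synthesize(["ACG", "GGA", "T"])
    print(finished, cycles)
```

In this model the solution touches every spot, but the light pattern alone decides which strands grow, which is why adding more spots need not add more cycles.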
Future steps will involve packing more spots onto the grid, thus increasing the number of DNA strands synthesized at once and the amount of information they hold. “If you can do it for 12 molecules,” says Lee, “you can do it for 10,000.” And bringing some automation to the process offers another advance, explains his colleague, research scientist Richie Kohman: “Rather than using pipettes to move enzymes and bases around, you can just hit a button and let the instrument do it.”
DNA won’t replace thumb drives for storing and retrieving data, Kohman notes, but would be reserved instead “for archival purposes—safeguarding things humanity wants to put in a vault for a long period of time.” The Library of Congress holdings might be a good place to start.
“It might be possible to protect synthetic [and encoded] DNA for a million years,” Church says. But by inserting encoded DNA into hardy bacteria (which can reproduce themselves and “repair” their genetic material for free), he adds, the information could potentially be preserved for hundreds of millions of years.