Science | April 10, 2018

Creating gold-standard electronic health records

Physicians bring data science to bear on patient health and wellness information.

by Jonathan Shaw

From The May-June 2018 Issue

**Dean Stanley Shaw and Dr. Deborah Schrag** | From left: Courtesy of Stanley Shaw and Deborah Schrag

Return to main article:

Using precision medicine, Harvard researchers target cancer

Linnea Olson, shown with her dog, Kumo, has survived 13 years with lung cancer.

A key ingredient in precision cancer medicine—and precision medicine generally—is knowing what works for patients. Historically, a substantial amount of such medical knowledge has come from observations of patients participating in carefully monitored clinical trials. However, to deliver truly state-of-the-art precision medicine, doctors will need to draw on the “real world” experiences of large populations of patients. The vehicle for consolidating that type of information is the electronic health record (EHR). Such a record could theoretically hold a patient’s entire medical history, including a full genome sequence, as well as information gathered from wearable devices about lifestyle, behaviors, or environmental exposures. The goal for healthcare providers is to build a single data repository for each patient containing that individual’s phenotypes—any quantifiable or observable trait or behavior—in order to deliver precision medicine empirically. When deployed at scale, such records would enable doctors to compare their patients’ symptoms, for example, to the histories of similar patients who were successfully treated in the past.

Leading efforts to leverage such real-world clinical data on behalf of Dana-Farber Cancer Institute (DFCI), part of a consortium of seven cancer-care hospitals, is professor of medicine Deborah Schrag, chief of the division of population sciences.

The work sounds straightforward but is, in fact, beset by daunting challenges. Consider the simplest kind of information, a binary data field that records whether a patient is alive or dead. Cancer patients often seek care at an academic medical center, Schrag says, even if they live far away. But after a series of consultations, and perhaps the formulation of a treatment plan to be administered through a local hospital or clinic, those patients may never return to the specialty center. The life-or-death results of those consultations and treatments may never be captured, even by a local hospital. Without that most basic information about how long a patient survives, evaluating the effectiveness of any treatment program is impossible. It is therefore essential, Schrag says, “to aggregate comprehensive data that characterize both the molecular profiles of cancers, and also the clinical outcomes,” including the most basic one: vital status.

DFCI overcomes this fundamental problem of tracking patient survival by linking its data to the National Death Index (NDI). Once a year, Schrag explains, the institute submits the names of persons thought to be alive, but who haven’t been seen in more than a year, to the NDI, which, for a fee, provides “validated, reliable, vital status data.”
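For readers who think in code, the annual selection step Schrag describes can be sketched roughly as follows. This is an illustration only, not DFCI's actual system: the `PatientRecord` fields, the `needs_vital_status_check` function, and the 365-day threshold are all assumptions made for the example, and the sample patients are invented.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass
class PatientRecord:
    # All field names here are hypothetical, chosen for illustration.
    patient_id: str
    last_seen: date        # date of the most recent recorded visit
    known_deceased: bool   # vital status as currently recorded


def needs_vital_status_check(
    records: List[PatientRecord], today: date, max_gap_days: int = 365
) -> List[PatientRecord]:
    """Return patients thought to be alive who have not been seen
    in more than max_gap_days (assumed here to mean 365 days)."""
    cutoff = today - timedelta(days=max_gap_days)
    return [
        r for r in records
        if not r.known_deceased and r.last_seen < cutoff
    ]


# Invented sample data, not real patients.
records = [
    PatientRecord("A", date(2018, 1, 15), known_deceased=False),
    PatientRecord("B", date(2016, 11, 2), known_deceased=False),
    PatientRecord("C", date(2015, 6, 30), known_deceased=True),
]

to_submit = needs_vital_status_check(records, today=date(2018, 4, 10))
print([r.patient_id for r in to_submit])
```

Only patient B would be queued for lookup in this example: A was seen recently, and C is already recorded as deceased.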

But many hospitals don’t take this step—and for a system that aspires to capture the full complexity of patient outcomes, that is just the beginning. Any system should be able to track not only whether a given treatment has helped patients live longer, but also whether it has improved their quality of life. The latter is a subjective value that only a patient can decide, so the system must capture patient-reported outcomes in a readily validated way for all 500 or so types of cancer. The system must also record covariates—such as the presence of other medical conditions or age—that can influence the success or failure of a particular therapy. And it must capture clinician-defined outcomes such as whether a cancer responded to a particular treatment, information that is typically buried in the text of physician notes. This is the sort of analysis that humans can do in an hour, but the goal is to gather all these data efficiently, using techniques such as natural language processing, in such a way that a machine could analyze them to generate actionable information.

As part of an American Association for Cancer Research project called GENIE, the consortium of hospitals is trying to raise funds to create a human-validated dataset of “many thousands of cases in which we have the linked genomics and phenomics, including detailed information about the outcomes of cancer treatment,” says Schrag. “This patient with genomic profile X took drug 1, responded, took drug 2, didn’t respond, took drug 3, responded, and so on.” This information, when aggregated at scale, could be used to inform treatment decisions for future patients.

Then a machine could be given a dataset of just the lung-cancer patients, for example, and instructed, as a start, “to divide them into three groups: those who improved, those who remained stable, and those who became worse.” For that to happen, says Schrag, “We need gold-standard, validated data.”
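A rough sketch of what such a record and grouping step might look like appears below. It is a simplification under stated assumptions: the `CancerCase` fields, the three outcome labels, and the `group_by_outcome` function are hypothetical names, and the code merely sorts cases by an outcome label that a human has already validated. It does not perform the harder task of inferring that label from physician notes.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Assumed label names for the three groups mentioned in the text.
OUTCOMES = ("improved", "stable", "worse")


@dataclass
class CancerCase:
    # Hypothetical structure linking genomics to treatment history.
    case_id: str
    cancer_type: str
    genomic_profile: str
    outcome: str  # human-validated label, one of OUTCOMES
    # Sequence of (drug name, whether the cancer responded).
    treatments: List[Tuple[str, bool]] = field(default_factory=list)


def group_by_outcome(
    cases: List[CancerCase], cancer_type: str
) -> Dict[str, List[str]]:
    """Select one cancer type and divide its cases into three groups."""
    groups: Dict[str, List[str]] = {label: [] for label in OUTCOMES}
    for case in cases:
        if case.cancer_type == cancer_type:
            groups[case.outcome].append(case.case_id)
    return groups


# Invented sample cases, echoing the "drug 1, drug 2, drug 3" pattern.
cases = [
    CancerCase("c1", "lung", "X", "improved",
               [("drug 1", True), ("drug 2", False), ("drug 3", True)]),
    CancerCase("c2", "lung", "Y", "worse", [("drug 1", False)]),
    CancerCase("c3", "breast", "X", "stable", [("drug 2", True)]),
    CancerCase("c4", "lung", "Z", "stable", [("drug 2", True)]),
]

lung_groups = group_by_outcome(cases, "lung")
print(lung_groups)
```

The grouping is trivial once the labels exist, which is the point of Schrag's remark: the difficult and expensive part is producing validated labels in the first place.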

Clinical trials like those Linnea Olson has participated in (see main article) capture this sort of data. But fewer than 10 percent of cancer patients, even at centers like DFCI, are treated in clinical trials—which are also limited in that they measure the efficacy of a treatment in controlled settings, rather than the real-world effectiveness of any therapy regime. (Restrictions on patient eligibility may further bias the outcomes of trials.) EHRs for large populations of cancer patients, on the other hand, aspire to capture such public-health-level data about optimal cancer therapies: what really works in the broadest sense.

The promise of comprehensive, standardized EHRs extends far beyond cancer, and even beyond insights into other diseases. Such records are among the vanguard of several emerging types of nontraditional medical data that promise to provide a richer picture of health and disease, not only during clinic visits, but also in people’s daily lives, explains cardiologist Stanley Shaw, Harvard Medical School’s associate dean for executive education (and husband of Alice Shaw; see main article). Wearable devices such as smartwatches, and even smartphones, can already “provide meaningful information about physical activity, social interactions, and so on,” he explains, in ways that are “useful to physicians and patients in cases of cardiovascular disease, mental illness, and musculoskeletal and neurodegenerative conditions.” For example, smartphones can capture “three-dimensional information about the gait” of Parkinson’s patients, or test their ability to tap targets on the phone screen to indicate how they are responding to therapies.

But the moment when electronic health records will truly have a broad impact is when they can do “for physiological measurement what a program such as Google Maps has done for navigation,” Shaw says: letting the viewer “go from a street-level photograph that shows you the bookstore on the corner, to a view of the streets in the neighborhood, all the way up to Google Earth.” Just as geolocation is the common thread for multilayered Google Maps data, individuals would be for EHRs, he explains. “In theory, you could have data, measurements, or phenotypes of people at different biologic scales, from their DNA, to cells, all the way up through holistic views of their physiology or behavior”—a personal, multilayered biomap with views at a range of different scales, all seamlessly integrated.

Bringing such a project to fruition is likely the work of decades and could cost billions. At the moment, Shaw says, “We don’t have anything close to that three-dimensional stack of data on everybody; it is fragmented and bound by silos.” But the first step is creating comprehensive data sets—such as those Schrag and her colleagues are assembling for cancer—and then unleashing powerful data-science tools to analyze them. From that, Shaw adds, “some powerful insights will flow.”

Published in the May-June 2018 print issue under the headline “Toward a Personal Biomap.”
