Big data is a powerful new tool in the medical bag, and one that can put patients in charge of their own health.


The virtual aura hovering around Kiah Sullivan glows with information, all of it precisely empirical, if not all exactly welcome. The athlete from Port Angeles, who delights in eating a healthy, plant-rich diet, figured her cholesterol would be pitch perfect for a 24-year-old woman. But when she got her labs back, she was shocked to see a number far higher than she expected.

“We assume young equals healthy,” Sullivan says. She’s on a break between classes at Washington State University Spokane, where she’s a second-year medical student in the Elson S. Floyd College of Medicine. She’s also a participant in a scientific wellness program, the product of a partnership between the medical college and Arivale, an innovative healthcare startup that may change the way we think of medicine, what it means to be well, and how we access healthcare.

It all has to do with Sullivan’s cloud—a billion or more data points derived from her sequenced genome; numerous lab tests of blood, saliva, and stool samples; and a seemingly endless supply of personal questions, from her family’s health history to what she eats. That clarifying cloud of data, collected by Arivale and interpreted for her by an ever-inquisitive health coach, “made me think twice,” Sullivan says, about her assumptions of youth, health, and, more importantly for a doctor-to-be, how a genome gets expressed in an actual living, breathing person.

You can’t assume much of anything on just a few data points, such as age and diet, says College of Medicine dean John Tomkowiak, but that’s what doctors have had to do since pretty much forever. It’s not their fault: “We just didn’t have the technology,” Tomkowiak verbally shrugs.

But now we do, and a booming business in big data is driving change. Already, retail and advertising empires—Amazon and Alphabet, Google’s parent company—have been built on data collection nets beyond the dreams of even the most avaricious fisherman. That’s low-hanging fruit, though, compared to deploying massive data sets in healthcare. Far-seeing researchers are bringing together the technologies of data collection, the people who know how to wrangle and make sense of big data sets, and the practitioners of healthcare.

Currently, there is a shortage of data wranglers and analysts. Just in time to meet the needs of what could be a revolution in healthcare, WSU is bringing up to speed one of the few data analytics programs in the country. Under the direction of entrepreneur-scientist Nella Ludlow, the new program is training the bioinformaticists who will be the genetic counselors and consultants at Sullivan’s side when she graduates with her medical degree and enters the world of practicing health-care professionals.


Data Up

In one sense, data has always played an important role in medical research. As Tomkowiak points out, healthcare has long been premised on the collection of a handful of data points from one person at a time. They’re statistically compared with data collected from other individuals. From there, all sorts of conclusions are derived.

“When you interview a patient, you collect information about their history. We often say that history counts for about 90 percent of diagnosis,” says Tomkowiak. “Your physical exam might offer a few other data points. And that pretty much completes maybe the next five percent. You can order some different tests which might confirm or deny your diagnosis. So you’re dealing with literally a handful of data points on which you’re making diagnostic and treatment decisions.

“Contrast that with the vision of billions of data points” collected from each individual and being used to not just cure disease, but to prevent the transition from wellness to disease in the first place.

Researchers across the sciences struggle with data storage and management on unprecedented scales, with issues of data ownership, transparency, privacy and security, as well as how to actually turn all that information into actionable knowledge.

Most of that, Tomkowiak says, is not the problem of future MDs. “We’ve come to the conclusion that we want our physicians of tomorrow to have some of the qualities that are going to persist and be needed. Qualities such as compassion, the ability to communicate clearly with their patients, to be patient-centered and to be great listeners. To appreciate the whole person and understand how factors such as their environment and their socioeconomic status and their education all play into their health and wellness. At the end of the day, we don’t think they can be data analysts.”

WSU’s future docs do get some training in how data is collected and analyzed, but they get a lot more training in teamwork. As Sullivan says, after watching a clinic full of professionals go through their day providing care, “they flow as a unit.” That unit cohesion is critical to the core mission of doctors. Data analysis isn’t—but that’s where Arivale comes in.


The Precision Factor

The shift from basing diagnostic decisions on a few dozen to billions of data points was made possible by bringing intense computing power and rapid sequencing of genomes to biology. Lee Hood, a pioneer of what is now called systems biology, in which the interactions of complex biological systems are modeled mathematically, was once told he could have the use of a dinky desktop computer for his project. “You’ll never exceed the capabilities of this machine,” he was told.

So Hood went his own way. He gathered a team and founded the Institute for Systems Biology (ISB) in Seattle, a multidisciplinary group of researchers bent on overturning the biomedical status quo by discovering how all the parts of an organism interact to make a living being. And you can bet ISB has some major computing horsepower.

One of the most exciting aspects of ISB’s research program is in detecting changes in an individual’s state of wellness and catching disease onset in the act. That requires what Hood calls “dense phenotyping.” Your phenotype is, well, you: the expression of your genetic inheritance in the environment in which you’ve developed and grown.

Hood is emphatic that we’ve spent way too much time and money on studying diseases and searching for cures when what we should be doing is figuring out an individual’s state of wellness. If you know what makes you well, then any deviation from that state can be eyed as a possible “transition to a disease state,” Hood says, and preventive action can be taken to halt that progression.
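To make the idea of spotting a deviation from a personal baseline concrete, here is a minimal Python sketch. It is an illustration only: the function name, the cholesterol readings, and the two-standard-deviation cutoff are all assumptions made for this example, and nothing here reflects how ISB or Arivale actually analyze data.

```python
from statistics import mean, stdev

def deviates_from_baseline(baseline, new_value, threshold=2.0):
    """Return True if new_value lies more than `threshold` sample standard
    deviations from the mean of this person's own earlier readings.

    The threshold of 2.0 is an arbitrary illustrative choice.
    """
    if len(baseline) < 2:
        raise ValueError("need at least two baseline readings")
    spread = stdev(baseline)
    if spread == 0:
        # All earlier readings were identical, so any change is a deviation.
        return new_value != baseline[0]
    z = (new_value - mean(baseline)) / spread
    return abs(z) > threshold

# Hypothetical LDL cholesterol readings (mg/dL) for one person over time.
ldl_history = [96, 101, 99, 98, 102, 100]

print(deviates_from_baseline(ldl_history, 103))  # within this person's usual range
print(deviates_from_baseline(ldl_history, 131))  # far outside it
```

The point of the sketch is that the comparison is made against the individual's own history rather than a population average, which is the distinction Hood is drawing.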

“That’s the preventive medicine of the twenty-first century,” he says. By stopping the progression from wellness to disease, Hood argues, we’ll save lots of money on healthcare. And we’ll save lots of lives, too.

Hood mentions Alzheimer’s, a disease that costs Americans about half a billion dollars a year. He says that out of 400 clinical trials that tested treatments for Alzheimer’s in the last dozen years, there have been zero drugs developed. The fundamental flaw, he argues, is that statistics-based trials assume that all individuals are identical—an obviously false assumption that we only now have the tools to correct.

That’s the paradigm shift: from statistical studies that take a few data points and extrapolate from them to an entire population, to empirical data collection that gathers lots of data from individuals.

Arivale, the consumer extension of ISB, is also teaching WSU medical students that behavior is the biggest challenge to maintaining wellness. It’s easy enough for your doctor to tell you that you need more exercise, or to eat more vegetables, or to get more and better sleep. But, on our own, it’s hard to implement those instructions.

That’s why Arivale advocates for the use of health coaches as the main patient-facing member of a health-care team. Jennifer Lovejoy, Arivale’s chief translational science officer, says, “The physician knows what they want their patient to do—lower cholesterol, lose weight—but they don’t have time to be meeting with the patients as regularly to provide the support and information they need, but the Arivale coaches do. The physician continues to do the follow-ups. So it can really be a brilliant collaboration between the physician, the coach, and their patient.”

While adding personnel to a health-care team at first seems like it would add to the cost of care, it’s more likely to improve efficacy through better communication and outcomes. As Lovejoy says, “Shockingly, systems thinking is still new to biology and medicine. Today, if a patient has Type 2 diabetes and cardiovascular disease, which is an extremely common pairing, they’ve probably got at least three doctors. And there may not be great communication between those three doctors in managing this condition because that’s the way training, historically, has been siloed.”

Lovejoy is passionate about transforming the current health-care system. “I think it’s pretty obvious to everybody that we have a broken system,” she says. The United States spends far more on healthcare per person than any other country in the world, and with far fewer positive outcomes. Patients, currently, are passive recipients of cures; a computational and team-based approach to medicine makes the patient a participant who takes an active role in the promotion of their own well-being.

There’s just one hitch: most of us don’t do big data, and wouldn’t know an allele if it reared up and bit us. But it’s in the genome, with its themes and allelic variations, where the risk factors for long-term disease conditions get their start. What we need, along with a crop of fresh-thinking health-care practitioners, is an influx of big data analysts.


The Pattern Game

Nella Ludlow’s been a fighter pilot, an artificial intelligence researcher, an entrepreneur—and now she’s the director of WSU’s new data analytics program. Just in its second year—paralleling the new MD program in Spokane—Ludlow’s students are getting jobs as fast as they can get their degrees. She mentions a couple of juniors who got internships with a company that analyzes low-altitude aerial photography for insurance companies wanting to make sure they’re not being defrauded. Post-internship, the students were offered part-time jobs for their senior years in college—and full-time gigs as soon as they graduate.

Part of that success is down to Ludlow herself: she’s got a long track record of partnering with industry. But, she says, it’s also due to a huge demand in every industry sector. “Almost any process that you can collect data on, you can analyze to see how to optimize it, make it safer, cheaper,” she says. “It can literally save companies millions, so they’re willing to invest—which is one of the reasons we are short of data scientists.”

The basic idea is pretty simple: you train a computer to look for patterns in data that might signal something interesting. For instance, you might analyze genomic data to see if people with schizophrenia have an allele, a variant of a gene, in common, one that nonschizophrenics don’t share.

“The first part is to find the needle in the haystack, and once we see the correlation, that could be a clue as to where to search next,” Ludlow says.
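The pattern search described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the method any particular study uses: the function name and the counts are made up, and it tests a single variant by comparing how often it appears in people with a condition (cases) and without it (controls), using a standard odds ratio and chi-square statistic for a two-by-two table.

```python
def allele_association(case_carriers, case_total, control_carriers, control_total):
    """Compare how often a variant appears in cases versus controls.

    Returns (odds_ratio, chi_square) for the 2x2 table of
    carrier / non-carrier by case / control.
    """
    a = case_carriers                        # cases carrying the variant
    b = case_total - case_carriers           # cases without it
    c = control_carriers                     # controls carrying the variant
    d = control_total - control_carriers     # controls without it
    if min(a, b, c, d) <= 0:
        raise ValueError("every cell of the table must be positive")
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    # Pearson chi-square statistic for a 2x2 table (no continuity correction).
    chi_square = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return odds_ratio, chi_square

# Made-up counts: 60 of 200 cases carry the variant, 30 of 200 controls do.
odds_ratio, chi_square = allele_association(60, 200, 30, 200)
print(f"odds ratio = {odds_ratio:.2f}, chi-square = {chi_square:.2f}")
```

With one degree of freedom, a chi-square value above 3.84 corresponds to p < 0.05 for a single test. A study that scans very many variants at once needs a far stricter cutoff, and even a strong correlation is, as Ludlow says, only a clue about where to search next.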

Such studies are taking place every day, Ludlow says. Called genome-wide association studies, which look for correlations between a disease and a genetic factor, they produce massive amounts of data. Such studies are only one way of collecting health-related data, though. WSU researchers are pioneering ways of using social media to monitor disease outbreaks, and developing wearable electronic devices that monitor blood pressure, glucose levels, and many other factors that, when they change, signal a possible health problem.

It takes a certain kind of person, Ludlow says, to train a computer to spot potentially significant patterns. Women seem more drawn to data analytics than to computer science, possibly because the job does not begin and end with coding.

“What we’re hearing from students who are drawn to data analytics,” Ludlow explains, “is that I like a little computer science, a little bit of math, a little business, a little machine learning, but it’s all glued together and I get to be the translator and work with people. That’s really what it is: ‘Look at this cool pattern I found! And here’s how you can use it.’ You have to communicate; you’re not just writing code.”

The field is also dramatically interdisciplinary. Sixty-two faculty members currently have appointments in the data analytics program, Ludlow says, drawing on fields as disparate as soil science, economics, health sciences, biology, physics, computer science, math, business, and “all the AI people who do things with machine learning.”

But with all this data floating around, what about security? What about privacy? Tech companies like Google and Facebook commodify and sell user information, so one wonders if we have private lives anymore. That extends to our health and genetic data.


That’s Private!

Tom May, a medical ethicist who works with the health-care faculty at WSU Spokane, recently wrote a New England Journal of Medicine paper about the “Wild West” of direct-to-consumer genetic testing. Notoriously, genomic data uploaded to public sites was used to track down the Golden State Killer, a serial murderer and rapist in California. By comparing crime-scene DNA with those uploaded profiles, investigators were able to obtain close matches to the killer’s relatives. From there, it was a matter of working through the family tree. That’s a big win for criminal justice, but it’s also a chilling reminder of just how easy it is for strangers to access data. Even if you haven’t shared your data, a relative may have shared theirs. So you are, in a sense, sharing yourself without ever intending to.

Tomkowiak likes to counter that thought with another idea of medical ethicist May. What if data sharing is in fact the way to go? What if we didn’t keep our health secret and simply shared everything? It’s a provocative idea that’s alien to our culture.

But consider the case of Flint, Michigan, where a cover-up of lethal water quality killed at least 10 people and made many more dangerously ill. If health status had been shared and communicated, that cover-up never could have occurred. “We could make inferences about our environment because it might be easy to see that everyone who lives in a certain geographic area all had the same health-care issue,” Tomkowiak says.
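The kind of inference Tomkowiak describes, noticing that one geographic area shows far more of a health problem than everywhere else, can be sketched as follows. Everything in this Python example is hypothetical: the function name, the area labels, the records, and the rule of flagging an area at twice the overall rate are assumptions chosen for illustration, not a real surveillance method.

```python
from collections import defaultdict

def flag_clusters(records, min_people=5, ratio=2.0):
    """Flag areas where a condition is unusually common.

    records: iterable of (area, has_condition) pairs, one per person.
    Returns a sorted list of areas with at least `min_people` records whose
    rate of the condition is at least `ratio` times the overall rate.
    """
    totals = defaultdict(int)
    affected = defaultdict(int)
    for area, has_condition in records:
        totals[area] += 1
        affected[area] += int(has_condition)
    if not totals:
        return []
    overall_rate = sum(affected.values()) / sum(totals.values())
    if overall_rate == 0:
        return []  # nobody has the condition, so nothing to flag
    return [
        area
        for area in sorted(totals)
        if totals[area] >= min_people
        and affected[area] / totals[area] >= ratio * overall_rate
    ]

# Made-up shared health records: area "A" has 6 of 10 people affected,
# while areas "B" and "C" each have 1 of 10.
records = (
    [("A", True)] * 6 + [("A", False)] * 4
    + [("B", True)] * 1 + [("B", False)] * 9
    + [("C", True)] * 1 + [("C", False)] * 9
)
print(flag_clusters(records))
```

A check this simple only works if the records are shared in the first place, which is the premise of the argument above.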

The fact that the Flint disaster occurred at the interface of the environment and human health is significant because environmental scientists are pack leaders when it comes to thinking seriously about how data can and should be shared. And so closely intertwined are the health of the environment and that of humans (and all life) that researchers, at WSU and elsewhere, have adopted the term “one health” to describe their efforts to systematically understand how the one interacts with the other.

Like many medical researchers, environmental scientists contend with nonreproducibility. That is, they observe something happen, such as phosphorus moving off farmland into the water system, and they measure and accumulate considerable data. But such an observation is not an “experiment” in the sense that it can be reproduced.

Conclusions drawn from data collected in nonreproducible contexts are often challenged, as they should be. The solution, write WSU environmental scientists Stephanie Hampton and Stephen Powers, is to make the raw data, and the software used to analyze the data, public as part of the publication process.

Hampton, the director of the Center for Environmental Research, Education, and Outreach at WSU, and researcher Powers point out that there are concerns with total transparency. If, for instance, the location of an endangered species were revealed, ecotourism might compromise that species’ environment. Or revealing the presence of a valuable resource in a fragile environment might likewise cause irreversible damage.

So, too, with medical information. A malefactor with access to sensitive genomic information might concoct phishing scams to sell snake-oil cures to vulnerable people at risk for any number of diseases.

Hampton, Powers, and May all agree that data stewardship, a field undergoing tumultuous change, needs lots of conversation and scenario modeling to answer tough questions about privacy, security, and who owns what data.


Prevention Is Priceless

Tomkowiak is adamant that patients need to be the owners of their medical data.

“Right now, our health-care system is set up so patients don’t own their data. It’s the providers, the health-care systems, sometimes the insurance companies, that own the data.” The current health-care system is designed to sell us cures, so that ownership arrangement makes a kind of sense.

But, Tomkowiak continues, “I think that’s going to change. As we move into computational medicine, the only way it’s going to work is if patients own their own data. And if they own their own data, it provides the opportunity for those patients to say, ‘I want to share my data.’

“I think the more patients who share their data, they’re going to see benefits and that could change the way we think about how and why we share data. And it may allow us to build in protections so we can do more of that.

“But until we address the fundamental issue of who owns data, I don’t think any of this other stuff will change. That’s one of the things we’ve been talking about as a medical school: how do we advocate for patients to own their own data?”

A shakeup of data ownership and of the current emphasis on selling cures could fulfill the promise of computational medicine, so we’ll be able to better prevent the transition from health to disease.

While a few American universities are busily retooling their medical school curricula to train future practitioners in computational medicine and deep prevention, WSU’s new medical school was founded on the idea that teamwork and an appreciation of complexity at both the individual and the public health level are essential to the future of medicine.

Sullivan and the other WSU medical students enrolled in the Arivale program, Hood says, “will learn what scientific wellness is and will learn how to analyze their own data in really interesting ways.”

Having big data, Tomkowiak agrees, “means we can focus way more on prevention, on scientific wellness, than we ever have before in the history of medicine. Partnering with Arivale and the Institute for Systems Biology is leading us to the future of healthcare: the question is, just how fast do we get there?”
