The Big Data Engines Begin to Hum


For the past several years, as Meaningful Use has rolled out across the US, hospitals have been operating in an environment where total revenue is decreasing and clinical informaticists are scarce. As a consequence, EHR strategic planning has resembled a game of whack-a-mole – making rushed decisions to address today’s problems, knowing full well that they’ll cause a host of new problems to pop up down the line.

Some hospitals are sticking with legacy systems that are limited in functionality but are familiar to support staff and affordable. Others are starting from scratch, bringing in new vendors with superior, but foreign, technologies. Many find themselves somewhere in the middle, making do with a patchwork of best-of-breed and legacy systems quilted together in a delicate balance that delivers little value beyond MU compliance. It’s a frustrating dynamic for all involved, but mostly for end users, especially considering the minimal impact EHRs have had on clinical or financial improvement thus far.

Enter the heavily discussed but rarely utilized field of “big data” in healthcare. To date, big data research has had no appreciable effect on patient outcomes, clinical research, or healthcare costs. However, this is no reason for big data skepticism. Outside of healthcare, researchers working in big data have proven themselves to be incredibly capable when it comes to taking large swaths of information and squeezing out valuable knowledge.

  • In 2000, when the Sloan Digital Sky Survey project began mapping the universe via a wide-angle optical telescope, researchers collected more data in the first few weeks of the program than had been collected in the history of astronomy. The images collected were pieced together, distances and relationships between the objects discovered were calculated, and as a result the first-ever large-scale structural representation of the universe was born.
  • At CERN, researchers are capturing and analyzing data from the Large Hadron Collider, which generates 600 million particle collisions per second. The data is recorded and stored in the CERN data center until it can be analyzed. Each collision is investigated as computers search for anomalies within the collisions. This process resulted in the discovery of the Higgs boson in 2012.

Until recently, there were very few examples of projects, other than the Human Genome Project, working at the intersection of big data and healthcare. It was an inherently difficult environment to operate in. Paper charts limited the range of data that researchers could target with their supercomputers. However, with the introduction of massive amounts of clinical data from EHRs, a new infrastructure and market now exist that are primed for big data researchers. Those researchers have shown up in droves, and the results are beginning to see the light of day.

Big Data in Clinical Research

Last month in Sweden, researchers published initial results from a new post-market clinical trial process that uses EHR data and big data algorithms to validate pre-market claims in the field. Researchers are using algorithms to pore through EHR data to verify that outcomes from newly approved medications or procedures match what was seen in pre-market clinical trials. The observation-based approach is far cheaper, faster, and more representative than traditional controlled clinical trials.

Researchers using this new research methodology published results in the New England Journal of Medicine last month questioning the use of thrombus aspiration, a procedure for removing clots from the coronary artery, in certain heart attack patients prior to percutaneous coronary intervention (PCI). Thrombus aspiration has been a popular treatment for certain heart attack patients since a 2008 study linked the procedure to improved outcomes, resulting in the American Heart Association and the American College of Cardiology recommending it as an appropriate intervention. The researchers found that actual outcomes in the field showed no difference in 30-day all-cause mortality rates between patients who had received thrombus aspiration and those who had received PCI alone.

Because patient records were digitized, and big data researchers had created tools to look for abnormalities in observed outcomes, this information was discovered far faster than it otherwise might have been.

Big Data in the Hospital

Intermountain has recently launched a new big data service, in conjunction with Deloitte, called PopulationMiner. The big data analytics engine is the second analytics product coming from the Intermountain/Deloitte partnership. The first, OutcomesMiner, focused on helping hospitals understand how a wide range of factors might have contributed to the outcomes of a specific patient.

PopulationMiner approaches big data from a broader perspective, leveraging the clinical data from 2.5 million patient records to help hospitals understand the cause-and-effect relationships affecting the outcomes of larger populations.

The tool is not designed to be used as a clinical decision support system. Instead, it is being marketed to hospitals as a “comparative analytics platform” that will provide health systems with benchmarking capabilities and allow them to identify areas within their organizations ripe for improvement and potential strategies to drive those improvements.

Big Data at the Bedside

As reported here earlier this month, IBM has announced two projects that it is spearheading with Cleveland Clinic that will attempt to validate the clinical decision support tools developed within the company’s now-famous supercomputer, Watson.

Watson is a pizza box-sized, artificially intelligent supercomputer created by IBM that uses natural language processing, a database of more than 200 million pages of structured and unstructured content, and sophisticated algorithms to solve complex problems. Watson’s database includes 600,000 pieces of medical evidence, 1.5 million patient records, two million pages of text from medical journals, 25,000 medical training cases, 1,500 lung cancer cases, and nearly 15,000 hours of clinician-led fine tuning of its medical decision accuracy.

Over the next three years Watson will have the opportunity to support diagnostic decision-making at the bedside while its algorithms are further refined. The goal of the program at Cleveland Clinic is to build a digital assistant that can present doctors with key information from within a patient’s medical record and propose a diagnosis based on probability and science.


Big data researchers have mapped the universe, as well as the genome, with incredible efficiency. Their math is behind the algorithms that allow researchers to run weather modeling scenarios that calculate cause and effect across a global footprint. They have even leveraged Google search data to accurately predict fluctuations in the stock market.

Over the past several years, hospitals and practices have been implementing EHR infrastructures across the nation, not without more than their fair share of pain points. Still, the byproduct of this effort is a constantly growing national asset: the data. As more hospitals cross that “go-live” threshold, more data becomes available, and in turn more evidence can be extracted, analyzed, and repurposed to help guide us on the road to cost reduction and quality improvement.
