Big Data on Patients

Big Data, capitalized for emphasis and hype, is all the rage in every industry. Healthcare is no exception. Companies and investors are betting big on the power of data. Healthcare produces a lot of data, and the volume is growing exponentially, so the industry is ripe to benefit from unifying and analyzing it.

I’ve heard a lot of descriptions of Big Data. The most generic seem to be technical. “Too massive and unstructured for standard databases” literally describes Big Data, but my favorite description came from a guy I met recently who worked in defense, built a big data product with Hadoop that got acquired, and is now focused on the financial sector. He presented the real power of Big Data to me this way: “Big Data offers data about a subject in multiple dimensions. The more dimensions you have on the subject, the more powerful the Big Data and insights are on that subject.” I would never have been able to put it so simply, but it makes Big Data easy to grasp.

For this post, I’m concerned with the patient or consumer as the subject. There are other subjects that can gain from Big Data in healthcare. Hospitals, systems, providers, and others can all gain insights from combining data and adding dimensions.

The big data source in healthcare today is the EHR. I take the Blue Button CCDA to be a pretty comprehensive (and pretty clunky) patient-centric view of an individual. CCDA is far from perfect, but it’s the best that is out there right now for patient-centric data exchange. It’s just a snapshot.

Moving forward, there is a lot of interest in FHIR as well. I think FHIR is a few years off, though some of the leading work on CCDA has already started to incorporate aspects of FHIR.

For CCDA, I look at both the CCDA Clinical and CCDA Financial standards. CCDA only includes a subset of what is in the EHR, but it contains most of the relevant patient information. You could, if you combined a lot of CCDAs, construct a historical view of a patient at specific points in time, but on its own it lacks much or all of the transactional history that is in an EHR like Epic. Since this was the assumption I started with, I’d be more than happy to learn what relevant areas of the EHR I’m missing if I only look at the CCDA.
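To make the "combine a lot of CCDAs" idea concrete, here is a minimal sketch. It assumes each CCDA has already been reduced to the date it was generated and a set of medication names; the `Snapshot` class, the `medication_timeline` function, and the sample data are all hypothetical and only meant to illustrate the approach.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical reduction of one CCDA: generation date plus medication names."""
    generated_on: date
    medications: frozenset


def medication_timeline(snapshots):
    """Return (date, added, removed) tuples for each snapshot that differs from the prior one."""
    ordered = sorted(snapshots, key=lambda s: s.generated_on)
    timeline = []
    previous = frozenset()
    for snap in ordered:
        added = snap.medications - previous
        removed = previous - snap.medications
        if added or removed:
            timeline.append((snap.generated_on, sorted(added), sorted(removed)))
        previous = snap.medications
    return timeline


# Made-up snapshots, deliberately out of order.
snapshots = [
    Snapshot(date(2013, 6, 1), frozenset({"lisinopril", "metformin"})),
    Snapshot(date(2013, 1, 15), frozenset({"metformin"})),
    Snapshot(date(2013, 9, 20), frozenset({"lisinopril"})),
]

for when, added, removed in medication_timeline(snapshots):
    print(when, "added:", added, "removed:", removed)
```

Note what this can and cannot tell you: it shows that a medication appeared or disappeared between two snapshots, not when or why the change happened. That is the transactional history the CCDA lacks.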

If you haven’t spent time on the Blue Button Plus website, I encourage you to check it out. They’ve done a very good job with it. It’s much easier to navigate than HL7’s site or any other CMS/ONC site I’ve visited. It breaks out the sections of a CCDA. In the words of the site, "The Consolidated CDA is a XML-based standard that specifies the encoding, structure, and semantics of a clinical document." There are more things you can insert into a CCDA, but Blue Button CCDA covers a good range of patient-centered data. Below are the various sections. I think of these as the clinical dimensions of a patient.

  • Patient Info (MRN, member number)
  • Provider Info
  • Allergies
  • Encounters
  • Immunizations
  • Medications
  • Care Plan
  • Discharge Medications
  • Reason for Referral
  • Problem List
  • Procedures
  • Functional and Cognitive Status
  • Results (Labs)
  • Social History
  • Vital Signs
  • Discharge Instructions
  • Claims
  • Health Financial Amounts
  • Wellness and Care Management Programs Alerts

That’s a long list of data, or dimensions, about a patient. It’s mostly historical data — upcoming appointments or procedures are not included unless they are part of the Care Plan section of the CCDA.
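Because the CCDA is XML, pulling out the sections above is not much code. The sketch below uses Python's standard `xml.etree.ElementTree` and the HL7 v3 namespace (`urn:hl7-org:v3`). The `SAMPLE` document is a heavily simplified stand-in that I made up for illustration; it is not a valid CCDA, and `list_sections` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# HL7 v3 namespace used by CDA documents.
NS = {"hl7": "urn:hl7-org:v3"}

# Simplified, made-up document: a real CCDA carries a header, templateIds,
# narrative text, and structured entries in every section.
SAMPLE = """<ClinicalDocument xmlns="urn:hl7-org:v3">
  <component><structuredBody>
    <component><section>
      <code code="48765-2" codeSystem="2.16.840.1.113883.6.1"/>
      <title>Allergies</title>
    </section></component>
    <component><section>
      <code code="10160-0" codeSystem="2.16.840.1.113883.6.1"/>
      <title>Medications</title>
    </section></component>
  </structuredBody></component>
</ClinicalDocument>"""


def list_sections(xml_text):
    """Return (section code, title) pairs for every section element in the document."""
    root = ET.fromstring(xml_text)
    sections = []
    for section in root.iter("{urn:hl7-org:v3}section"):
        code = section.find("hl7:code", NS)
        title = section.find("hl7:title", NS)
        sections.append((
            code.get("code") if code is not None else None,
            title.text if title is not None else None,
        ))
    return sections


for code, title in list_sections(SAMPLE):
    print(code, title)
```

Each pair is one dimension you have on the patient. Getting from a section to usable data means walking its entries, which is where the clunkiness shows up.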

The historical nature of the data is relevant because proactive engagement is essential. That said, there are ways to do customized, proactive alerting from some of the CCDA data set. You could use medications to do reminders and maybe refills. The problem list could be used to provide customized educational messaging. Vaccinations could be used to trigger alerts to keep people up to date. Results and vital signs could be used to infer when follow-up is needed.
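Here is a rough sketch of what two of those alerts could look like once the data is out of the CCDA. Everything in it is an assumption on my part: the dictionary fields (`last_filled`, `days_supply`, `given_on`), the function names, and especially the repeat intervals, which are illustrative placeholders and not clinical guidance.

```python
from datetime import date, timedelta


def refill_reminders(medications, today, lead_days=7):
    """Return names of medications whose supply runs out within lead_days of today."""
    due = []
    for med in medications:
        runs_out = med["last_filled"] + timedelta(days=med["days_supply"])
        if runs_out - today <= timedelta(days=lead_days):
            due.append(med["name"])
    return due


def overdue_immunizations(immunizations, today, intervals):
    """Return vaccines whose most recent dose is older than the assumed interval."""
    latest = {}
    for imm in immunizations:
        name = imm["vaccine"]
        if name not in latest or imm["given_on"] > latest[name]:
            latest[name] = imm["given_on"]
    return sorted(
        name for name, given in latest.items()
        if name in intervals and today - given > intervals[name]
    )


# Made-up patient data.
today = date(2013, 8, 27)
meds = [
    {"name": "metformin", "last_filled": date(2013, 8, 1), "days_supply": 30},
    {"name": "lisinopril", "last_filled": date(2013, 8, 20), "days_supply": 90},
]
imms = [
    {"vaccine": "influenza", "given_on": date(2012, 8, 1)},
    {"vaccine": "tetanus", "given_on": date(2008, 5, 1)},
]
# Illustrative intervals only; real schedules should come from clinical guidelines.
intervals = {"influenza": timedelta(days=365), "tetanus": timedelta(days=3650)}

print("Refill soon:", refill_reminders(meds, today))
print("Overdue:", overdue_immunizations(imms, today, intervals))
```

The logic is trivial. The hard part is trusting the inputs, since a refill filled at another pharmacy or a flu shot given at work will not show up in a single provider's CCDA.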

Much of this value is predicated on the CCDA being a complete picture of a patient. Since it’s a snapshot in time from one provider or payer, it’s unlikely to be a complete record. It’s also important to realize that many of the limitations with the Blue Button CCDA are a result of its purpose — a tool for patients to access their own health information.

Looking outside of Blue Button CCDA, which again I use as a proxy for current healthcare data collection, other data sources are relevant as dimensions representing an individual:

  • Activity. This is a big and obvious one. I think everybody is betting on steps per day, miles run, and sleep from devices like Fitbit, Jawbone, Nike, and Misfit Shine. This is extremely powerful data if (a) you can get people who aren’t buying these devices today to wear them; (b) you can give meaningful feedback, maybe with games and social features. Some of this exists today, but it won’t be a “one size fits all” solution; (c) you can incrementally create sustained behavior change.
  • Patient-reported clinical data. There’s overlap here with clinical data. Home-recorded weights from Withings scales, blood pressure readings from iHealth, glucose readings from Telcare, and a host of others. There is a need to figure out how to differentiate patient-reported from clinically entered data, as I think it’s a distinction that matters, if for no other reason than liability. It’s probably important to distinguish between patient-entered and device-generated data as well, even if the device is patient owned. I’ve been told that EHR vendors will never open up some of the traditional protected fields they use to store this type of clinical data, so EHRs will have to create new fields if that is the case.
  • Mobile usage. This is a fascinating area to me, one that is tackling. The ways in which people use phones and the deviations from those patterns are potentially powerful dimensions about people.
  • Internet usage. The types of health-related searches and questions people ask on platforms like Healthtap, WebMD, and Twitter are extremely relevant.
  • Genetics. Another huge data source I mentioned in my last post. I’m far from an expert, but this data seems too complex to truly integrate into an existing EHR. Anything more than a lightweight HL7 interface would be a big challenge, and I’m not sure how much value you get from an interface without true integration.
  • Social. Factors such as location, food insecurity, housing concerns, relevant community anchors, and a bunch of other factors I’m missing have a large impact on health and treatment compliance. I’m not sure how you model these social factors, but they are all relevant dimensions.
  • Financial. This area ties pretty closely to social factors, but is worth calling out because it does play a role in decisions about healthcare and treatment.
  • Education. Level of education more specifically. This is another significant factor relating to health and wellness of individuals and families.
  • Family history. I’m not sure this is captured in a CCDA, at least in the currently proposed Blue Button version. This is extremely relevant for risk factors, segmentation, and targeted interventions.
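On the patient-reported point above, one way to keep the distinction between clinician-entered, patient-entered, and device-generated data is to attach the source to every reading when it is captured. This is just a sketch of the idea; the `Source` values, the `Observation` fields, and `by_source` are names I made up, not fields from any EHR or standard.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class Source(Enum):
    """Hypothetical provenance labels for a single reading."""
    CLINICIAN_ENTERED = "clinician_entered"
    PATIENT_ENTERED = "patient_entered"
    DEVICE_GENERATED = "device_generated"


@dataclass(frozen=True)
class Observation:
    kind: str      # e.g. "weight", "systolic_bp", "glucose"
    value: float
    unit: str
    source: Source


def by_source(observations):
    """Group observations by where they came from, preserving input order."""
    grouped = defaultdict(list)
    for obs in observations:
        grouped[obs.source].append(obs)
    return dict(grouped)


# Made-up readings.
readings = [
    Observation("weight", 82.5, "kg", Source.DEVICE_GENERATED),
    Observation("weight", 83.0, "kg", Source.CLINICIAN_ENTERED),
    Observation("glucose", 110.0, "mg/dL", Source.PATIENT_ENTERED),
    Observation("systolic_bp", 128.0, "mmHg", Source.DEVICE_GENERATED),
]

for source, group in by_source(readings).items():
    print(source.value, [(o.kind, o.value) for o in group])
```

With the source carried alongside the value, a downstream system can decide for itself how much weight to give a home scale versus a clinic scale, instead of the distinction being lost at the point of entry.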

Obviously, for privacy reasons, I don’t think all of these areas will ever be accessible. Still, it makes sense to me as a starting point to outline all the different dimensions and then narrow down from there. All of these dimensions add potentially powerful context to the overall view of a patient. One interesting point to note is the amount of relevant data and the number of different owners that exist for that disparate data.

Dimensions of data that help to segment and create targeted interventions are valuable for health and wellness. We take a very narrow view of healthcare data, as we do with healthcare generally. The tools in place, such as EHRs, fit that same narrow view, which makes sense as they were built for it and built to bill for the encounters that generated that data. That narrow view is not going to drive down costs or engage patients in their health and wellness.


Travis Good is an MD/MBA and co-founder of Catalyze.
