Problem Solved: How we built a healthcare data platform that produces accurate healthcare insights

by Vinod Subramanian

2022/11 | 6-minute read

It goes without saying that healthcare data is incredibly complex. The data you need to understand any given patient’s journey might exist in dozens of places, from labs to physician reports to mortality records. It arrives in structured, semi-structured and unstructured forms (hand-written notes, for example), and across healthcare there is a great deal of inconsistency in how everything from lines of therapy to medication names is recorded (did you know aspirin could be documented 3+ different ways?).

Most data platforms are simply not designed to handle this level of complexity, much less the HIPAA compliance and privacy issues layered on top of every healthcare interaction. 

In this three-part blog series, I’ll discuss the critical questions we needed to understand and answer as we built Syapse Raydar, our healthcare data platform designed to produce accurate healthcare insights that help life science companies, health systems and researchers improve patient outcomes.

What’s the difference between a data platform and a healthcare data platform, and what makes the latter relevant for real-world data? 

All data platforms provide the necessary infrastructure to extract and transform vast amounts of information. And in a world where the sheer quantity of data is growing exponentially, these platforms are having a tremendous impact on how we live, work and interact with one another.

But healthcare data, and real-world data (RWD) in particular, is defined by its own unique challenges and goals, including the advancement of patient care and the preservation of human life. To that end, a healthcare data platform should focus on delivering data that is highly accurate, immediate and relevant to those who need it, when they need it: oncologists, clinical trial coordinators, life science researchers, and so on.

Because data in a healthcare environment is often derived from multiple sources, few of which adhere to the same standards, architecture or data definitions, interoperability is a critical component of any healthcare data platform. The ability to ingest and aggregate data from any source, in any format, helps reduce the workload on IT resources while enabling health systems and life science companies to make use of all the data at their disposal to derive actionable insights and improve outcomes. This includes structured data but also semi-structured and unstructured data, such as hand-written physician notes and lab reports, which can be used at scale only with advanced machine learning (ML) and natural language processing (NLP) tools.
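To make the idea of ingesting "any source, in any format" a little more concrete, here is a minimal Python sketch. It is not how Syapse Raydar works internally; the reader functions, the `READERS` registry, the envelope fields (`source`, `format`, `payload`) and the sample lab data are all hypothetical names chosen for illustration. The point is only that each format gets its own small reader, and every reader emits the same envelope so downstream steps do not need to care where a record came from.

```python
import csv
import io
import json
from typing import Callable, Dict, Iterator, List


def read_csv(text: str, source: str) -> Iterator[Dict]:
    """Structured input: one envelope per CSV row."""
    for row in csv.DictReader(io.StringIO(text)):
        yield {"source": source, "format": "csv", "payload": dict(row)}


def read_json_lines(text: str, source: str) -> Iterator[Dict]:
    """Semi-structured input: one JSON document per non-empty line."""
    for line in text.splitlines():
        if line.strip():
            yield {"source": source, "format": "jsonl", "payload": json.loads(line)}


def read_free_text(text: str, source: str) -> Iterator[Dict]:
    """Unstructured input: keep the text whole for later NLP processing."""
    yield {"source": source, "format": "txt", "payload": {"text": text}}


# Hypothetical registry mapping a format label to its reader.
READERS: Dict[str, Callable[[str, str], Iterator[Dict]]] = {
    "csv": read_csv,
    "jsonl": read_json_lines,
    "txt": read_free_text,
}


def ingest(text: str, fmt: str, source: str) -> List[Dict]:
    """Route raw text to the reader for its format and collect the envelopes."""
    if fmt not in READERS:
        raise ValueError(f"no reader registered for format {fmt!r}")
    return list(READERS[fmt](text, source))


if __name__ == "__main__":
    records = (
        ingest("patient_id,test\n1,CBC\n", "csv", "lab_a")
        + ingest('{"patient_id": "1", "drug": "ASA"}\n', "jsonl", "clinic_b")
        + ingest("Pt reports mild headache.", "txt", "physician_note")
    )
    for record in records:
        print(record)
```

Supporting a new source then means registering one more reader rather than changing everything downstream.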

Now, let’s talk about aligning vastly different data definitions. A dedicated healthcare data platform needs to include both standard and customized ontologies and terminologies to handle complex and often inconsistent real-world data (John P. Doe vs. Doe, John P, for instance), while accelerating its transformation, standardization and normalization on demand, without reprocessing or requiring additional storage. When a typical patient journey might include visits to a number of labs, physicians, testing facilities and so forth, handling such discrepancies is an absolute necessity in the real-world data space.
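As a rough illustration of what normalization means in practice, the Python sketch below reconciles the two name spellings mentioned above and maps a few common ways of writing aspirin to one canonical term. The hand-written `MEDICATION_SYNONYMS` table and both function names are assumptions made for this example; a real platform would rely on maintained ontologies and terminologies rather than a small dictionary.

```python
import re

# Illustrative synonym table only (an assumption for this sketch); a production
# system would look terms up in a maintained terminology instead.
MEDICATION_SYNONYMS = {
    "aspirin": "aspirin",
    "asa": "aspirin",
    "acetylsalicylic acid": "aspirin",
}


def normalize_medication(raw: str) -> str:
    """Map a free-text medication name to a canonical term when one is known."""
    key = re.sub(r"\s+", " ", raw.strip().lower())
    # Unknown terms fall through unchanged (lowercased) so nothing is lost.
    return MEDICATION_SYNONYMS.get(key, key)


def normalize_name(raw: str) -> str:
    """Return a name as 'last, first middle', lowercased and without periods."""
    cleaned = raw.replace(".", "").strip()
    if not cleaned:
        return ""
    if "," in cleaned:
        # Already in "Last, First Middle" order.
        last, rest = (part.strip() for part in cleaned.split(",", 1))
    else:
        # Assume "First Middle Last" order; the final token is the surname.
        parts = cleaned.split()
        last, rest = parts[-1], " ".join(parts[:-1])
    rest = " ".join(rest.split())
    return f"{last}, {rest}".lower()


if __name__ == "__main__":
    print(normalize_name("John P. Doe") == normalize_name("Doe, John P"))
    print({normalize_medication(m) for m in ["Aspirin", "ASA", "acetylsalicylic acid"]})
```

Rules like these are deliberately naive (multi-word surnames would need more care), but they show why the mapping has to live in the platform: every downstream question depends on both spellings resolving to the same patient and the same drug.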

With regard to compliance and patient privacy, healthcare is similarly unique in how data is treated and disseminated. A platform designed for healthcare must treat HIPAA and other privacy standards not as an afterthought but as integral components of the technology matrix. Information security alone is not nearly enough.

Finally, any healthcare data platform must be in alignment with and facilitate FAIR data principles, which are about making data Findable, Accessible, Interoperable and Reusable. 

In summary, a healthcare data platform needs to offer more than the infrastructure to extract, transform and load data. It must be designed with a focus on advancing patient care; offer robust interoperability; incorporate ML and NLP at scale; standardize and normalize inconsistent real-world data; and contain built-in (not bolted-on) compliance and patient privacy safeguards.

For health systems and life science companies, the use cases are infinitely diverse. How does a healthcare data platform adapt to that?

Traditional platforms are basically data repositories and infrastructure. As I touched upon in the previous section, they are designed to extract, transform and load data: the “ETL” model.

In real-life care, the variables in how data is used to answer questions and optimize outcomes are virtually infinite. There’s simply no way to transform data in advance to satisfy every conceivable use case. So we need to imagine a new model for healthcare data, where information is extracted, loaded and then transformed on demand. This is the “ELT” approach, which healthcare data platforms need in order to adapt to any use case in any healthcare setting.

ELT can revolutionize how life science companies and health systems access and use data to advance patient care, because the original source material is always accessible in its native state. So if a researcher or physician needs to reevaluate an assumption, ask new questions or establish a different set of benchmarks, it’s relatively easy to go back to the raw data and transform it based on whatever criteria are established. 
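The difference from ETL can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a description of any real product: `RawStore`, `as_medication_list`, `flag_high_dose`, the record fields and the 300 mg threshold are all hypothetical. What it shows is the ordering: records are loaded untouched, and each question brings its own transformation at query time.

```python
from typing import Callable, Dict, Iterable, List

Transform = Callable[[Dict], Dict]


class RawStore:
    """Toy ELT store: load records as-is, transform only when queried."""

    def __init__(self) -> None:
        self._records: List[Dict] = []

    def load(self, records: Iterable[Dict]) -> None:
        # "L" comes before "T": copy each record in without reshaping it.
        self._records.extend(dict(r) for r in records)

    def query(self, transform: Transform) -> List[Dict]:
        # Transform copies, so the raw records stay in their native state.
        return [transform(dict(r)) for r in self._records]

    def raw(self) -> List[Dict]:
        return [dict(r) for r in self._records]


def as_medication_list(record: Dict) -> Dict:
    """First question: which drug did each patient receive?"""
    return {"patient": record["patient_id"], "drug": record["drug"].lower()}


def flag_high_dose(record: Dict) -> Dict:
    """A later question on the same raw data (threshold is made up)."""
    return {"patient": record["patient_id"], "high_dose": record["dose_mg"] > 300}


if __name__ == "__main__":
    store = RawStore()
    store.load([
        {"patient_id": "p1", "drug": "Aspirin", "dose_mg": 81},
        {"patient_id": "p2", "drug": "ASPIRIN", "dose_mg": 325},
    ])
    print(store.query(as_medication_list))
    print(store.query(flag_high_dose))  # new question, no reload needed
```

In an ETL pipeline, the second question would have required going back to the source systems if the dose had been dropped during the original transformation. Here it only requires writing a new function.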

For a moment, let’s consider how this works for clinical trials. Actionable insights are derived from on-demand access to high-quality data. It’s important to identify the right candidates for trials at the right time in their patient journeys. Patient matching, data recency, reliable mortality composite scores and a host of other factors lead to clinical trial results that can be trusted and acted upon. This can only happen with a data platform engineered for the unique needs of healthcare. 

It really goes back to the concept of intent—what do we ultimately want data to help us achieve? In healthcare, the intent is preserving human life and easing the fear and burden of serious disease. A data platform that can help us answer questions others cannot, and also formulate new questions, is a powerful tool in the realization of that intent. 

Join me for part two of this series, where I’ll dive deeper into making sense of all the disparate data through interoperability and knowledge management services, along with how AI is playing a central role in the future of care innovation and delivery.