Predictive Alerting: Leveraging Novel AI Approaches for the Early Detection of Patients With Rare Disease
Authors: Jeremy Watkins, MS, Data Scientist; Ben Cerio, PhD, Staff Data Scientist; Wissam Siblini, PhD, Senior Machine Learning Engineer; Marc Carmichael, PhD, Clinical Advisor; Vijay Divi, PhD, Head of Data Science.
Komodo Health engineers are using machine learning to accurately identify rare disease patients earlier, potentially preventing adverse outcomes.
We often say that identifying a rare disease is like finding a needle in a haystack. This analogy has gained new meaning over the past decade, as the volume of healthcare data has grown exponentially, opening the door for new approaches to detect rare diseases, but also creating more noise to filter out.
Patients with a rare disease frequently go long periods of time, sometimes years, before they are correctly diagnosed. This is often due to the variable presentation of these diseases, which makes it challenging to identify patients through traditional methods. To accurately identify them at the appropriate time, the methods we use must be as varied as the presentation of the diseases themselves. One size does not fit all when disease characteristics, treatment patterns, and patient journeys are so diverse. This is why we’re developing tools that tackle the health data challenges of rare disease from all angles. At Komodo, we build alerting solutions for our customers using both prescriptive and predictive machine learning approaches.
More tools means more insights: predictive and prescriptive approaches
While prescriptive alerts identify patients using a set of concrete rules related to patient diagnoses, clinical procedures, and treatments, predictive alerts follow a data-driven approach that leverages a higher volume of contextual events from a given patient’s history. To do this, a machine learning model “learns” the relevant clinical signals that, due to the inherent variability of patient journeys, can be weak in isolation but highly predictive when combined. The model generates a probability, or score, indicating whether a patient should be flagged.
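To make the contrast concrete, here is a minimal Python sketch of the two alert styles. Everything in it is hypothetical and chosen only for illustration: the `PatientHistory` fields, the signal names, the weights, the bias, and the 0.5 threshold are assumptions, not values from our production models. A logistic function stands in for whatever model is actually trained.

```python
import math
from dataclasses import dataclass, field


@dataclass
class PatientHistory:
    # Hypothetical fields; real patient journeys contain far more events.
    has_diagnosis_code: bool = False
    had_procedure: bool = False
    signals: dict = field(default_factory=dict)  # weak contextual signals (0.0 or 1.0)


def prescriptive_alert(patient: PatientHistory) -> bool:
    """Rule-based alert: fires only when every concrete rule is met."""
    return patient.has_diagnosis_code and patient.had_procedure


# Illustrative weights only; a trained model would learn these from data.
WEIGHTS = {"signal_a": 1.2, "signal_b": 0.9, "signal_c": 1.1}
BIAS = -2.5


def predictive_score(patient: PatientHistory) -> float:
    """Combine weak signals into a probability-like score via a logistic function."""
    z = BIAS + sum(w * patient.signals.get(name, 0.0) for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))


def predictive_alert(patient: PatientHistory, threshold: float = 0.5) -> bool:
    """Flag the patient when the score reaches the (assumed) threshold."""
    return predictive_score(patient) >= threshold


# One weak signal alone is not enough to flag the patient...
one_signal = PatientHistory(signals={"signal_a": 1.0})
# ...but several weak signals together push the score over the threshold.
all_signals = PatientHistory(signals={"signal_a": 1.0, "signal_b": 1.0, "signal_c": 1.0})

for label, patient in [("one signal", one_signal), ("all signals", all_signals)]:
    print(label, prescriptive_alert(patient), predictive_alert(patient))
```

In this toy setup neither patient satisfies the prescriptive rule, because neither has the required diagnosis code and procedure. The patient with all three contextual signals is still flagged by the predictive score, which is the behavior described above: signals that are weak in isolation become predictive when combined.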
Both alert types, prescriptive and predictive, can be configured to find undiagnosed or misdiagnosed rare disease patients and the healthcare providers who serve them. However, for diseases with diverse symptomatology or those that are difficult to diagnose, we have found that predictive alerts can often capture more patients at the specific points in their journey where intervention is needed. Moreover, in these cases, predictive alerts include fewer patients without the target disease, reducing the number of “noisy” alerts that can be costly to our customers.
Building a predictive model
So how do we create a predictive alert? We first craft a rare disease patient profile by combining in-depth patient journey data from our Healthcare Map™ with specialized clinical expertise in the therapeutic area. Clinical encounter cues, ICD and CPT codes, and existing patient information are all translated and incorporated into the profile. The profile is then used to train one or more AI/ML algorithms of varying methodology. Models also incorporate a combination of relevant clinical inputs and correlational insights such as timing of treatments and specialties of relevant physicians, along with supplemental lab and demographic patient data. To ensure that the patterns learned by the model generalize to new patient populations, we validate the model using various methods popular in machine learning like testing with hold-out populations and k-fold cross-validation. When we're satisfied with the performance of the model, we have our clinical experts perform a quality control review to check its predictions, ensuring that they align with the clinical expectations.
Hypophosphatasia: a predictive alerting example
We can demonstrate the benefits of a predictive approach by looking at a rare disease case study.
Hypophosphatasia (HPP) is a rare genetic disease that causes defective mineralization in the body, resulting in bones and teeth that are weak and prone to fracturing. It can be mild for some patients and fatal for others. HPP has been historically difficult for clinicians to diagnose, not only because of how uncommon it is — it affects about 1 in 100,000 people — but also because of its variable presentation. Because there’s not a dedicated ICD code for HPP, it’s also challenging to identify newly diagnosed HPP patients in healthcare data. Moreover, the key lab test result that flags it (a test result of low alkaline phosphatase) is shared with many more patients with other conditions. These limitations make it a good candidate for a predictive approach.
The stakes are high with HPP: Delays due to missed diagnosis and misdiagnosis often lead to complications, and early identification can save lives.
HPP: our approach
We wanted to capture both undiagnosed patients with a high likelihood of HPP and those with HPP who may benefit from additional treatment. Considering the ICD-10 code limitation and off-label usage of treatments, we used a pharmaceutical registry of known HPP patients to build profiles. Pre-diagnosis data from patient journeys was then incorporated for downstream analysis and to create patterns for the model to learn. We also incorporated demographic markers like age and gender, and lab data in the form of alkaline phosphatase (ALP) test results, which can be an indicator of HPP, but also of other conditions. We then trained the model to differentiate true positive HPP patients from our defined negative cohort.
Our predictive model captured 53 patients with undiagnosed HPP who might otherwise have remained undiagnosed and untreated. We also ran a prescriptive approach on this case, and it captured only 18 true positives over the same time period. In other words, the predictive approach flagged nearly three times as many true HPP patients (who are likely to see improved clinical outcomes) as the prescriptive approach would have. Additionally, the predictive approach had a precision rate about 120 times higher than the prescriptive approach. Precision is the ratio of true positives to all flagged patients (true positives plus false positives), i.e., how efficient the model is at accurately identifying patients with HPP.
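The arithmetic behind these figures can be sketched in a few lines of Python. This is an illustration under stated assumptions, not our evaluation code: the `precision` helper and the constant names are our own for this example, the precision ratio is treated as exactly 120 even though it is only approximate, and the false-positive counts are not reported here, so the only quantity derived is the implied ratio of total alerts between the two approaches.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Precision = TP / (TP + FP), the share of flagged patients who truly have the disease."""
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("precision is undefined when no patients are flagged")
    return true_positives / flagged


# Figures reported in the text.
TP_PREDICTIVE = 53
TP_PRESCRIPTIVE = 18
PRECISION_RATIO = 120  # "about 120 times higher"; treated as exact for this sketch

# How many more true positives the predictive approach captured.
tp_ratio = TP_PREDICTIVE / TP_PRESCRIPTIVE

# precision_pred / precision_presc = (TP_pred / N_pred) / (TP_presc / N_presc),
# where N is the total number of flagged patients. Rearranging gives:
#   N_presc / N_pred = PRECISION_RATIO / tp_ratio
implied_alert_volume_ratio = PRECISION_RATIO / tp_ratio

print(f"true-positive ratio: {tp_ratio:.2f}")
print(f"implied prescriptive-to-predictive alert volume: {implied_alert_volume_ratio:.1f}")
```

Read this way, the two reported numbers together imply that the prescriptive approach would have raised roughly 40 times as many alerts as the predictive approach while finding about a third as many true HPP patients, which is the "noise" cost described earlier.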
As engineers and scientists, we use whichever approach can reliably offer the most accurate and high-yielding results for our clients and for patients. Sometimes the predictive approach is the one that will identify more candidates for a clinical trial, capture more misdiagnosed patients in need of treatment, and set more patients up to receive care earlier than a prescriptive approach would. With HPP, the predictive approach could allow Life Sciences teams to better identify undiagnosed patient populations and their providers, increasing the opportunities for intervention and optimizing future engagements. Ultimately, the innovative approaches we are building to derive new insights from millions of health records and patient journeys will help improve outcomes for patients with a rare disease and for their families.
For more on our approaches to analysis for rare diseases, check out “Insights Without a Code: How New Approaches Can Raise Visibility Into Ultra-Rare Diseases” or read about our work with 50+ rare disease advocacy groups through our partnership with the Chan Zuckerberg Initiative.