Analyzing Data in Komodo’s Healthcare Map: Summary of Standard Methodology

Komodo Health delivers the most in-depth insights to healthcare stakeholders based on high-fidelity real-world data. This summary of standard methodology details the processes that are typically applied to data-driven insights that Komodo Health reports in our blogs, white papers, and the media. 

Data Source: The Komodo Healthcare Map

Data is derived from Komodo’s Healthcare Map™ and includes:

  • Healthcare encounters for over 330 million medically insured patients, including those on Medicare, Medicaid, or commercial insurance  
  • Encounters occurring between 2015 and present
  • Patient tokenization using Datavant to protect patient privacy, reduce risk, and ensure HIPAA compliance on any analysis
  • Data from both providers and payers in both open and closed claims
  • Longitudinal views of entire patient journeys 
  • Data on care delivered in network, out of network, and through specialists, urgent care centers, and retail clinics
  • In some cases, specialized data from Komodo’s MapEnhance™ solution for added nuance

Duplicate claims flagged across sources are resolved in the Healthcare Map, and multiple claims associated with a single provider-patient encounter (e.g., a provider billing separately for two different procedures on the same day) are consolidated. Each encounter represents a single billable encounter between a patient and a healthcare provider or healthcare organization, derived from medical claims.
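As a rough illustration of the consolidation described above, the following Python sketch drops duplicate claims and then groups the remainder into encounters. The field names, the duplicate key (claim ID plus code), and the encounter key (patient, provider, service date) are assumptions made for this example and do not represent Komodo's actual schema or matching logic.

```python
from collections import defaultdict

# Hypothetical claim records; field names are illustrative only.
claims = [
    {"claim_id": "A1", "patient": "p1", "provider": "dr1", "date": "2023-03-01", "code": "99213"},
    # Same claim received again from a second source (duplicate).
    {"claim_id": "A1", "patient": "p1", "provider": "dr1", "date": "2023-03-01", "code": "99213"},
    # Separate claim for a second procedure on the same day.
    {"claim_id": "A2", "patient": "p1", "provider": "dr1", "date": "2023-03-01", "code": "36415"},
    {"claim_id": "B7", "patient": "p2", "provider": "dr1", "date": "2023-03-01", "code": "99213"},
]


def consolidate(claims):
    """Drop duplicate claims, then group by (patient, provider, date)."""
    seen = set()
    encounters = defaultdict(set)
    for claim in claims:
        # Assumed duplicate key: same claim ID and same code.
        dedup_key = (claim["claim_id"], claim["code"])
        if dedup_key in seen:
            continue
        seen.add(dedup_key)
        # Assumed encounter key: one patient, one provider, one day.
        encounter_key = (claim["patient"], claim["provider"], claim["date"])
        encounters[encounter_key].add(claim["code"])
    return {key: sorted(codes) for key, codes in encounters.items()}


encounters = consolidate(claims)
```

In this toy data, the four input rows collapse into two encounters, one of which carries both procedure codes.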

Uninsured populations are not included in the Healthcare Map, nor are patients seen at practices that do not accept insurance, such as certain mental health providers.
Komodo’s Healthcare Map does not include patients who are part of the Kaiser Permanente health insurance network in California.

Data Recency
Payer-complete or “closed” sources are typically available one month after a claim is made.
For open claims sources, the data lag can range from a day to a week.

Defining Patient Cohorts in Therapeutic Areas
Procedures, treatments, diagnoses, and symptoms are defined using standardized transaction codesets specified under HIPAA:

  • ICD-9/ICD-10
  • ICD-10-PCS
  • CPT
  • NDC

For therapeutic areas with a designated ICD code: Komodo’s data has broad coverage across rare, medium, and large therapeutic areas. 

For therapeutic areas without a designated ICD code: Proxy cohorts can be built using relevant codes and clinical inputs. 

Therapeutic areas (TAs) with high screening rates may produce patient cohorts that are larger than epidemiological estimates, because a screening claim can be misinterpreted as a diagnosis. This can be adjusted for by requiring each patient to have multiple relevant claims within a given time frame, which helps confirm the diagnosis.
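The multiple-claims requirement can be sketched as follows. This is a minimal illustration, not Komodo's implementation: the function name, the default of two claims within 365 days, and the choice to count distinct service dates are all assumptions chosen for the example.

```python
from datetime import date, timedelta


def confirmed_patients(claims, min_claims=2, window_days=365):
    """Return patients with at least min_claims relevant claims on
    distinct service dates that fall within window_days of each other.

    claims: iterable of (patient_id, service_date) pairs, already
    filtered to the relevant diagnosis codes.
    """
    dates_by_patient = {}
    for patient_id, service_date in claims:
        dates_by_patient.setdefault(patient_id, set()).add(service_date)

    window = timedelta(days=window_days)
    confirmed = set()
    for patient_id, dates in dates_by_patient.items():
        ordered = sorted(dates)
        # Check every run of min_claims consecutive dates.
        for i in range(len(ordered) - min_claims + 1):
            if ordered[i + min_claims - 1] - ordered[i] <= window:
                confirmed.add(patient_id)
                break
    return confirmed


# Hypothetical example data.
example_claims = [
    ("p1", date(2022, 1, 10)),
    ("p1", date(2022, 6, 1)),   # second claim within a year: confirmed
    ("p2", date(2022, 3, 5)),   # single claim, possibly a screening
    ("p3", date(2020, 1, 1)),
    ("p3", date(2022, 1, 1)),   # two claims, but two years apart
]
cohort = confirmed_patients(example_claims)
```

Here only `p1` remains in the cohort. Tightening or loosening `min_claims` and `window_days` trades cohort size against confidence in the diagnosis.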

For analyses of subgroups, patients may be stratified by the following demographic categories: 

Sex and Gender

  • The gender of the patient as recorded on the enrollment record or at the date of admission, ambulatory service, or start of care
  • Current categories include: Male, Female, and Unknown

Patient age is defined using index dates. 
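For illustration, age at an index date can be computed as completed years between a birth date and the index date. The sketch below assumes that a full birth date is available and that the index date has already been chosen (for example, the date of a patient's first qualifying claim). Both are assumptions for this example and not details stated in this summary.

```python
from datetime import date


def age_at(index_date, birth_date):
    """Completed years of age on index_date."""
    years = index_date.year - birth_date.year
    # Subtract one year if the birthday has not yet occurred in the index year.
    if (index_date.month, index_date.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


# Hypothetical patient born 15 June 1980.
age_before_birthday = age_at(date(2023, 3, 1), date(1980, 6, 15))
age_on_birthday = age_at(date(2023, 6, 15), date(1980, 6, 15))
```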

Location is available at the ZIP3 level (the first three digits of the ZIP Code) for all U.S. states and is determined by the patient’s recorded address on a claim.

  • Methodological considerations:
    • When calculating state-based rates, we use population size as defined by the U.S. Census.
    • State rates are typically calculated per 100,000.
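A rate per 100,000 is the patient count divided by the population, multiplied by 100,000. The short Python sketch below shows the arithmetic. The state labels, counts, and population figures are placeholders, not Komodo or Census data.

```python
def rate_per_100k(count, population):
    """Patients per 100,000 population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count * 100_000 / population


# Placeholder values for two hypothetical states.
state_counts = {"State A": 1_250, "State B": 300}
state_population = {"State A": 5_000_000, "State B": 600_000}

state_rates = {
    state: rate_per_100k(count, state_population[state])
    for state, count in state_counts.items()
}
```

In this example, State B has fewer patients than State A but twice the rate (50 versus 25 per 100,000), which is why rates rather than raw counts are used to compare states.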

Race & Ethnicity (R&E)

  • R&E assignment is made primarily through information that is self-reported by the patient.
  • R&E data is available for approximately 65% of unique patients included in the Healthcare Map. 
  • Komodo’s R&E categories match those of the U.S. Census: 
    • Race categories: White, Black or African American, Asian or Pacific Islander, Other
    • Ethnicity categories: Hispanic or Latino

Methodological considerations:

  • Patients identifying as both White and Hispanic or Latino are included exclusively in the latter category in analyses that include R&E to minimize double counts. This is done by using a “White-Only” selector in the White racial category.
  • Certain TAs have more R&E visibility than others. To ensure that insights reflect the population and are generalizable, consider the epidemiological distribution of a TA across R&E groups as defined in the medical literature.
  • R&E rates are typically calculated per 100,000 and adjusted using R&E population data from the U.S. Census. 
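The considerations above can be sketched in Python as follows. This is an illustrative example under stated assumptions: the function name, patient records, and population denominators are placeholders. The summary specifies the handling only for patients who are both White and Hispanic or Latino, so leaving other combinations in their race category is an assumption of this sketch.

```python
from collections import Counter


def analysis_category(race, hispanic):
    """Assign one category per patient for R&E analyses.

    White patients who are also Hispanic or Latino are counted only as
    Hispanic or Latino (the "White-Only" selector). Other combinations
    are left in their race category, which is an assumption here.
    """
    if race == "White" and hispanic:
        return "Hispanic or Latino"
    return race


# Hypothetical patients as (race, is_hispanic_or_latino).
patients = [
    ("White", False),
    ("White", True),
    ("White", True),
    ("Black or African American", False),
]

counts = Counter(analysis_category(race, hispanic) for race, hispanic in patients)

# Placeholder population denominators, not actual Census figures.
census_population = {
    "White": 200_000,
    "Hispanic or Latino": 50_000,
    "Black or African American": 40_000,
}

re_rates = {
    group: counts[group] * 100_000 / census_population[group]
    for group in counts
}
```

Each patient contributes to exactly one category, so the category counts sum to the number of patients and no one is double counted.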

Media Policy
Komodo routinely partners with major media outlets on bespoke and derivative analyses to explore trends in current healthcare practices, access to care, healthcare disparities, and more. It is Komodo’s policy to share analyses and findings derived from Komodo’s Healthcare Map, not patient- or source-level data, with external sources. If you are a journalist interested in collaborating with Komodo Health on a story that is intended to shed objective light on patient needs using high-fidelity real-world data, please contact