Four Questions to Assess the Data That Will Drive Your Next Big Decision


Data integrity is foundational to the work of health-related enterprises as they seek to develop new treatment options, close gaps in care, and ultimately, improve health outcomes. It’s the insights gleaned from data that guide the investment of billions of dollars in research every year, and by extension, drive ensuing business decisions. Thus, the success of pharmaceutical and medical device launches — as well as population health, value-based care, and patient navigation initiatives — is largely determined by the data these organizations are using.

Unfortunately, that data is often flawed.

The Challenge of Obtaining Reliable Data

The healthcare technology industry has made tremendous progress in capturing data for myriad uses, such as assessing disease incidence, identifying treatment patterns, and boosting clinical trial patient recruitment. Still, there are significant weaknesses in how reliable datasets are created. Four of the most common pitfalls of legacy data aggregators are:

  • Bias – If data doesn’t represent the entire population to be studied, there is an inherent risk that a certain segment of society will be overrepresented and the data will over-index on specific characteristics. For example, if data sources exclude certain age groups, geographic locations, ethnicities, or income levels, the analytical insights will be skewed, which can in turn affect study outcomes.
  • Incomplete views – The wealth of information contained in medical claims, together with their longitudinal nature, makes this data source both rich in insights and helpful in piecing together patient journeys. But sourcing claims from a limited number of payers, such as through a single claims processor, is a faulty approach that doesn’t produce a comprehensive, reliable dataset. Additionally, relying on any single data source, whether closed medical claims, EMRs, or pharmacy billing systems, impedes the ability to develop an accurate, informed view of the target population.
  • Stagnation – Not only are the number and types of data sources continually expanding, but the data within each source can also evolve daily. Access to real-world, real-time data via open claims and non-traditional sources is essential for ensuring relevance.
  • Redundancy – Because an individual’s insurance coverage often changes multiple times within a few years, and care may be sought from various providers in varied venues, the same person can appear in a dataset under several records. If an organization generates its own de-identified tokens, the number of individuals purported to be included in the dataset may be inaccurate, since those records can be counted as separate people.
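
The redundancy pitfall can be illustrated with a small sketch. It assumes each claim carries both a token generated separately by each source and a shared token that follows the person across sources. The records, token values, and payer names below are hypothetical and exist only to show how the count of lives changes.

```python
# Hypothetical claim records: (source_specific_token, shared_token, payer).
# None of these values come from a real dataset.
claims = [
    ("A-001", "P1", "Commercial Payer X"),
    ("B-917", "P1", "Medicaid"),   # same person after a coverage change
    ("A-002", "P2", "Commercial Payer X"),
    ("C-440", "P2", "Medicare"),   # same person seen through another source
    ("A-003", "P3", "Commercial Payer X"),
]


def count_lives(records, token_index):
    """Count distinct tokens in the chosen column of the records."""
    return len({record[token_index] for record in records})


# Counting by source-specific token treats every record as a new person.
apparent_lives = count_lives(claims, 0)

# Counting by the shared token collapses records that belong to one person.
deduplicated_lives = count_lives(claims, 1)

print(f"Apparent lives: {apparent_lives}")
print(f"Deduplicated lives: {deduplicated_lives}")
```

In this toy example the apparent count is higher than the deduplicated one, which is the kind of gap worth asking a data partner about.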

Ensuring Data Integrity

Healthcare stakeholders can ask four key questions to assess data reliability: 

  1. Does it reflect socioeconomic diversity? Claims data should include numerous commercial payers sourced from more than one claims processor (clearinghouse) and both Medicaid and traditional Medicare. Confirm that the percentages of payer claims are representative of the patient population and that they are balanced across geographic regions.
  2. How many nontraditional sources are included? Ensure that data from laboratories, retail clinics, pharmacies, patient registries, and other nontraditional sources will be used to help fill in the patient journey and to enable deeper insights.
  3. Is it capturing the complete patient journey? Many datasets are based on either closed or open claims data but fail to incorporate both. Closed claims alone won’t capture recent encounters; conversely, open claims alone deliver only a snapshot in time, not a longitudinal view. Make sure the dataset includes both closed and open claims as well as nontraditional data sources. Care team interactions are another significant source of insights: beyond the activities of doctors prescribing treatments, interactions with nurses, physician assistants, pharmacists, and others who support care decisions offer critical views into the patient’s journey.
  4. How is the number of unique lives validated? Using PHI-based tokens administered by a third party is an objective approach to verifying the number of patient journeys represented.
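
The first question, whether payer percentages are representative, can be checked with a simple comparison. The sketch below is a minimal illustration, not a standard method: the function name `payer_mix_gaps`, the 5-point tolerance, the claim counts, and the reference shares are all assumptions made up for the example.

```python
def payer_mix_gaps(dataset_counts, reference_shares, tolerance=0.05):
    """Return payers whose share of claims differs from the reference
    share by more than `tolerance` (difference = dataset - reference)."""
    total = sum(dataset_counts.values())
    if total == 0:
        raise ValueError("dataset_counts contains no claims")

    gaps = {}
    for payer, reference_share in reference_shares.items():
        dataset_share = dataset_counts.get(payer, 0) / total
        difference = dataset_share - reference_share
        if abs(difference) > tolerance:
            gaps[payer] = round(difference, 3)
    return gaps


# Illustrative numbers only: claim counts by payer type in a dataset,
# and the share each payer type is assumed to hold in the population.
dataset_counts = {"commercial": 700, "medicaid": 100, "medicare": 200}
reference_shares = {"commercial": 0.50, "medicaid": 0.20, "medicare": 0.30}

gaps = payer_mix_gaps(dataset_counts, reference_shares)
for payer, difference in gaps.items():
    direction = "over" if difference > 0 else "under"
    print(f"{payer}: {direction}represented by {abs(difference):.1%}")
```

The same comparison could be repeated per geographic region to check the regional balance the first question mentions.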

Answers to these questions will provide insight into the data’s comprehensiveness and reliability, enabling you to either gain confidence in your data partner or search for a new one. 

The Data Makes the Difference

Because data reliability is so crucial to the work of pharma, medtech, health insurance companies, and patient advocacy organizations, it makes sense to investigate what’s possible and available in the industry, and then to evaluate the strengths and weaknesses of prospective healthcare data partners.

Learn more about the clinical data integrity of Komodo’s Healthcare Map by watching our video.

