How Health Plans Could Improve the Collection of Race and Ethnicity Data

  • Collection of accurate, comprehensive health data by race and ethnicity is fraught with challenges, including lack of standards, multiple sources, and outdated or incomplete information

  • Efforts to improve collection of health data by race and ethnicity should preserve patients’ autonomy and privacy and ensure community engagement

To work toward racial equity in the health care system, participants such as health plans, physicians, and the government need to understand where disparities exist and how they persist throughout the continuum of care. However, collecting data remains difficult: standards are lacking for how health plans should collect, record, and report these data. Currently, health plans collect data on race and ethnicity in a variety of ways: from government or private payers, through interactions with plan members, from patients’ clinical records, and by attributing race and ethnicity based on name and place of residence.

Unfortunately, the quality and availability of race and ethnicity data in our health care systems vary widely.

The National Committee for Quality Assurance conducted a targeted environmental scan and interviews; health plans shared their challenges with race and ethnicity data collection and opportunities for improvement. The following themes emerged:

  • Categories for documenting and reporting race and ethnicity. Standard race and ethnicity categories can support collection of accurate data that can be aggregated for reporting, as recommended in Improving Data on Race and Ethnicity: A Roadmap to Measure and Advance Health Equity. Questions remain regarding how best to aggregate categories for people reporting more than one race or ethnicity.
  • Opportunities to improve government enrollment data. Government enrollment files (i.e., for Medicare and Medicaid) are a consistent source of race and ethnicity data but can be inaccurate across some categories. For example, data on American Indians/Alaska Natives and Asians/Pacific Islanders tend to be missing or incomplete because of a lack of collection or miscategorization when data are collected. In addition, monthly updates to Medicaid files typically overwrite information collected by plans.
  • Consent and transparency. Public and private plans should develop guidance for how race and ethnicity data could be shared between plans and providers. Such guidance should include the permissions that will be needed, whether providers can share data with health plans, when members need to provide consent for data sharing, and how to protect members’ autonomy over their data.
  • Reconciling multiple data sources. Given the inconsistencies of race and ethnicity data across sources, it can be difficult to identify the correct source when a member has conflicting information. This can happen when plans use multiple sources or sources of uncertain validity, when members self-identify differently over time, or when race and ethnicity categories vary across sources.
  • Frequency of data collection and verification. Plans are unclear about how often members should be asked to provide race and ethnicity information; how often members should update or confirm existing information; and how often members who declined to answer or did not previously provide the information should be asked again.
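
The reconciliation theme above can be made concrete with a small sketch. This is one hypothetical approach, not a method taken from the NCQA report: rank sources by an assumed order of trust (self-report first), break ties by recency, and flag disagreements for review instead of silently overwriting them. The source names, the priority order, and the record fields are all illustrative assumptions.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Sequence

# Hypothetical trust ranking (lower = more trusted). This order is an
# assumption for illustration; a plan would define its own.
SOURCE_PRIORITY = {
    "member_self_report": 0,
    "clinical_record": 1,
    "enrollment_file": 2,
    "imputed": 3,
}


@dataclass(frozen=True)
class Observation:
    source: str
    race_ethnicity: Optional[str]  # None means missing or declined
    recorded_on: date


@dataclass(frozen=True)
class Reconciled:
    value: Optional[str]
    source: Optional[str]
    conflict: bool  # True when known values disagree across records


def reconcile(observations: Sequence[Observation]) -> Reconciled:
    known = [o for o in observations if o.race_ethnicity is not None]
    if not known:
        return Reconciled(None, None, False)
    # Most trusted source wins; within the same rank, the newest record wins.
    # Unrecognized sources are ranked last.
    best = min(
        known,
        key=lambda o: (
            SOURCE_PRIORITY.get(o.source, len(SOURCE_PRIORITY)),
            -o.recorded_on.toordinal(),
        ),
    )
    conflict = len({o.race_ethnicity for o in known}) > 1
    return Reconciled(best.race_ethnicity, best.source, conflict)


# Hypothetical member with a newer enrollment record and an older self-report.
member = [
    Observation("enrollment_file", "White", date(2023, 3, 1)),
    Observation("member_self_report", "Asian", date(2021, 6, 15)),
]
result = reconcile(member)
```

Under these assumptions, the newer enrollment record does not overwrite the member's earlier self-report, and the `conflict` flag marks the member for follow-up.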

Health plans use a range of sources to increase the availability of member self-reported race and ethnicity data, which are generally considered the gold standard. Yet the amount of missing data and logistical barriers, like limitations on data exchange or member turnover, mean that imputation also will be used as the field works to increase the availability of self-reported data. Focusing only on self-reported data could delay or set aside equity efforts, while focusing only on imputation methods may blunt the urgency of obtaining self-reported data. This suggests that deploying the two methods together can fill in some blanks in the absence of standardization.
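
One way to picture using both methods: prefer a self-reported value when one exists, fall back to an imputed value otherwise, and record which method produced each value so that analyses can report how much rests on imputation. The following is a minimal sketch under those assumptions. The function name, labels, and example records are hypothetical, and no particular imputation method is implied.

```python
from collections import Counter
from typing import Optional, Tuple


def resolve(self_reported: Optional[str], imputed: Optional[str]) -> Tuple[Optional[str], str]:
    """Return (value, method), preferring self-report over imputation."""
    if self_reported is not None:
        return self_reported, "self_reported"
    if imputed is not None:
        return imputed, "imputed"
    return None, "missing"


# Hypothetical member records: (self-reported value, imputed value).
members = [
    ("Black or African American", "Black or African American"),
    (None, "Hispanic or Latino"),
    (None, None),
    ("Asian", "White"),
]

resolved = [resolve(sr, imp) for sr, imp in members]

# Count how each value was obtained, so the imputed share is visible.
provenance = Counter(method for _, method in resolved)
share_imputed = provenance["imputed"] / len(members)
```

Keeping the method label alongside each value lets a plan track the imputed share over time as self-reported data become more available.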

Recommendations for the Road Ahead

The use of multiple data sources raises questions about how best to support members’ understanding of, and control over, how their data are shared and used. Efforts to improve collection of race and ethnicity data should be grounded in respect for individuals and in preserving their autonomy and privacy, with a focus on allowing self-identification, achieving equal outcomes, and ensuring community engagement.

Opportunities for government agencies and health plans to address the most pressing challenges to improving race and ethnicity data availability include:

  • Specifying a set of use cases that explain how race and ethnicity data can be used, and that stipulate permissible and acceptable use from the perspectives of health care entities, patients, and community members. Use cases should include care delivery, population management, quality improvement, patient safety, equity reporting, and accountability.
  • Coordinating a diverse group of stakeholders to develop guidance for implementing interoperability standards that support the collection, use, and sharing of electronic race and ethnicity data for equity reporting. Currently, data collection and exchange standards follow frameworks such as the Fast Healthcare Interoperability Resources and United States Core Data for Interoperability. Stakeholders can convene to identify how these existing frameworks can provide better standards for race and ethnicity categories.
  • Improving trust and confidence in imputation methods and data. A federal agency (e.g., the U.S. Department of Health and Human Services, a multiagency working group) and entities such as the National Academies of Sciences, Engineering, and Medicine should convene a panel of statistical methodologists, health care decision-makers, community members, and medical ethicists to identify and demonstrate the technical methods, processes for implementation, and guardrails for responsible equity analyses using imputed data.
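
For readers unfamiliar with the interoperability frameworks named in the list above, the sketch below shows roughly how race can be carried on a Fast Healthcare Interoperability Resources (FHIR) Patient record using the US Core race extension. It is an illustration, not guidance from the report: the patient `id` and the helper function are hypothetical, and the identifiers should be checked against the version of the US Core implementation guide in use.

```python
import json

# Assumption: these identifiers follow the published US Core race extension;
# verify them against the guide version your systems target.
US_CORE_RACE = "http://hl7.org/fhir/us/core/StructureDefinition/us-core-race"
CDC_RACE_ETHNICITY = "urn:oid:2.16.840.1.113883.6.238"


def race_extension(code: str, display: str, text: str) -> dict:
    """Build a race extension with one OMB category and a text description."""
    return {
        "url": US_CORE_RACE,
        "extension": [
            {
                "url": "ombCategory",
                "valueCoding": {
                    "system": CDC_RACE_ETHNICITY,
                    "code": code,
                    "display": display,
                },
            },
            {"url": "text", "valueString": text},
        ],
    }


patient = {
    "resourceType": "Patient",
    "id": "example",  # hypothetical identifier
    "extension": [race_extension("2028-9", "Asian", "Asian")],
}

print(json.dumps(patient, indent=2))
```

A structure like this carries a coded category and free text, but it does not record how the value was obtained (self-report, enrollment file, or imputation), which is the kind of gap the stakeholder process described above could address.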

These efforts should align with federal policymaking activities and also should use a community-informed process that involves patients and community stakeholders and considers local context, populations, and conditions.

Findings and recommendations discussed here are compiled in a report, Current Health Plan Approaches to Race and Ethnicity Data Collection and Recommendations for Future Improvements.

Publication Details



Jeni Soucie, Product Manager, National Committee for Quality Assurance


Jeni Soucie et al., “How Health Plans Could Improve the Collection of Race and Ethnicity Data,” To the Point (blog), Commonwealth Fund, Apr. 3, 2023.