In-depth study of Medicaid, the program that insures 93 million people with low income, has been compromised by a lack of data. Only comprehensive, high-quality claims data can provide the granular information on Medicaid enrollees’ characteristics, use of health care, and spending that is needed to inform sound policy decisions.
In 2019, the Centers for Medicare and Medicaid Services (CMS) released the Transformed Medicaid Statistical Information System Analytic Files (TAF) in a bid to make Medicaid claims data from all states and territories available in a national repository. TAF provide researchers with higher-quality, more usable information about enrollees than previous datasets did.
The TAF are inspiring a new generation of Medicaid research. Studies using TAF have examined maternal morbidity, reimbursement for psychiatric services, racial and ethnic disparities in medication for opioid use disorder, and the health effects of segregation, among other topics.
However, TAF are expensive to obtain and complex to analyze. Moreover, ongoing data-quality issues keep them from reaching their full potential. In 2022, we partnered with AcademyHealth to form the Medicaid Data Learning Network, which gives TAF researchers a mechanism for sharing best practices. That experience has revealed three important opportunities for state and federal policymakers to improve TAF going forward.
Recommendation 1: Address Missing Demographic Information
Missing demographic information in TAF limits the ability of researchers to study disparities in health care utilization and outcomes across populations. The quality and completeness of demographic characteristics, including race, ethnicity, income, primary language, and citizenship, vary significantly in TAF by state and by subpopulation. In particular, self-reporting of race and ethnicity is not mandatory for Medicaid enrollees, and states differ in their approaches to collecting these data. We encourage CMS to develop best practices for encouraging enrollees to provide race and ethnicity data at the time of enrollment, to standardize the reporting of other demographic variables, and to furnish states with additional resources.
Recommendation 2: Enhance Transparency of Managed Care Payments
More than 70 percent of Medicaid enrollees are in some form of managed care — private plans that states contract with to provide benefits. Yet payments from managed care plans to provider organizations are redacted in TAF, making it difficult for researchers to examine spending or evaluate the effects of managed care more broadly. Making these data available is an important first step for CMS and states to ensure appropriate oversight and evaluation of Medicaid managed care.
This alone would likely not be sufficient, however: a recent report from the U.S. Department of Health and Human Services Office of Inspector General found that half of the 39 states analyzed did not provide accurate data on what plans paid to providers. CMS should mandate the collection of high-quality payment data from all managed care plans and make these data available to researchers, unredacted, in TAF. Some states already make these payment data available in their own Medicaid claims datasets. Without this transparency, it is impossible to assess, for example, how Medicaid spending varies across enrollee populations or how providers respond to managed care plans’ financial incentives.
Recommendation 3: Improve Accessibility to the New Medicaid Analytic Files
TAF are out of reach for many research teams because of their size, complexity, and price. One year of data can cost anywhere from $35,000 to $65,000, depending on how the data are accessed. TAF are updated over time to address data-quality issues, but CMS generally charges researchers additional fees to use the refreshed data. These costs make the updated, highest-quality data prohibitively expensive for some researchers, which may lead to evidence based on files with known quality concerns. CMS could improve access by lowering TAF access fees or offering a sliding scale based on an organization’s resources.
Conclusion
The historical lack of a high-quality, national claims dataset for the Medicaid program has significantly hindered evidence-based policymaking for America’s single-largest payer of health care services. While the release of TAF nearly four years ago represents a significant step toward improving transparency, persistent data-quality issues and access barriers limit TAF’s potential. State and federal policymakers can prioritize working together to improve TAF — including its quality, accessibility, and usability — to ensure that data can inform effective Medicaid policy.