Does Medicare’s merit-based incentive payment system really work?

By Jim Schnabel
Weill Cornell Medicine

December 8, 2022

A Medicare system that is meant to assess and incentivize health care quality with pay adjustments may not be working as intended, according to a study from researchers at Weill Cornell Medicine.

In the study, published Dec. 6 in the Journal of the American Medical Association (JAMA), the researchers analyzed data on more than 80,000 primary care physicians enrolled in Medicare’s Merit-Based Incentive Payment System (MIPS). The program assigns scores based on quality, costs, electronic health record-related standards and physician participation in activities that improve clinical practice. Physician payment rates are then adjusted based on these scores. Using Medicare datasets to evaluate a consistent set of measures, including primary care process and patient outcome measures, the researchers found that physicians’ performance was not reliably associated with their MIPS scores.

“What these results suggest is that the MIPS program’s accuracy in identifying high- versus low-performing providers is really no better than chance,” said study lead author Amelia Bond, an assistant professor of population health sciences at Weill Cornell Medicine.

The MIPS program was introduced in 2017 as a consolidation of other Medicare incentive programs, and by 2019 included virtually all eligible physicians. MIPS participation places substantial reporting and other administrative burdens on physicians, and the program’s accuracy in assessing physician quality has often been questioned – and never comprehensively evaluated.

The new study focused on a sample of 80,246 primary care physicians and the 3.4 million patients they treated in 2019, using Medicare datasets, including claims records. Doctors who participate in MIPS choose six of 257 possible performance measures to report, only one of which must be an outcome measure, such as hospital admission for a particular illness. For their analysis, the researchers chose a set of measures relevant to primary care, with a greater emphasis on outcomes. These measures included annual diabetes blood tests; eye exams for patients with diabetes; breast cancer screening; annual flu vaccination; tobacco screening; number of emergency department visits; and hospital admissions for conditions such as diabetes, chronic obstructive pulmonary disease (COPD) and heart failure.

The results indicated that when performance is judged on the broad indicators most relevant to primary care, there is no clear connection with MIPS ratings. Compared with doctors with high MIPS scores, doctors with low MIPS scores performed significantly worse, on average, on three of the five “process” measures (diabetes blood tests, diabetic eye exams, mammography screening), but marginally better on the other two process measures (flu vaccination, tobacco screening). For patient outcome measures, the low-scoring doctors performed significantly better on one measure (emergency department visits per 1,000 patients), significantly worse on another (all-cause hospitalization per 1,000 patients) and showed no significant difference on the other four outcomes.

The analysis found that 19% of low-scoring doctors had combined performance ratings in the top fifth, or quintile, while 21% of high-scoring doctors had ratings in the lowest quintile – again implying no clear relationship between MIPS scoring and true performance.

The researchers are not certain why MIPS scores may fail to capture clinical performance. However, based on their own and others’ prior research, they suspect two causes: risk adjustment is inadequate for physicians who care for more medically complex and socially vulnerable patients, and smaller, independent primary care practices have fewer resources to dedicate to quality reporting, which leads to low MIPS scores.

“MIPS scores may reflect doctors’ ability to keep up with MIPS paperwork more than it reflects their clinical performance,” Bond said.

The analysis also found that doctors with superior performance but low MIPS scores tended to have practices serving more patients who were sicker and had lower incomes, compared with doctors with poor performance and low MIPS scores.

“It is concerning that physicians who care for medically complex or socially vulnerable patients are at risk for financial penalties even when they provide care that seems comparable or better than that from other physician practices,” said the study’s senior author, Dr. Dhruv Khullar, an assistant professor of medicine and population health sciences at Weill Cornell Medicine.

The researchers don’t expect the MIPS program to be eliminated, they said, but they hope that their findings will inform future improvements to it.

Jim Schnabel is a freelance writer for Weill Cornell Medicine.
