Junkfood Science: Are P4P measures discriminatory?

October 28, 2008

Government-funded healthcare through the Centers for Medicare and Medicaid Services has a pay-for-performance system that links reimbursements to hospitals according to their adherence to certain process performance measures. Hospitals are also graded based on these P4P measures and the CMS makes their grades public record, to increase the incentives of hospitals to abide by them. A new study evaluating a P4P program for the management of heart attack patients has been reported in this week’s news as finding that hospitals caring for vulnerable populations — the elderly, women, the poor, uninsured and minorities — are most penalized by Medicare P4P measures and receive less funding than those caring for young, wealthier, insured and white Americans.

That’s not quite what the study found. The story is more complex than that and doesn't just affect hospitals, but each of us.

Pay-for-Performance measures have been covered here in depth. They are basically procedure metrics (prescriptions written, preventive screening tests ordered, labwork monitored, and procedures done on all patients according to their diagnoses or characteristics) issued by third-party payers (insurers and CMS) and used to grade, and financially reward or penalize, healthcare providers and hospitals based on their adherence. Third-party payers call them measures of “quality” of care, leading most people to mistakenly believe that these measures have been shown to improve patient outcomes, reduce mortality, or be cost effective. P4P measures are controversial, and medical professionals have raised a variety of concerns about them: they are not always supported by the best clinical evidence; are influenced by troubling conflicts of interest; may not reflect the preferences of patients, be medically best for all patients, or allow doctors to apply their clinical judgement; raise prescription and healthcare costs; expose some patients to unnecessary risks; and could have the unintended consequence of disincentivizing the care of the sickest and most difficult-to-treat patients. The CMS alone currently has 134 performance measures for doctors’ practices, as Dr. Wes last counted a week ago.

This new report** was published in the current issue of the Journal of the American Medical Association and was conducted by researchers with the American Heart Association’s Get With the Guidelines program, funded in part by Merck-Schering Plough pharmaceuticals. This program uses a Web-based patient management tool to collect clinical data, guide providers’ decisions (like the “ding dings” Dr. Wes described), and provide real-time electronic medical record reporting. The authors analyzed data from 574 hospitals that participated in the AHA’s Get With the Guidelines program from January 2, 2000 to March 28, 2008.

They compared the hospitals’ performance on eight P4P measures included in the CMS score for 148,472 patients with confirmed diagnoses of an acute heart attack. The measures were all interventions administered in the hospital after an acute heart attack: aspirin at admission and on discharge, beta-blockers at admission and on discharge, angiotensin-converting enzyme inhibitors for left ventricular systolic dysfunction, smoking cessation counseling, thrombolytics within 30 minutes of arrival, and primary percutaneous coronary interventions (commonly known as angioplasty) within 90 minutes of arrival.

As the authors explained: “CMS rewards hospitals performing in the top 20% in the pay-for-performance program based on their composite adherence score, such that the top 10% are eligible to receive a 2% bonus payment from CMS and the next 10% are eligible to receive a 1% bonus payment. In contrast, hospitals performing in the bottom 20% are likely to receive reductions in their payments in the future.”
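The tiering rule in that quote can be sketched in a few lines. This is only an illustration of the percentages quoted above, not CMS's actual procedure: the function name, the tier labels, the made-up scores, and the way ties and cutoffs are handled are all assumptions.

```python
def bonus_tiers(scores):
    """Map each hospital to a payment tier from its composite adherence score.

    `scores` maps a hospital name to a composite score (higher is better).
    Cutoffs follow the quoted rule: top 10% -> 2% bonus, next 10% -> 1% bonus,
    bottom 20% -> possible future reduction. Ties are broken arbitrarily here,
    which is an assumption, not something the quote specifies.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    n = len(ranked)
    tiers = {}
    for position, hospital in enumerate(ranked):  # position 0 is the best hospital
        if position * 10 < n:            # top 10%
            tiers[hospital] = "2% bonus"
        elif position * 10 < 2 * n:      # next 10%
            tiers[hospital] = "1% bonus"
        elif position * 10 >= 8 * n:     # bottom 20%
            tiers[hospital] = "possible reduction"
        else:
            tiers[hospital] = "no change"
    return tiers


# Ten hypothetical hospitals with made-up scores from 99 down to 90.
scores = {f"H{i}": 99 - i for i in range(10)}
tiers = bonus_tiers(scores)
```

Because the tiers depend only on rank, a hospital's payment changes whenever the ranking method changes, even if its care does not.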

These authors used the same methods as those used by the CMS, for public reporting and for P4P reimbursements, to calculate each hospital’s composite adherence score. They found that compliance with the P4P process measures varied considerably, from 96% for giving aspirin at discharge to 40.4% for giving fibrinolytic therapy within 30 minutes of admission. They ranked hospitals according to their performance on each of these eight hospital P4P measures in the AHA program.
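For readers wondering what a composite adherence score looks like, here is a minimal sketch. It assumes an opportunity-based composite, meaning the number of times recommended care was delivered divided by the number of times patients were eligible for it, pooled across measures. The paper's exact formula is not reproduced here, and the measure names and counts are hypothetical.

```python
def composite_adherence(measures):
    """Pooled adherence across measures (assumed opportunity-based composite).

    `measures` maps a measure name to (times delivered, eligible opportunities).
    """
    delivered = sum(d for d, _ in measures.values())
    eligible = sum(e for _, e in measures.values())
    if eligible == 0:
        raise ValueError("no eligible opportunities")
    return delivered / eligible


# Hypothetical counts for one hospital, not data from the study.
hospital = {
    "aspirin_at_discharge": (96, 100),
    "fibrinolytic_within_30_min": (4, 10),
}
score = composite_adherence(hospital)  # 100 delivered out of 110 opportunities
```

Under this assumption, measures that apply to many patients dominate the composite, while a rarely applicable measure moves it only slightly.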

As quoted in news reports, Dr. Eric Peterson, M.D., M.P.H., of Duke Clinical Research Institute in Durham, NC, and coauthors concluded that “hospitals serving large groups of the elderly, women, poor, uninsured, or African-American patients might have problems competing with institutions whose patients are younger, wealthy, insured, and white.” According to the news and the AHA press release, hospitals’ ability to qualify for financial incentives is affected by the patients’ age, race/ethnicity, income, insurance status and gender. “Under the current model, hospitals may be doing everything right and still be penalized,” the lead author stated.

But these are hospital-based P4P measures, meaning every patient with the same diagnosis is supposed to get the same ‘evidence-based’ care. The patients’ income, ethnicity, insurance status, etc. should have nothing to do with whether the hospitals comply with the P4P measures. Dr. Peterson is quoted in the press release as saying that their study revealed the need to level the playing field so that hospitals serving more minority, elderly, sick or uninsured patients can compete fairly with others. While no one would argue that providers who care for sicker, minority patients should be penalized, is that what’s really going on? Or is the failure of smaller hospitals with fewer resources to comply with P4P measures more about a problem with the P4P programs themselves, with these hospitals seeing less justification for the expense and regulatory paperwork and little difference in patient outcomes? Did this study really support placing the blame on patients?

While the news leads most readers to attribute the greatest differences in adherence to these P4P programs to the patients, caution is warranted in interpreting those correlations. In the JAMA report itself, the authors said that smaller, nonacademic hospitals, which are more often inner city hospitals in the Northeast and the South serving more minority populations, are generally under-resourced: they have staffing shortages, lack electronic medical records and IT support, are shorter on capital and operate on smaller budgets. These hospitals, they reported, are less likely to comply with the CMS and Get With the Guidelines program measures. These are also the hospitals that depend more on Medicare’s payments and other government subsidies to enable them to devote more resources to performance (i.e. P4P) improvements, noted the authors.

The most significant differences between the highest and lowest complying hospitals were characteristics of the hospitals themselves: size (407 versus 161 beds), being a teaching hospital (42.6% versus 37.3%), and geographic location (32.8% versus 11.4% in the Midwest, for example, and 22.9% versus 35.9% in the South). Whites were 17 percentage points more prevalent among the highest ranking hospitals compared with the lowest (81.7% versus 64.4%, respectively), a correlation mostly explained by the hospitals’ locations.

The hospitals also differed in the services and procedures provided. Most notably, fewer of the smaller, nonacademic hospitals offered angioplasty, which means they automatically missed credit for that P4P measure.

In contrast with news stories, the study’s data reported no statistical differences in the ages or BMIs of the patient populations in the highest and lowest ranking hospitals, and only a 1.3% difference in gender. Uninsured patients were more prevalent among the larger top-ranking hospitals, compared with the smaller hospitals (7.8% versus 5.3%). And no patient household income, education or poverty data was even reported.

The authors reported that hospitals with poor adherence to P4P measures were more likely to have patients with a history of comorbidities such as diabetes, heart failure, chronic atrial fibrillation, renal insufficiency, and lower left ventricular ejection fraction, but their patients were less likely to have a history of chronic obstructive pulmonary disease, hypertension, previous heart attack, renal dialysis or strokes. The comorbidities differed between the high and low ranking hospitals, but were not greater in number at the low ranking ones. Only 6 of the 14 comorbidities and health risk factors in patients’ medical histories were higher among the lowest complying hospitals.

According to the JAMA article, the authors then adjusted the P4P reimbursement calculations “for patient case mix and treatment opportunity.” That means they recalculated reimbursements primarily according to the patients’ comorbid conditions, which might have medically influenced the appropriateness of adhering to the P4P guidelines, and according to the availability of angioplasty, which influenced the ability of a hospital to adhere to the guidelines. This is about the efficacy of the P4P reimbursements.

It had much less to do with adjusting for discriminatory patient characteristics such as income, insurance, gender, and BMI, which were not reported as being significantly associated with poor hospital adherence. And there is no medical basis for P4P compliance to vary based on a patient’s skin color.

The authors reported that changing the P4P adherence calculations used for determining reimbursements changed the rank of 16.5% of the hospitals — half moving up and half moving down. No detailed information was provided on the hospital characteristics with these new calculations and how this might have leveled the playing field, but their Table 3 suggests that the hospitals were slightly less segregated by size, which would bolster the financial incentives for smaller hospitals to adopt the P4P measures. “Thus, our data suggest that the current method of ranking hospitals based on an unadjusted performance measures composite score, as done for the pay-for performance system, may be less than optimal,” they concluded.

The press release and news stories had other information that likely confused readers:

Study authors noted the need for all hospitals to show quality performance among high risk patient groups. They also suggest the need for hospitals to collect and report comprehensive clinical data that will allow them to identify and close treatment gaps that arise based on their patient mix. "This study further illustrates the important role that the American Heart Association's Get With The Guidelines program is playing in advancing the science of measuring and improving the quality of cardiovascular care" said Gregg C. Fonarow, M.D., FACC, FAHA, chairman of the American Heart Association's Get With the Guidelines Steering Committee.

While P4P programs are perfecting the electronic gathering and surveillance of data on patients and provider practices, and the management of healthcare processes, few medical professionals or consumers would equate those with quality of care. What matters to people isn’t how many “ding dings” are checked off a list, how many prescriptions are written, and how many procedures are done according to a third-party guideline, but if the care doctors provide patients actually improves clinical outcomes, reduces suffering and disease, and saves lives.

Get With the Guidelines has not published a single study demonstrating that compliance with the program reduces mortality or improves clinical outcomes for cardiac patients. Going back years, every study cited on the website examines process metrics and adherence to the guidelines. The latest study described in an AHA press release in support of the program, for instance, was published last month in Archives of Internal Medicine. It examined adherence to performance measures among the 223 hospitals participating in the AHA program compared to other hospitals in the Hospital Compare database. No paper has been posted, however, to reveal how much ROMI revenue these P4P programs bring to pharmaceutical companies or the health IT industry.

In this new JAMA paper, the authors also cautioned that a limitation of their analysis was that while the AHA electronic medical management program “contained editing capabilities to ensure data entered were consistent with plausible ranges, Get With the Guidelines has not, to date, performed a national audit of its database” to verify the accuracy or reliability of the data.

Quality of care measures are not the same as measures that improve clinical outcomes. As recently examined here in detail, two recent large studies of P4P measures for heart patients found that these expensive programs have not been shown to reduce mortality or improve outcomes for the patients.

In the first, researchers performed a 3-year analysis of the CMS Hospital Quality Incentive Demonstration project, billed as the largest federally-sponsored P4P program to date in the United States and costing $17.5 million in tax dollars for the first two years. They found no improvement in health outcomes or mortality rates among the heart attack patients. The second study examined the Organized Program to Initiate Lifesaving Treatment in Hospitalized Patients With Heart Failure (OPTIMIZE-HF) project, the largest national hospital-based P4P program for patients hospitalized with heart failure in the United States. There was no statistically significant change in post-discharge mortality, rehospitalizations or in-hospital mortality.

As the chairman of cardiovascular medicine at the Cleveland Clinic, Dr. Steven E. Nissen, told the Wall Street Journal, on June 6th, these results “suggest we ought to slow down a minute before going into pay-for-performance.” Throwing more money at them or redistributing the reimbursements won’t change the fundamental issues.

© 2008 Sandy Szwarc



** Financial Disclosures:

Dr Hernandez reports receiving research support from GlaxoSmithKline, Johnson & Johnson (Scios Inc), Medtronic, Novartis, and Roche Diagnostics; and honoraria from Astra-Zeneca, Novartis, Sanofi-Aventis, and Thoratec Corporation.

Dr Peterson reports receiving research support from Schering Plough, BMS/Sanofi; and serving as the principal investigator for the American Heart Association’s (AHA’s) Get With the Guidelines Analytical Center.

Drs Hernandez and Peterson report detailed listings of financial disclosures at http://www.dcri.duke.edu/research/coi.jsp.

Dr Rumsfeld reports receiving an honorarium for participating on the scientific research advisory board of United Healthcare.

Dr Fonarow reports receiving research grants from GlaxoSmithKline, Medtronic, Merck, Pfizer, and the National Institutes of Health; consulting for AstraZeneca, Bristol-Myers Squibb, GlaxoSmithKline, Medtronic, Merck, Novartis, Pfizer, Sanofi-Aventis, and Schering Plough; receiving honoraria from AstraZeneca, Abbott, Bristol-Myers Squibb, GlaxoSmithKline, Medtronic, Merck, Novartis, Pfizer, Sanofi-Aventis, and Schering Plough; and serving as chair of the AHA’s Get With the Guidelines Steering Committee.

Drs Mehta and Liang and Ms Karve report no disclosures.

Role of the Sponsor:

Merck and Schering Plough had no role in the design and conduct of the study; in the collection, management, analysis, and interpretation of the data; or in the preparation, review, or approval of the manuscript. The AHA provides Get With the Guidelines program management with a volunteer steering committee and AHA staff. The manuscript was submitted to the AHA for review and approval prior to submission.
