Economic Evaluation Methods Part I: Interpreting Cost-Effectiveness Acceptability Curves and Estimating Costs

By Erik Landaas, Elizabeth Brouwer, and Lotte Steuten

One of the main training activities at the CHOICE Institute at the University of Washington is to instruct graduate students in how to perform economic evaluations of medical technologies. In this blog post series, we give a brief overview of two important economic evaluation concepts. The concepts are independent of each other and meant to stand alone. The first of this two-part series describes how to interpret a cost-effectiveness acceptability curve (CEAC) and then delves into ways of costing a health intervention. The second part of the series will describe two additional concepts: how to develop and interpret cost-effectiveness frontiers and how multi-criteria decision analysis (MCDA) can be used in Health Technology Assessment (HTA).

 

Cost-Effectiveness Acceptability Curve (CEAC)

The CEAC is a way to graphically present decision uncertainty around the expected incremental cost-effectiveness of healthcare technologies. A CEAC is created using the results of a probabilistic analysis (PA).[1] PA involves simultaneously drawing a set of input parameter values by randomly sampling from each parameter distribution, and then storing the model results. This is repeated many times (typically 1,000 to 10,000), resulting in a distribution of outputs that can be graphed on the cost-effectiveness plane. The CEAC reflects the proportion of results that are considered ‘favorable’ (i.e. cost-effective) in relation to a given cost-effectiveness threshold.

The primary goal of a CEAC graph is to inform coverage decisions among payers who are considering a new technology against one or more established technologies, which may include the standard of care. The CEAC enables a payer to determine, over a range of willingness to pay (WTP) thresholds, the probability that a medical technology is considered cost-effective in comparison to its appropriate comparator (e.g. usual care), given the information available at the time of the analysis. A WTP threshold is generally expressed in terms of societal willingness to pay for an additional life year or quality-adjusted life year (QALY) gained. In the US, WTP thresholds typically range between $50,000 and $150,000 per QALY.

The X-axis of a CEAC represents the range of WTP thresholds. The Y-axis represents the probability of each comparator being cost-effective at a given WTP threshold, and ranges between 0% and 100%. For a comparison of two alternatives, this is simply the proportion of simulated ICERs from the PA that fall below the corresponding threshold on the X-axis; with multiple comparators, it is the proportion of simulations in which a given comparator yields the greatest net benefit at that threshold.
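Before turning to a published example (Figure 1 below), the following sketch shows how a CEAC can be computed from stored PA output. It is a minimal illustration, not a reproduction of any published model: the parameter distributions, number of draws, and WTP grid are all invented assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)
n_sims = 5000  # number of probabilistic analysis (PA) draws

# Hypothetical PA output: incremental costs and QALYs of a new drug vs. usual care.
# In a real analysis these come from running the decision model once per parameter draw;
# the normal distributions below are purely illustrative assumptions.
inc_cost = rng.normal(loc=12_000, scale=3_000, size=n_sims)  # incremental cost ($)
inc_qaly = rng.normal(loc=0.15, scale=0.08, size=n_sims)     # incremental QALYs

# For each WTP threshold, the CEAC value is the share of draws with positive
# incremental net monetary benefit: NMB = WTP * dQALY - dCost.
for wtp in range(0, 200_001, 25_000):
    prob_ce = (wtp * inc_qaly - inc_cost > 0).mean()
    print(f"WTP ${wtp:>7,}/QALY: P(cost-effective) = {prob_ce:.2f}")
```

Plotting these probabilities against the WTP grid produces the curve; repeating the calculation for each comparator produces a figure like Figure 1.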

Figure 1. The Cost-Effectiveness Acceptability Curve


Coyle, Doug, et al. “Cost-effectiveness of new oral anticoagulants compared with warfarin in preventing stroke and other cardiovascular events in patients with atrial fibrillation.” Value in health 16.4 (2013): 498-506.

Figure 1 shows CEACs for five different drugs, making it easy for the reader to see that at the lower end of the WTP threshold range (i.e. $0 – $20,000 per QALY), warfarin has the highest probability of being cost-effective (or, in this case, “optimal”). At WTP values above $20,000 per QALY, dabigatran has the highest probability of being cost-effective. All the other drugs have a lower probability of being cost-effective than warfarin and dabigatran at every WTP threshold. The cost-effectiveness acceptability frontier in Figure 1 follows along the top of all the curves and shows directly which of the five technologies has the highest probability of being cost-effective at various levels of the WTP threshold.

To the extent that the unit price of the technology influences the decision uncertainty, a CEAC can offer insights to payers as well as manufacturers as they consider a value-based price. For example, a lower unit price for the drug may lower the ICER and, all else equal, this increases the probability that the new technology is considered cost-effective at a given WTP threshold. Note that when new technologies are priced such that the ICER falls just below the WTP for a QALY (e.g. an ICER of $99,999 when the WTP is $100,000), the decision uncertainty tends to be substantial, often around 50%. If decision uncertainty is perceived to be ‘unacceptably high’, it can be recommended to collect further information to reduce it. Depending on the drivers of decision uncertainty, for example in the case of stochastic uncertainty in the efficacy parameters, performance-based risk agreements (PBRAs) or managed entry schemes may be appropriate tools to manage the risk.

Cost estimates

In most economic evaluations in health, the numerator is the cost of a technology or intervention. There are several ways to arrive at that cost, and the choice of method depends on the context of the intervention and the available data.

Two broadly categorized methods for costing are the bottom-up method and the top-down method. These methods, described below, are not mutually exclusive and may complement each other, although they often do not produce the same results.

[Table: comparison of bottom-up and top-down costing approaches]

Source of Table: Mogyorosy Z, Smith P. The main methodological issues in costing health care services: a literature review. 2005.

The bottom-up method is also known as the ingredients approach or micro-costing. In this method, the analyst identifies all the items necessary to complete an intervention, such as medical supplies and clinician time, and adds them up to estimate the total cost. The main categories to consider when calculating costs via the bottom-up method are medical costs and non-medical costs. Medical costs can be direct, such as the supplies used to perform a surgery, or indirect, such as the food and bed used for inpatient care. Non-medical costs often include costs to the patient, such as transportation to the clinic or caregiver costs. The categories used when estimating the total cost of an intervention will depend on the perspective the analyst takes (perspectives include patient, health system, or societal).

The bottom-up approach can be completed prospectively or retrospectively, and can be helpful for planning and budgeting. Because the method identifies and values each input, it allows for a clear breakdown as to where dollars are being spent. To be accurate, however, one must be able to identify all the necessary inputs for an intervention and know how to value capital inputs like MRI machines or hospital buildings. The calculations may also become unwieldy on a very large scale. The bottom-up approach is often used in global health research, where medical programs or governmental agencies supply specific items to implement an intervention, or in simple interventions where there are only a few necessary ingredients.
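As a simple illustration of the ingredients approach, the sketch below tallies hypothetical resource-use items for a single intervention episode. Every item, quantity, and unit cost is an invented assumption for demonstration, not real data.

```python
# Bottom-up (micro-costing) sketch for one hypothetical intervention episode.
ingredients = [
    # (item, quantity, unit cost in $)
    ("Clinician time (hours)",      0.5, 120.00),
    ("Nurse time (hours)",          1.0,  45.00),
    ("Rapid diagnostic test (kit)", 1.0,  18.50),
    ("Disposable supplies (set)",   1.0,   6.25),
    ("Patient transport (trip)",    1.0,  12.00),  # non-medical cost, patient perspective
]

total = 0.0
for item, qty, unit_cost in ingredients:
    line_cost = qty * unit_cost
    total += line_cost
    print(f"{item:<32} ${line_cost:>8.2f}")
print(f"{'Total cost per episode':<32} ${total:>8.2f}")
```

The itemized printout is exactly the "clear breakdown of where dollars are being spent" that makes the bottom-up approach useful for planning and budgeting.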

The top-down estimation approach takes the total cost of a project and divides it by the number of service units generated. In some cases, this is done simply by looking at the budget for a program or an intervention and dividing that total by the number of patients. The top-down approach is useful because it is a simple, intuitive measurement that captures the actual amount of money spent on a project and the number of units produced, particularly for large projects or organizations. Compared to the bottom-up approach, the top-down approach can be much faster and cheaper. The top-down approach can only be used retrospectively, however, and may not allow for a breakdown of how the money was spent or identify variation between patients.
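A corresponding top-down estimate is little more than a division, as in the sketch below; both figures are illustrative assumptions rather than data from any real program.

```python
# Top-down costing sketch: total spending divided by service units (illustrative numbers).
total_program_spending = 1_250_000  # dollars actually spent on the program in one year
patients_served = 4_800             # service units (patients) delivered in the same year

cost_per_patient = total_program_spending / patients_served
print(f"Top-down unit cost: ${cost_per_patient:,.2f} per patient")
```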

While the final choice will depend on several factors, it makes sense to try to think through (or model) which of the cost inputs are likely to be most impactful on the model results. For example, the costs of lab tests may be most accurately estimated by a bottom-up costing approach. However, if these lab costs are likely to be a fraction of the cost of treatment, say, a million-dollar cure for cancer, then going through the motions of a bottom-up approach may not be the most efficient way to get your PhD project done in time. In other cases, however, a bottom-up approach may provide crucial insights that move the needle on the estimated cost-effectiveness of medical technologies, particularly in settings where a lack of existing datasets limits the potential of cost-effectiveness studies to inform decisions on the allocation of scarce healthcare resources.

[1]Fenwick, Elisabeth, Bernie J. O’Brien, and Andrew Briggs. “Cost‐effectiveness acceptability curves–facts, fallacies and frequently asked questions.” Health economics 13.5 (2004): 405-415.

Commonly Misunderstood Concepts in Pharmacoepidemiology

By Erik J. Landaas, MPH, PhD Student and Naomi Schwartz, MPH, PhD Student

 

Epidemiologic methods are central to the academic and research endeavors at the CHOICE Institute. The field of epidemiology fosters the critical thinking required for high-quality medical research. Pharmacoepidemiology is a sub-field of epidemiology and has been around since the 1970s. One of the driving forces behind the establishment of pharmacoepidemiology was the thalidomide disaster. In response to this tragedy, laws were enacted that gave the FDA authority to evaluate the efficacy of drugs. In addition, drug manufacturers were required to conduct clinical trials to provide evidence of a drug’s efficacy. This spawned a new and important body of work surrounding drug safety, efficacy, and post-marketing surveillance.[i]

In this article, we break down three of the more complex and often misunderstood concepts in pharmacoepidemiology: immortal time bias, protopathic bias, and drug exposure definition and measurement.

 

Immortal Time Bias

In pharmacoepidemiology studies, immortal time bias typically arises when the determination of an individual’s treatment status involves a delay or waiting period during which follow-up time is accrued. Immortal time is a period of follow-up during which, by design, the outcome of interest cannot occur; bias arises when this person-time is misclassified as exposed or excluded from the analysis. For example, the finding that Oscar winners live longer than non-winners is a result of immortal time bias: in order for an individual to win an Oscar, he/she must live long enough to receive the award. A pharmacoepidemiology example of this is depicted in Figure 1. A patient who receives a prescription may appear to survive longer because he/she must live long enough to receive the prescription, while a patient who does not receive a prescription has no such survival requirement. The most common way to avoid immortal time bias is to use a time-varying exposure variable, which allows subjects to contribute both unexposed (during the waiting period) and exposed person-time; a sketch of this restructuring follows the figure below.

 

Figure 1. Immortal Time Bias


Lévesque, Linda E., et al. “Problem of immortal time bias in cohort studies: example using statins for preventing progression of diabetes.” BMJ 340 (2010): b5087.
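Below is a minimal sketch of the time-varying exposure fix, assuming a simple cohort with one dispensing date per patient; all values are invented for illustration.

```python
import pandas as pd

# Hypothetical cohort (all values invented): follow-up measured in days since cohort entry.
# rx_day is the day of first dispensing (NaN if never exposed); end_day is the end of
# follow-up (event or censoring); event flags whether the outcome occurred at end_day.
cohort = pd.DataFrame({
    "id":      [1, 2, 3],
    "rx_day":  [90, None, 30],
    "end_day": [400, 250, 365],
    "event":   [1, 0, 1],
})

rows = []
for _, p in cohort.iterrows():
    if pd.isna(p["rx_day"]):
        # Never exposed: one unexposed interval covering all follow-up.
        rows.append({"id": p["id"], "start": 0, "stop": p["end_day"], "exposed": 0, "event": p["event"]})
    else:
        # Waiting period contributes UNEXPOSED person-time (no event can occur here by design).
        rows.append({"id": p["id"], "start": 0, "stop": p["rx_day"], "exposed": 0, "event": 0})
        # Exposed person-time starts at dispensing; any event is attributed to this interval.
        rows.append({"id": p["id"], "start": p["rx_day"], "stop": p["end_day"], "exposed": 1, "event": p["event"]})

long_format = pd.DataFrame(rows)
print(long_format)
# This long (counting-process) format can then be analyzed with a survival model that treats
# exposure as a time-varying covariate, so the waiting period is not misclassified as exposed.
```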

Protopathic Bias or Reverse Causation

Protopathic bias occurs when a drug of interest is initiated to treat early symptoms of the disease under study before that disease is diagnosed. For example, early symptoms of inflammatory bowel disease (IBD) are often consistent with the indications for prescribing proton pump inhibitors (PPIs). Thus, many individuals who develop IBD have a history of PPI use. A study investigating the association between PPIs and subsequent IBD would likely conclude that taking PPIs causes IBD when, in fact, the IBD was present (but undiagnosed) before the PPIs were prescribed. This scenario is illustrated by the following steps:

  • Patient has early symptoms of an underlying disease (e.g. acid reflux)
  • Patient goes to his/her doctor and gets a drug to address symptoms (e.g. PPI)
  • Patient goes on to develop a diagnosis of having IBD (months or even years later)

It is easy to conclude from the above scenario that PPIs cause IBD; however, the acid reflux was actually a manifestation of underlying IBD that had not yet been diagnosed. Protopathic bias occurs in this case because of the lag time between first symptoms and diagnosis. One effective way to address protopathic bias is to exclude exposures during the prodromal period of the disease of interest.
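One common implementation of that fix is a lag (or exclusion) window: dispensings in the months immediately before diagnosis are not counted as exposure. The sketch below applies a hypothetical 12-month lag; the window length and the records themselves are assumptions for illustration only.

```python
import pandas as pd

LAG_MONTHS = 12  # assumed prodromal window; in practice chosen per disease, not a fixed rule

# Hypothetical dispensing and diagnosis records (all dates invented).
dispensings = pd.DataFrame({
    "id":      [1, 1, 2],
    "rx_date": pd.to_datetime(["2015-01-10", "2016-08-02", "2016-11-20"]),
})
diagnoses = pd.DataFrame({
    "id":      [1, 2],
    "dx_date": pd.to_datetime(["2017-03-15", "2017-01-05"]),
})

merged = dispensings.merge(diagnoses, on="id")
# Count a dispensing as exposure only if it precedes the lag window before diagnosis.
lag_start = merged["dx_date"] - pd.DateOffset(months=LAG_MONTHS)
valid_exposures = merged[merged["rx_date"] < lag_start]
print(valid_exposures)
```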

 

Drug Exposure Definition and Measurement 

Defining and classifying exposure to a drug is critical to the validity of pharmacoepidemiology studies. Most pharmacoepidemiology studies use proxies for drug exposure, because it is often impractical or impossible to measure directly (e.g. observing a patient take a drug, monitoring blood levels). In lieu of actual exposure data, exposure ascertainment is typically based on medication dispensing records. These records can be ascertained from electronic health records, pharmacies, pharmacy benefit managers (PBMs), and other available healthcare data repositories. Some of the most comprehensive drug exposure data are available among Northern European countries and large integrated health systems such as Kaiser Permanente in the United States. Some strengths of using dispensing records to gather exposure data are:

  • Easy to ascertain and relatively inexpensive
  • No primary data collection
  • Often available for large sample sizes
  • Can be population based
  • No recall or interviewer bias
  • Linkable to other types of data such as diagnostic codes and labs

Limitations of dispensing records as a data source include:

  • Completeness can be an issue
  • Usually does not capture over-the-counter (OTC) drugs
  • Dispensing does not guarantee ingestion
  • Often lacks indication for use
  • Must make some assumptions to calculate dose and duration of use (see the sketch following this list)
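To illustrate those assumptions, the sketch below builds continuous exposure episodes from dispensing records using days’ supply and a grace period. The 30-day grace period and the records are invented for demonstration, not a recommended standard.

```python
import pandas as pd

GRACE_DAYS = 30  # assumed allowable gap before an exposure episode is considered ended

# Hypothetical dispensing records for one patient (all values invented).
rx = pd.DataFrame({
    "fill_date":   pd.to_datetime(["2016-01-01", "2016-02-05", "2016-06-01"]),
    "days_supply": [30, 30, 90],
}).sort_values("fill_date")

rx["run_out"] = rx["fill_date"] + pd.to_timedelta(rx["days_supply"], unit="D")

episodes = []
start, end = rx.iloc[0]["fill_date"], rx.iloc[0]["run_out"]
for _, fill in rx.iloc[1:].iterrows():
    if fill["fill_date"] <= end + pd.Timedelta(days=GRACE_DAYS):
        end = max(end, fill["run_out"])   # refill within the grace period: extend the episode
    else:
        episodes.append((start, end))     # gap too long: close the episode, start a new one
        start, end = fill["fill_date"], fill["run_out"]
episodes.append((start, end))

for s, e in episodes:
    print(f"Exposed from {s.date()} to {e.date()} ({(e - s).days} days)")
```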

Some studies collect drug exposure data using self-report methods (e.g. interviews or surveys). These methods are useful when the drug of interest is OTC and thus not captured by dispensing records. However, self-reported data are subject to recall bias and require additional considerations when interpreting results. Alternatively, some large epidemiologic studies ask patients to bring all their medications to their study interviews (e.g. a “brown bag” medication review). This can provide a more reliable method of collecting medication information than self-report alone.

It is also important to consider the risk of misclassification of exposure. When interpreting results, remember that differential misclassification (different for those with and without the disease) can bias the measure of association in either direction, away from or toward the null. In contrast, non-differential misclassification (unrelated to the occurrence or presence of disease) generally shifts the measure of association toward the null, as the small numeric sketch below illustrates. For further guidance on defining drug exposure, see Figure 2.
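The sketch uses invented 2x2 counts: applying the same exposure sensitivity and specificity to cases and controls pulls the odds ratio toward 1.

```python
# Invented 2x2 table: true exposure counts among cases and controls.
cases_exp, cases_unexp = 200, 100
ctrls_exp, ctrls_unexp = 100, 200

def odds_ratio(a, b, c, d):
    return (a / b) / (c / d)

print(f"True OR: {odds_ratio(cases_exp, cases_unexp, ctrls_exp, ctrls_unexp):.2f}")  # 4.00

# Non-differential misclassification: same (assumed) sensitivity/specificity in both groups.
sens, spec = 0.80, 0.90

def misclassify(exp, unexp):
    obs_exp = sens * exp + (1 - spec) * unexp
    obs_unexp = (1 - sens) * exp + spec * unexp
    return obs_exp, obs_unexp

ce, cu = misclassify(cases_exp, cases_unexp)
te, tu = misclassify(ctrls_exp, ctrls_unexp)
print(f"Observed OR after non-differential misclassification: {odds_ratio(ce, cu, te, tu):.2f}")  # ~2.62
```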

 

Figure 2. Checklist: Key considerations for defining drug exposure

Velentgas, Priscilla, et al., eds. Developing a protocol for observational comparative effectiveness research: a user’s guide. Government Printing Office, 2013.

As alluded to above, pharmacoepidemiology is a field with complex research methods. We hope this article clarifies these three challenging concepts.

 

 

[i] Balcik, Pinar, and Gulcan Kahraman. “Pharmacoepidemiology.” IOSR Journal of Pharmacy 6.2 (February 2016): 57-62.

ISPOR’s Special Task Force on US Value Assessment Frameworks: A summary of dissenting opinions from four stakeholder groups

By Elizabeth Brouwer



The International Society for Pharmacoeconomics and Outcomes Research (ISPOR) recently published an issue of their Value in Health (VIH) journal featuring reports on Value Assessment Frameworks. This marks the culmination of a Spring 2016 initiative “to inform the shift toward a value-driven health care system by promoting the development and dissemination of high-quality, unbiased value assessment frameworks, by considering key methodological issues in defining and applying value frameworks to health care resource allocation decisions.” (VIH Editor’s note) The task force summarized and published their findings in a 7-part series, touching on the most important facets of value assessment. Several faculty of the CHOICE Institute at the University of Washington authored portions of the report, including Louis Garrison, Anirban Basu and Scott Ramsey.

In the spirit of open dialogue, the journal also published commentaries representing the perspectives of four stakeholder groups: payers (in this case, private insurance groups), patient advocates, academia, and the pharmaceutical industry. While supportive of value assessment in theory, each commentary critiqued aspects of the task force’s report, highlighting the contentious nature of value assessment in the US health care sector.

Three common themes emerged, however, among the dissenting opinions:

  1. Commenters saw CEA as a flawed tool, on which the task force placed too much emphasis

All commentaries except the academic perspective bemoaned the task force’s reliance on cost-effectiveness analysis. Payers, represented in an interview of two private insurance company CEOs, claimed that they do not have a choice on whether to cover most new drugs. If it’s useful at all, then, CEA informs the ways that payers distinguish between drugs of the same class. The insurers went on to claim that they are more interested in the way that CEA can highlight high-value uses for new drugs, as most are expected to be expensive regardless.

Patient advocates also saw CEA as a limited tool and were opposed to any value framework overly dependent on the cost per QALY paradigm.  The commentary equated CEAs to clinical trials—while informative, they imperfectly reflect how a drug will fare in the real world. Industry representatives, largely representing the PhRMA Foundation, agreed that the perspective provided by CEAs is too narrow and shouldn’t be the cornerstone for value assessment, at least in the context of coverage and reimbursement decisions.

  2. Commenters disagreed with how the task force measured benefits (the QALY)

All four commentaries noted the limitations of the quality-adjusted life-year (QALY). The patient advocates and the insurance CEOs both claimed that the QALY did not reflect their definition of health benefits. The insurance representatives reminded us that their businesses don’t give weight to societal value because it is not part of their business model. Similarly, the patient advocate said the QALY did not reflect patient preferences, where value is more broadly defined. The QALY, for example, does not adequately capture the influence of health care on functionality, ability to work, or family life. The patient advocate noted that while the task force identified these flaws and their methodological difficulties, it stopped short of recommending or taking any action to address them.

Industry advocates wrote that what makes the QALY useful (its ability to make comparisons across most health care conditions and settings) is also what makes it ill-suited for use in a complex health care system: individual parts of the care continuum cannot be considered in isolation. They also noted that the QALY is discriminatory toward vulnerable populations and does not reflect their customers’ preferences.

Mark Sculpher, Professor at the University of York representing health economic theory and academia, defended the QALY to an extent, noting that the measure is the most suitable available unit for measuring health. He acknowledged the QALY’s limitations in capturing all the benefits of health care, however, and noted that decision makers and not economists should be the ones defining benefit.

 

  3. Commenters noticed a disconnect between the reports and social/political realities

Commenters seemed disappointed that the task force did not go further in directing the practical application of value assessment frameworks within the US health care sector. The academic representative wrote that, while economic underpinnings are important, value frameworks ultimately need to be useful to, and reflect the values of, the decision makers. He argued that decision-makers’ buy-in is invaluable, as they hold the power to implement and execute resource allocation. Economics can provide a foundation for this but should not be the source of judgement relating to value if the US is going to take up value assessment frameworks to inform decisions.

Patient advocates and industry representatives went further in their criticism, saying the task force seemed disconnected from the existing health care climate. The patient advocate author felt the task force ignored the social and political realities in which health care decisions are made. Industry representatives pointed out that current policy, written in the Patient Protection and Affordable Care Act (PPACA), prohibited a QALY-based CEA because most decision makers in the US believe it inappropriate for use in health care decision making. Both groups wondered why the task force continued to rely on CEA methodology when it had been prohibited by the public sector.

 

The United States will continue to grapple with value assessment as it seeks to balance innovation with budgetary constraints. The ISPOR task force ultimately succeeded in its mission, which was never to specify a definitive and consensual value assessment framework, but instead to consider “key methodological issues in defining and applying value frameworks to health care resource allocation decisions.”

The commentaries also succeeded in their purpose: highlighting the ongoing tensions in creating value assessment frameworks that stakeholders can use. There is a need to improve tools that value health care to assure broader uptake, along with a need to accept flawed tools until we have better alternatives. The commentaries also underscore a chicken-and-egg phenomenon within health care policy. Value assessment frameworks need to align with the goals of decision-makers, but decision-makers also need value frameworks to help set goals.

Ultimately, Mark Sculpher may have summarized it best in his commentary. Value assessment frameworks ultimately seek to model the value of health care technology and services. But as Box’s adage reminds us: although all models are wrong, some are useful. How to make value assessment frameworks most useful moving forward remains a lively, complex conversation.

CHOICE Institute Director Discusses Amazon Health Care Announcement

You may have heard the big news that came out of Seattle recently: Amazon is partnering with Berkshire Hathaway and JPMorgan Chase to address health care costs and quality by creating an independent health care company for their employees. Further details of their plan remain a secret to the general public, and the companies are likely still working out logistics amongst themselves. Given the 1.2 million employees involved in the three companies, however, many in the health care industry are thinking through the likely impact of this new partnership.


Anirban Basu, Director of the CHOICE Institute and professor of health economics at the University of Washington, was recently referenced in two regional blogs describing the potential significance of the proposed plan:

According to Anirban Basu, a health care economist at the University of Washington, the trio could do a number of things to reform the health care system just by their sheer size and power alone. While most small and individual health care buyers have little power when it comes to directly negotiating with either health care providers or pharmaceutical companies, this partnership could change that—at least for those who qualify for it. Currently, price negotiating falls on third-party pharmacy benefit managers, at a cost then passed on to consumers.

Besides taking on bargaining power, Basu says Amazon may even open primary care clinics for their employees, but this could expand beyond their base.

It is important to note that while the new health plan may eventually have industry-wide effects, its scope will be limited to the companies’ employees at the beginning. And it is hardly a new phenomenon for employer groups to choose self-insurance as a means to control costs.

Henry Ford was one of the first industry giants to start his own health care insurance and delivery system in 1915, and America’s largest managed care organization, Kaiser Permanente, originally started as a health care program for employees of the Kaiser steel mills and shipyards.

Another important item to note is that America’s health care system has already been undergoing fundamental changes. While the United States Congress remains divided about how to move forward with the Affordable Care Act and improve the nation’s health care system overall, private health care companies are making their own moves. Hospital and insurance markets are becoming increasingly consolidated (with less competition to control prices), and some health care stakeholders are partnering and consolidating in innovative ways to capture market share (for example, the pharmacy company CVS Health just bought insurance company Aetna in January 2018).

Amazon’s new health care company could simply be joining these trends: historic trends of self-insuring companies to cut costs or newer trends of consolidating aspects of American health care for increased market power. However, it is entirely conceivable that the potent combination of Amazon (a technology industry giant), JPMorgan Chase (a banking industry giant), and Berkshire Hathaway (an investment giant) will bring something new to the table. Vox and StaTECHery are among many media outlets offering interesting predictions.

After the announcement, stock prices for major health care industries (e.g., Anthem, UnitedHealth, CVS, and Walgreens) experienced a sell-off as investors worry about the implications. However, experts believe that the current market would weather the storm due to the massive operational costs necessary for the partnership to enter the health care market. Moreover, the scale of Amazon, Berkshire Hathaway, and JPMorgan Chase will not be enough to compete with larger health care industry giants that already have purchasing power.

Will this health care partnership be a game changer? Perhaps, perhaps not. But as health care economists and health policy enthusiasts, students at the CHOICE Institute will certainly be watching our neighbors with interest.

[Written with the assistance of Mark Bounthavong and Nathaniel Hendrix.]

Welcome to Incremental Thoughts, the CHOICE students’ blog

Colleagues and friends,

Welcome to our new blog! The graduate students of the Comparative Health Outcomes, Policy, and Economics (CHOICE) Institute at the University of Washington are excited to share our experience as students in health economics and outcomes research (HEOR) with you and to learn from your experiences in return.

This project came about from a meeting of our student chapter of the International Society for Pharmacoeconomics and Outcomes Research (ISPOR), where we thought through ways to network more with other students and to engage with the conversations around HEOR taking place online. Blogs and Twitter have made the field’s luminaries easier than ever to contact, but at the meeting many of us felt that opportunities for students to voice their unique perspectives were lacking.

Thus, this blog. We’ll be featuring a broad range of articles here: upcoming research from our students, advice from our faculty for early career professionals in the field, and tips and tutorials on both established and newer methods.

Our blog is a collaborative effort, and we have many people to thank. We’re able to pay for this thanks to a grant from the ISPOR student network. And of course, the support of our professors has been essential — in particular, our senior editors Beth Devine and Ryan Hansen.

We’d love to have you along for this experiment!

Your editors,

Elizabeth Brouwer

Nathaniel Hendrix