Are private insurers in the United States using cost-effectiveness analysis evidence to set their formularies?

By Elizabeth Brouwer

As prices rise for many prescription drugs in the United States (US), stakeholders have made efforts to curb the cost of medications with varying degrees of success. One option put forth to contain drug spending is to connect drug coverage and cost-sharing to value, with cost-effectiveness analysis being one of the primary measures of drug value.

In 2010, a payer in the Pacific Northwest implemented a formulary in which cost-sharing for prescription drugs was driven by cost-effectiveness evidence. This value-based formulary (VBF) had 5 tiers, each defined by a range of cost-effectiveness ratios that determined a patient's copay amount, i.e., their level of cost-sharing (Table 1). There was an allotment for special cases in which a drug had no alternatives or treated a sensitive population; however, the majority of drugs fell within the cost-effectiveness tiers. Later analysis found that this VBF resulted in a net (including both payer and patient) decrease in medication expenditures of $8 per member per month, with no change in total medication or health services utilization. A 2018 literature review found slightly different (but still optimistic) results: value-based formulary design programs increased medication adherence without increasing health spending.
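As a rough sketch of how such a formulary maps cost-effectiveness evidence to cost-sharing, consider the following. The ICER cutoffs and copay amounts here are invented for illustration; they are not the payer's actual Table 1 values.

```python
def assign_tier(icer_per_qaly):
    """Map a drug's ICER ($/QALY) to a hypothetical formulary tier and copay."""
    # (upper ICER bound, tier, copay) -- illustrative values only
    tiers = [
        (10_000, 1, 5),
        (50_000, 2, 20),
        (100_000, 3, 35),
        (150_000, 4, 50),
        (float("inf"), 5, 75),
    ]
    for upper_bound, tier, copay in tiers:
        if icer_per_qaly < upper_bound:
            return tier, copay

print(assign_tier(8_000))    # high-value drug, lowest copay
print(assign_tier(120_000))  # lower-value drug, higher copay
```

Special-case drugs (no alternatives, sensitive populations) would be handled outside a mapping like this, as the payer's formulary did.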


Given the potential benefits of implementing value-based cost-sharing for prescription drugs, we wanted to know if other private payers in the US were using cost-effectiveness value evidence to set their drug formularies. If private payers were “moving toward value,” we would expect to see cost-sharing for high-value drugs getting cheaper relative to cost-sharing for low-value drugs (Figure 1).


To test this theory, we used claims data from a large portion of Americans with private, employer-sponsored health insurance to find the average out-of-pocket cost for each prescription drug in each year from 2010-2013. The collapsed claims data were then linked to the value designation (or “tier”) for each drug. We used a random effects model to see how out-of-pocket costs changed each year in each cost-effectiveness category. (For more details on our methods, please check out our paper, which was recently published in the journal PharmacoEconomics.)
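For readers unfamiliar with this kind of analysis, the two key steps, collapsing claims to drug-year average out-of-pocket costs and fitting a random-effects model, can be sketched roughly as follows. This is a toy illustration with synthetic data and assumed tier assignments, not the code or data from our paper.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_claims = 2000

# Synthetic claim-level data: drug, year, and out-of-pocket (OOP) cost
claims = pd.DataFrame({
    "drug": rng.choice([f"drug_{i}" for i in range(20)], size=n_claims),
    "year": rng.integers(2010, 2014, size=n_claims),
})
tier_map = {f"drug_{i}": (i % 3) + 1 for i in range(20)}  # assumed value tiers
claims["tier"] = claims["drug"].map(tier_map)
claims["oop"] = 40 - 5 * claims["tier"] + rng.normal(0, 5, n_claims)

# Step 1: collapse claims to one average OOP cost per drug per year
panel = claims.groupby(["drug", "year", "tier"], as_index=False)["oop"].mean()
panel["year_c"] = panel["year"] - 2010  # center year for numerical stability

# Step 2: random intercept per drug; the year-by-tier interaction asks
# whether cost-sharing trends differ across cost-effectiveness tiers
model = smf.mixedlm("oop ~ year_c * C(tier)", panel, groups=panel["drug"]).fit()
print(model.summary())
```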

The results revealed a few interesting trends.

Cost-sharing for prescription drugs was trending toward value in those years, but in a very specific way. Average cost-sharing across all “tiers” decreased over the time frame, and drugs with cost-effectiveness ratios below $10,000 per quality-adjusted life-year (QALY) were getting cheaper at a faster rate than those with cost-effectiveness ratios above that threshold. But there was no distinction in cost-sharing for drugs within those two groups, even accounting for generic status.

Additionally, the movement toward value that we saw was largely the result of increased use of generic drugs, rather than increased use of more cost-effective drugs. Splitting the data by generic status showed that, within both the generic and brand-name categories, use was not shifting toward higher-value drugs (Figure 2).


Figure 2

Our results indicate that there is probably space in private drug formularies to further encourage the use of higher-value drug options and, conversely, to further discourage the use of lower-value drug options. This is particularly true for drugs with incremental cost-effectiveness ratios (ICERs) of $10,000-$150,000 per QALY and above, where payers are largely ignoring differences in value.

One limitation of the analysis was that it was restricted to years 2010-2013. Whether private payers in the US have increased their use of value information since the implementation of the Affordable Care Act in 2014, or in response to continually rising drug prices, is an important question for further research.

In conclusion, there is evidence indicating payers have an opportunity to implement new (or expand existing) VBF programs. These programs have the potential to protect patient access to effective medical treatments while addressing issues with affordability in the US health care system.


Some challenges of working with claims databases

By Nathaniel Hendrix

Real-world evidence has become increasingly important as a data source for comparative effectiveness research, drug safety research, and adherence studies, among other types of research. In addition to sources such as electronic medical records, mobile data, and disease registries, much of the real-world evidence we use comes from large claims databases like Truven Health MarketScan or IQVIA, which record patients’ insurance claims for services and drugs. The enormous size of these databases means that researchers can detect subtle safety signals or study rare conditions where they may not have been able to previously.

Using these databases is not without its challenges, though. In this article, I’ll be discussing a few challenges that I’ve encountered as I’ve worked with faculty on a claims database project in the past year. It’s important for researchers to be aware of these limitations, as they necessarily inform our understanding of how claims-based studies should be designed and interpreted.

Challenge #1: Treatment selection bias

Treatment selection bias occurs when patients are assigned to treatment based on some characteristic that also affects the outcome of interest. If patients with more severe disease are assigned to Drug A rather than Drug B, patients using Drug A may have worse outcomes and we might conclude that Drug B is more effective. Alternatively, if patients with a certain comorbidity are preferentially prescribed a different drug than those patients without the comorbidity – an example of channeling bias – we may conclude that this drug is associated with this comorbidity.

These conclusions would be too hasty, though. What we'd like to do is simulate a randomized trial, where patients are assigned to treatment without regard for their personal characteristics. Methods such as propensity scores give us this option, but these methods are often unavailable to researchers working with claims data, because many disease characteristics are not recorded there.

An example might clarify this: imagine that you’re trying to assess the effect of HAART (highly active anti-retroviral therapy) on mortality in HIV patients. Disease characteristics such as CD4 count would be associated with both use of HAART and mortality, but are not recorded in claims data. We could adjust our analysis for other factors such as age and time since diagnosis, but our result would be biased. It’s important, therefore, to understand whether any covariates affect both treatment assignment and the outcome of interest, and to consider other data sources (such as disease registries) if they do.
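To make this concrete, here is a minimal propensity-score sketch on synthetic data. It conditions only on covariates a claims database would record (here, age and time since diagnosis, both simulated); an unrecorded confounder like CD4 count cannot enter the model, which is exactly the residual bias described above.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 5000
df = pd.DataFrame({
    "age": rng.normal(45, 10, n),
    "years_since_dx": rng.exponential(5, n),
})

# Simulated treatment assignment depending on the observed covariates
# (in real data it may also depend on unobserved ones, e.g. CD4 count)
lin = -2 + 0.03 * df["age"] + 0.10 * df["years_since_dx"]
df["treated"] = (rng.random(n) < 1 / (1 + np.exp(-lin))).astype(int)

# Propensity score: probability of treatment given observed covariates
ps_model = smf.logit("treated ~ age + years_since_dx", df).fit(disp=0)
df["pscore"] = ps_model.predict(df)

# Inverse-probability-of-treatment weights for a weighted outcome analysis
df["iptw"] = np.where(df["treated"] == 1,
                      1 / df["pscore"], 1 / (1 - df["pscore"]))
print(df[["pscore", "iptw"]].describe())
```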

Challenge #2: Claims data don’t include how the prescription was written

The nature of pharmacy claims data is to record when patients pick up their medications. This creates excellent opportunities for studying resource use and adherence, but these data, unfortunately, lack information about when and how the prescription for these medications was written.

One effect of this is that we don’t know how much time passes between a drug’s being prescribed and when it’s first used. Clearly, if several months pass between the initial prescription and a patient finally picking up that drug from the pharmacy, that would be time spent in non-adherence. We’re not able to capture that time, though. In the case of primary non-adherence, where a prescription is written for a drug that is never picked up at all, this behavior cannot be detected, potentially interfering with our ability to understand the causes of adverse outcomes and to assess the need for interventions that can improve adherence.
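One common adherence measure computed from fill dates and days' supply is the proportion of days covered (PDC). A minimal sketch follows; note that the calculation can only start from the first observed fill, so the delay between prescribing and that fill, and primary non-adherence entirely, remain invisible.

```python
from datetime import date, timedelta

def pdc(fills, period_start, period_end):
    """Proportion of days covered. fills: list of (fill_date, days_supply)."""
    covered = set()
    for fill_date, days_supply in fills:
        for offset in range(days_supply):
            day = fill_date + timedelta(days=offset)
            if period_start <= day <= period_end:
                covered.add(day)  # overlapping fills are not double-counted
    total_days = (period_end - period_start).days + 1
    return len(covered) / total_days

# Two 30-day fills with a gap, over a 90-day observation window
fills = [(date(2013, 1, 1), 30), (date(2013, 2, 15), 30)]
print(pdc(fills, date(2013, 1, 1), date(2013, 3, 31)))  # 60/90, about 0.67
```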

Challenge #3: Errors in days’ supply

Days’ supply is essential for calculating adherence and resource use, but errors sometimes appear that can be difficult to work with. Sometimes these are clear entry errors, such as a technician entering 310 days instead of 30. The payer usually rejects claims with an unusual days’ supply, but some such claims remain in the database.

Another issue is that certain errors in the days’ supply of drugs can be impossible to interpret. For example, if a drug is usually dispensed with an 84-day supply (i.e., 12 weeks) and a claim appears that has a 48-day supply, it’s impossible to know whether the prescriber had escalated the dose or the pharmacy staff had accidentally entered the days’ supply incorrectly. This is one of several reasons why it’s important to carefully consider imposing restrictions on the days’ supply for claims if this parameter is relevant to your research.

Errors such as these can significantly impact analyses that work with days’ supply of prescriptions, so it’s essential to be proactive about looking for cases where the days’ supply is not realistic or interpretable. Consider setting a realistic range to truncate days’ supply before you undertake your analysis.
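A simple first pass is to flag claims whose days' supply falls outside a plausible range before analysis. The 1-to-90-day bounds below are an arbitrary illustration; the right range depends on the drug and benefit design under study, and in-range anomalies like the 48-day example still require judgment.

```python
import pandas as pd

def flag_days_supply(claims, lower=1, upper=90):
    """Add a boolean column marking implausible days' supply values."""
    out = claims.copy()
    out["ds_flag"] = ~out["days_supply"].between(lower, upper)
    return out

claims = pd.DataFrame({"days_supply": [30, 310, 84, 48, 0]})
flagged = flag_days_supply(claims)
print(flagged)
# 310 and 0 are flagged as implausible; 48 passes the range check and must
# be judged against the drug's usual dispensing pattern instead
```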

Challenge #4: Generalizing results from claims studies can be difficult

Claims databases are usually grouped by insurance type. For example, a commercial claims database only contains encounters by commercially-insured patients and their dependents, excluding patients insured by Medicare and/or Medicaid, and may include only Medicare patients with supplementary insurance. Separating these populations into different databases can make it difficult, and sometimes unaffordable, for researchers to produce generalizable results, and merging databases introduces additional complexity.

These populations are all quite different from each other: commercially-insured enrollees are generally healthier than Medicaid enrollees of the same age, and the “dual-eligibles” – enrollees in both Medicare and Medicaid – are different from individuals enrolled in just one of these programs. Since it's costly and sometimes infeasible to capture all of these patients in a single analysis, you may need to hone your research question carefully so it can be answered by a single database instead of trying to access them all. Fortunately, sampling weights are now common; although they are somewhat cumbersome to work with, they help generalize results within an age and insurance grouping.

In summary, claims databases have added immeasurable value to several fields of research by collecting information on the real-world behavior of clinicians and patients. Still, there are some significant challenges that need to be taken into account when considering using claims data. Finding a good scientific question that suits these data means understanding their limitations. These are a few of the most important ones, but anyone who works with these data long enough will be sure to discover challenges unique to their own research program.

A target for harm reduction in HIV: reduced illicit drug use is associated with increased viral suppression.

By Lauren Strand

In the midst of a fatal drug epidemic and shifting drug policy in the United States, there is continued interest in the relationship between illicit drug use and negative health outcomes. Because substance use is difficult to characterize in individuals, studies often target sub-populations with more substantial and better-documented substance use profiles. One example is people living with HIV, in whom substance use has been associated with poor engagement in the HIV care continuum, lower likelihood of receiving antiretroviral therapy, reduced adherence to therapy, and increased disease-related mortality. Recently I collaborated on a study finding that a reduction in the frequency of illicit opioid and methamphetamine use is associated with viral suppression among people living with HIV. Since viral suppression is important both for an individual's health and for reducing disease transmission, this finding has important policy implications for harm reduction around substance use frequency.

This work was spearheaded by Robin Nance and Dr. Maria Esther Perez Trejo and advised by Dr. Heidi Crane and Dr. Chris Delaney, colleagues and mentors of mine during my time at the Collaborative Health Studies Coordinating Center (University of Washington). The publication in Clinical Infectious Diseases focuses on the longitudinal relationship between reducing illicit drug use frequency and a key biomarker in HIV, viral load (VL), among people living with HIV. This study used longitudinal data from the Centers for AIDS Research Network of Integrated Clinical Sites (CNICS) cohort. CNICS is an ongoing observational study consisting of more than 35,000 people living with HIV receiving primary care at one of eight sites (Seattle, San Francisco, San Diego, Cleveland, Chapel Hill, Birmingham, Baltimore, and Boston). Importantly, CNICS provides peer-reviewed open access to patient data including clinical outcomes, biological data, and patient-reported outcomes. This study also used individual-level data from four studies in the Criminal Justice Seek, Test, Treat, and Retain (STTR) collaboration. STTR is an effort that combines data from participating observational studies and trials to improve outcomes along the HIV care continuum for people involved in some part of the criminal justice system, for example individuals recently released from jail who have struggled with substance use disorders.

Within CNICS, substance use was collected at clinical assessment via tablets approximately every six months with instruments including the modified Alcohol, Smoking, and Substance Involvement Screening Test and the Alcohol Use Disorders Identification Test. Drug use was defined as frequency of use in the last 30 days and was further categorized according to longitudinal trends from baseline: abstinence (no use at baseline or follow-up), reduction in use without abstinence (use at baseline that has declined at follow-up), and non-decreasing (similar or increased use). Drug categories were marijuana, cocaine/crack, methamphetamine, and heroin/other illicit opioids.  Viral suppression was defined as an undetectable VL (<=400 copies/mL). Analytic models for each individual drug were joint longitudinal and survival models with time-varying substance use and adjustment for demographics, follow-up time, cohort entry year, and other concomitant drugs including alcohol and binge alcohol. These longitudinal models account for repeated measures and differential loss to follow-up (unbalanced panels).

Analyses (mean follow-up of 3.9 years) included approximately 12,000 people living with HIV with a mean age of 44, of whom 47% were white. Marijuana was the most widely used substance at baseline, though methamphetamine use was also common. Relative to non-decreasing use, abstinence was associated with an increase in the odds of viral suppression ranging from 42% for marijuana to 118% for opioids (all four substance groups statistically significant). Reduction in use without abstinence was associated with an increase of 65% for methamphetamine and 172% for opioids. The directionality and statistical significance of these results were maintained in sensitivity analyses with pooled fixed effects meta-analysis using both the CNICS and STTR studies.

Ultimately, findings from this large-sample longitudinal analysis suggest that abstinence from all drug groups increases the likelihood of viral suppression and, more interestingly, that reducing frequency of use without abstinence may also increase the likelihood of viral suppression for illicit opioids and methamphetamine. This finding may support the use of medication-assisted treatment (MAT) to reduce substance use, which could have the potential to improve disease-related outcomes for people living with HIV. However, this study did not evaluate why individuals may have increased or decreased their use of illicit substances (e.g., MAT or other treatment programs). In any case, reduction of illicit substances like opioids and methamphetamine (even when abstinence is not achieved) seems like a logical target for harm reduction interventions in people living with HIV, and likely in the broader population, to improve overall health outcomes.

One extension of this work would be to evaluate the relative value of programs targeting abstinence and substance use reduction among individuals with HIV compared with other programs. This, of course, requires a true causal relationship between substance use and viral load, which is likely mediated through ART adherence. A simple Markov model could include states for suppressed and not suppressed; however, because suppression reduces the risk of transmission, we might also incorporate the shifting dynamics of the population of people living with HIV. Both transmission and individual outcomes were considered in a recent cost-effectiveness analysis of financial incentives for viral suppression authored by CHOICE Alumna Dr. Blythe Adamson and Professors Josh Carlson and Lou Garrison. The main study finding was that paying individuals to take HIV medications was associated with health improvement, reduced transmission, and reduced healthcare costs. While this finding is fascinating, substance use may be an important contextual consideration. One previous study found that financial incentives did not improve viral suppression among substance users, and it is unclear how financial incentives may impact drug use and addiction. This is an active area of research and debate. Our study did not look at increases in substance use and viral suppression because we wanted to address the question around reduction and abstinence. Regardless, additional research on strategies to improve viral suppression is needed, as well as a better understanding of the interplay between substance use behavior, other risk behaviors, adherence, and viral suppression among people living with HIV.
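As a starting point for the extension sketched above, a two-state Markov cohort model is only a few lines of code. The transition probabilities here are illustrative assumptions, not estimates from our study.

```python
import numpy as np

# States: 0 = virally suppressed, 1 = not suppressed
# Annual transition matrix (rows: from-state, columns: to-state);
# the probabilities are made-up placeholders for illustration only
P = np.array([
    [0.85, 0.15],
    [0.40, 0.60],
])

cohort = np.array([0.5, 0.5])  # initial distribution across states
for _ in range(10):            # run the cohort forward 10 annual cycles
    cohort = cohort @ P

print(cohort)  # share suppressed vs. not suppressed after 10 years
```

A full analysis would attach costs and QALY weights to each state and, as noted above, might replace the fixed transition probabilities with a transmission-dynamic structure.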

Economic evaluation of New Rural Cooperative Medical Scheme in China

By Boshen Jiao

In China, while private health insurance is growing rapidly, government-funded basic health insurance still dominates the health care landscape. The Chinese government defines three types of beneficiaries: urban employees, urban residents, and rural residents. Accordingly, three main types of healthcare coverage plans were implemented in China: the Urban Employee Basic Medical Insurance, the Urban Resident Basic Medical Insurance, and the New Rural Cooperative Medical Scheme (NCMS).

The NCMS, which was initiated in 2003 and is financed by both governments and individuals, was specifically designed for rural residents in China. The Chinese government can feel proud that 98% of rural residents are covered, and this has rightly been viewed as a great success. In particular, many of the newly covered individuals are poor and underserved, with a long history of struggling for access to basic health care.

However, the health and economic consequences of the NCMS might not be as pleasing. While its effectiveness in reducing mortality remains controversial based on current scientific evidence, the NCMS resulted in a 61% increase in out-of-pocket spending. Given that the NCMS has finite resources and impacts a large number of lives, it was critical to do a “thought experiment” and assess the cost-effectiveness of the NCMS. This is the subject of a paper I recently published with Dr. Jinjing Wu from the Asian Demographic Research Institute at Shanghai University and several coauthors from the Columbia Mailman School of Public Health. The paper, titled “The cost-effectiveness analysis of the New Rural Cooperative Medical Scheme in China,” was recently published in PLOS ONE.

Initial estimates of the NCMS's effect on mortality were based on quasi-experimental studies that produced conflicting results. Some argued that the NCMS significantly decreased the death rate among the elderly in the eastern region, while another study using a nationally representative sample found no statistically significant effect. Although it was tempting to embrace the favorable results, our investigators decided to rely on the less favorable study. We made this call mainly because the nationally representative sample was derived from the Disease Surveillance Point system, which is widely accepted as a very reliable data source. Besides, we hoped to draw from the whole country rather than focusing only on East China, where more economic resources and better healthcare are available. In addition to the effect on mortality, the NCMS had been shown to lower the risk of hypertension, which was also included as an effectiveness parameter in our model.

Because of uncertainty around its effect on rural residents' survival, it is likely that the NCMS is not cost-effective. Based on our analysis, the NCMS buys one additional QALY for rural residents at a social price of 71,480 international (Int) dollars (note: costs and economic benefits were converted into 2013 Int dollars using the purchasing power parity exchange rate reported by the World Bank). This is not optimal for China. If we believe that three times per capita GDP is a fair willingness-to-pay threshold (Int$845,659), the NCMS had only a 33% chance of being cost-effective. The results were unsurprising but nonetheless disappointing. One possibility that we did not explore is that the elderly benefit the most from the NCMS. Based on the nationally representative sample, however, the NCMS appears costly to society while failing to produce sufficient health benefits.
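For readers curious where a statement like "a 33% chance to be cost-effective" comes from: in a probabilistic sensitivity analysis (PSA) it is the share of simulation draws in which net monetary benefit is positive at the chosen willingness-to-pay threshold. The toy inputs below are illustrative and are not the study's actual PSA distributions.

```python
import numpy as np

rng = np.random.default_rng(2)
n_sims = 10_000

# Hypothetical PSA draws of incremental costs (Int$) and incremental QALYs
inc_cost = rng.normal(71_480, 20_000, n_sims)
inc_qaly = rng.normal(1.0, 0.6, n_sims)

wtp = 150_000  # illustrative willingness-to-pay threshold (Int$/QALY)
nmb = wtp * inc_qaly - inc_cost    # net monetary benefit per draw
prob_ce = float((nmb > 0).mean())  # probability cost-effective

print(f"Probability cost-effective at Int${wtp:,}/QALY: {prob_ce:.0%}")
```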

We discussed the reasons why the NCMS appears to be inefficient. The current literature describes the NCMS as providing catastrophic coverage that mostly covers inpatient services. People may barely use preventive care or other necessary outpatient services, which plausibly leads to severe illness and costly complications in the future. Moreover, the NCMS is associated with high copayments, which restrict low-income rural residents' access to health care and fail to reduce out-of-pocket expenses. We concluded that, while the Chinese government has indeed achieved great success in coverage expansion, the program's efficiency should be a consideration for future improvements. To achieve this goal, cost-effectiveness analysis could be a useful tool when designing the plan.

Our study presented an overall picture of the cost-effectiveness of the NCMS, in which the effect was estimated based on an aggregate of data collected from different regions. However, heterogeneity across regions, particularly at the county level, should be taken into account in future studies. This is because county governments play a critical role in financing the NCMS, and their budget constraints have a fundamental effect on the design and implementation of the plan. As a consequence, the health outcomes of the NCMS may vary dramatically across counties. Our analysis would have been enriched, and would have provided more informative policy implications, if county-level data could have been obtained.

Updated estimates of cost-effectiveness for plaque psoriasis treatments

Along with co-authors from ICER and The CHOICE Institute, I recently published a paper in JMCP titled, “Cost-effectiveness of targeted pharmacotherapy for moderate-to-severe plaque psoriasis.” In this publication, we sought to update estimates of cost-effectiveness for systemic therapies useful in the population of patients with psoriasis for whom methotrexate and phototherapy are not enough.

Starting in 1998, a class of drugs acting on tumor necrosis factor alpha (TNFα) has been the mainstay of psoriasis treatment in this population. The drugs in this class, including adalimumab, etanercept, and infliximab, are still widely used due to their long history of safety and their lower cost than some competitors. They are less effective than many newer treatments, however, particularly drugs inhibiting interleukin-17 (IL-17) such as brodalumab, ixekizumab, and secukinumab.

This presents a significant challenge to decision-makers: is it better to initiate targeted treatment with a less effective, less costly option, or a more effective, costlier one? We found that the answer to this question is complicated by several current gaps in knowledge. First, there is some evidence that prior exposure to biologic drugs is associated with lower effectiveness in subsequent biologics. This means that the selection of a first targeted treatment must balance cost considerations with the possibility of losing effectiveness in subsequent targeted treatments if the first is not effective.

A related issue is that the duration of effectiveness (or “drug survival”) for each of these drugs is currently poorly characterized in the US context. Drug discontinuation and switching is significantly impacted by policy considerations such as requirements for step therapy and restrictions on dose escalation. Therefore, while there is a reasonable amount of research about drug survival in Europe, it is not clear how well this information translates to the US.

Another difficulty of performing cost-effectiveness research in this disease area is the difficulty of mapping utility weights onto trial outcomes. Every drug considered in our analysis used percentage change in the Psoriasis Area Severity Index (PASI) over baseline. Because this is not an absolute measure, it required that we assume that patients have comparable baseline PASI scores between studies. In other words, we had to assume that a given percent improvement in PASI was equivalent to a given increase in health-related quality of life. This means that if one study’s population had less severe psoriasis at baseline, we probably overstated the utility benefit of that drug.
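The assumption can be stated as a tiny function: to convert a percent PASI improvement into a utility gain, one must fix a common baseline utility and a full-response utility. The values below are placeholders, not the utilities used in our model.

```python
def utility_gain(pasi_pct_improvement, baseline_utility=0.70,
                 full_response_utility=0.90):
    """Linearly interpolate utility between baseline and full response.

    Assumes every trial population starts from the same baseline severity,
    which is exactly the simplification discussed above.
    """
    fraction = pasi_pct_improvement / 100
    return fraction * (full_response_utility - baseline_utility)

print(utility_gain(75))   # a PASI-75 response under these assumptions
print(utility_gain(100))  # complete clearance
```

If one trial's population had milder disease at baseline, the same percent improvement would map to a smaller true utility gain than this function assumes.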

In light of these gaps in knowledge, our analytic strategy was to model a simulated cohort of patients with incident use of targeted drugs. After taking a first targeted drug, they could be switched to a second targeted drug or cease targeted therapy. We made the decision to limit patients to two lines of targeted treatment in order to keep the paper focused on the issue of initial treatment.

Figure: cost-effectiveness frontier

What we found is a nuanced picture of cost-effectiveness in this disease area. In agreement with older cost-effectiveness studies, we found that infliximab is the most cost-effective TNFα drug and, along with the PDE-4 inhibitor apremilast, is likely to be the most cost-effective treatment at lower willingness-to-pay (WTP) thresholds. However, at higher WTP thresholds of $150,000 per quality-adjusted life year and above, we found that the IL-17 inhibitors brodalumab and secukinumab become more likely to be the most cost-effective.
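The WTP-dependent ranking works like this: at each threshold, the strategy with the highest expected net monetary benefit is preferred. The costs and QALYs below are invented placeholders chosen only to reproduce the qualitative pattern, not our paper's estimates.

```python
# Hypothetical lifetime (cost, QALY) pairs per strategy -- placeholders only
strategies = {
    "infliximab": (120_000, 10.0),
    "apremilast": (100_000, 9.7),
    "secukinumab": (180_000, 10.5),
}

def best_at(wtp):
    """Strategy maximizing net monetary benefit at a given WTP ($/QALY)."""
    return max(strategies,
               key=lambda s: wtp * strategies[s][1] - strategies[s][0])

for wtp in (50_000, 150_000, 250_000):
    print(f"${wtp:,}/QALY -> {best_at(wtp)}")
```

With these placeholder inputs, the cheaper option wins at a low threshold and the more effective, costlier option wins at high thresholds, mirroring the pattern described above.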

The ambiguity of these results suggests both the importance of closing the gaps in knowledge mentioned above and of considering factors beyond cost-effectiveness in coverage decisions. For example, apremilast is the only oral drug we considered and patients may be willing to trade lower effectiveness to avoid injections. Another consideration is that IL-17 inhibitors are contraindicated for patients with inflammatory bowel disease, suggesting that payers should make a variety of drug classes accessible in order to provide for all patients.

In summary, these results should be seen as provisional, not only because many important parameters are still uncertain, but also because several new drugs and biosimilars for plaque psoriasis are nearing release. Decision-makers will need to keep an eye on emerging evidence in order to make rational decisions about this costly and impactful class of drugs.