Updated estimates of cost-effectiveness for plaque psoriasis treatments

Along with co-authors from ICER and The CHOICE Institute, I recently published a paper in JMCP titled, “Cost-effectiveness of targeted pharmacotherapy for moderate-to-severe plaque psoriasis.” In this publication, we sought to update estimates of cost-effectiveness for systemic therapies useful in the population of patients with psoriasis for whom methotrexate and phototherapy are not enough.

Starting in 1998, a class of drugs acting on tumor necrosis factor alpha (TNFα) has been the mainstay of psoriasis treatment in this population. The drugs in this class, including adalimumab, etanercept, and infliximab, are still widely used due to their long history of safety and their lower cost relative to some competitors. They are less effective than many newer treatments, however, particularly drugs inhibiting interleukin-17 (IL-17), such as brodalumab, ixekizumab, and secukinumab.

This presents a significant challenge to decision-makers: is it better to initiate targeted treatment with a less effective, less costly option, or a more effective, costlier one? We found that the answer to this question is complicated by several current gaps in knowledge. First, there is some evidence that prior exposure to biologic drugs is associated with lower effectiveness of subsequent biologics. This means that the selection of a first targeted treatment must balance cost considerations against the possibility of losing effectiveness in subsequent targeted treatments if the first fails.

A related issue is that the duration of effectiveness (or “drug survival”) for each of these drugs is currently poorly characterized in the US context. Drug discontinuation and switching are significantly affected by policy considerations such as requirements for step therapy and restrictions on dose escalation. Therefore, while there is a reasonable amount of research on drug survival in Europe, it is not clear how well this information translates to the US.

Another challenge of performing cost-effectiveness research in this disease area is mapping utility weights onto trial outcomes. Every drug considered in our analysis reported efficacy as percentage improvement in the Psoriasis Area and Severity Index (PASI) from baseline. Because this is a relative rather than an absolute measure, we had to assume that baseline PASI scores were comparable between studies. In other words, we had to assume that a given percent improvement in PASI corresponded to a given increase in health-related quality of life. This means that if one study’s population had less severe psoriasis at baseline, we probably overstated the utility benefit of that drug.

In light of these gaps in knowledge, our analytic strategy was to model a simulated cohort of patients with incident use of targeted drugs. After taking a first targeted drug, they could be switched to a second targeted drug or cease targeted therapy. We limited patients to two lines of targeted treatment in order to keep the paper focused on the issue of initial treatment.
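As a rough illustration of this two-line structure (using made-up response probabilities, not the parameters from the paper), the treatment sequence can be simulated like this:

```python
import random

random.seed(0)

# Hypothetical inputs for illustration only -- not the published model's values.
P_RESPOND_FIRST = 0.6    # assumed response probability on the first targeted drug
BIOLOGIC_PENALTY = 0.8   # assumed multiplier on response after a failed biologic
P_RESPOND_SECOND = P_RESPOND_FIRST * BIOLOGIC_PENALTY

def simulate_patient():
    """Send one simulated patient through at most two lines of targeted therapy."""
    if random.random() < P_RESPOND_FIRST:
        return "line 1"
    if random.random() < P_RESPOND_SECOND:
        return "line 2"
    return "ceased"

cohort = [simulate_patient() for _ in range(10_000)]
for outcome in ("line 1", "line 2", "ceased"):
    print(outcome, cohort.count(outcome) / len(cohort))
```

In the full model each pathway would also accumulate costs and quality-adjusted life years, but this sketch shows how prior biologic exposure can be made to penalize second-line effectiveness.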

[Figure: psoriasis cost-effectiveness frontier]

What we found is a nuanced picture of cost-effectiveness in this disease area. In agreement with older cost-effectiveness studies, we found that infliximab is the most cost-effective TNFɑ drug and, along with the PDE-4 inhibitor apremilast, is likely to be the most cost-effective treatment at lower willingness-to-pay (WTP) thresholds. However, at higher WTP thresholds of $150,000 per quality-adjusted life year and above, we found that the IL-17 inhibitors brodalumab and secukinumab become more likely to be the most cost-effective.
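The way a willingness-to-pay threshold flips the preferred option can be shown with a net monetary benefit calculation. The drug labels, costs, and QALY totals below are hypothetical round numbers chosen only to illustrate the crossover, not results from our paper:

```python
def best_option(options, wtp):
    """Return the option with the highest net monetary benefit,
    NMB = WTP x QALYs - cost."""
    return max(options, key=lambda o: wtp * o["qalys"] - o["cost"])

# Hypothetical lifetime totals for two strategies (illustration only).
options = [
    {"name": "cheaper TNF drug", "cost": 150_000, "qalys": 8.0},
    {"name": "costlier IL-17 drug", "cost": 300_000, "qalys": 9.2},
]

print(best_option(options, wtp=100_000)["name"])  # cheaper TNF drug
print(best_option(options, wtp=150_000)["name"])  # costlier IL-17 drug
```

In this made-up example the crossover sits at the incremental cost-effectiveness ratio of the pricier drug ($150,000 / 1.2 QALYs = $125,000 per QALY), which is why conclusions in this area are so sensitive to the WTP threshold.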

The ambiguity of these results suggests both the importance of closing the gaps in knowledge mentioned above and of considering factors beyond cost-effectiveness in coverage decisions. For example, apremilast is the only oral drug we considered and patients may be willing to trade lower effectiveness to avoid injections. Another consideration is that IL-17 inhibitors are contraindicated for patients with inflammatory bowel disease, suggesting that payers should make a variety of drug classes accessible in order to provide for all patients.

In summary, these results should be seen as provisional, not only because many important parameters are still uncertain, but also because several new drugs and biosimilars for plaque psoriasis are nearing release. Decision-makers will need to keep an eye on emerging evidence in order to make rational decisions about this costly and impactful class of drugs.

Open source value assessment needs open source economics

Three members of the Innovation and Value Initiative (IVI) recently published a paper entitled “Open-Source Tools for Value Assessment: A Promising Approach” in the Journal of Clinical Pathways. This paper lays out, in brief, some of the ways that open-source models can contribute to the challenging environment in which value assessment operates in the US.

Unlike many nations where cost-effectiveness analysis is widely used and accepted, the US has a highly decentralized healthcare system. Even when up-to-date US-based models are available, they are unlikely to be applicable to every patient population. This matters because not only does treatment response vary between populations, but so does the conception of value.

Meanwhile, healthcare decision makers must assess what evidence on value exists while simultaneously trying to assess its applicability to their patients, all without robust guidance on how to adapt the conclusions of modeling studies.

IVI has tried to change this by releasing an open-source microsimulation model for rheumatoid arthritis – a common disease whose treatment with biologics has become a significant driver of drug costs for many payers. This model is extremely flexible and speaks to the needs of healthcare decision makers by allowing for modification of treatment sequences, elements considered in the definition of value, and even whether results are formatted as a cost-effectiveness analysis or a multi-criteria decision analysis. Better still, this software is released as both a convenient web-app and as an R package with fully open code.

This is a tremendous step forward for value assessment in the US and sets a new standard for openness in modeling. Still, I can’t help but wonder how this transition from proprietary, closed models to open models will be funded. After all, IVI is in a unique position, with funding from many large pharmaceutical companies and industry organizations. If every consulting company had to organize a consortium to fund its open-source modeling initiatives, this would quickly become very burdensome.

As the “Open-Source Tools” paper points out, IVI took its inspiration for its rheumatoid arthritis model from open-source software, and we can do the same in thinking about how open-source modeling efforts could be supported. Some companies that develop open-source software support themselves by offering paid support plans for their products. A typical example is Canonical, which develops the Ubuntu Linux distribution. While it offers its operating system for free to anyone who wants it, it also offers paid plans that include help with deployment and maintenance.

It’s hard to know whether the scale of a typical model’s distribution would allow for this source of income, though. While Linux users number in the millions, a typical value model may have just dozens of users. Competition is likely to be important to motivate the timely development and updating of models, but the question of funding needs to be solved before more developers can take part.

The real value of an open-source model also depends on the data it uses. To truly customize a model to a patient population, more granular data on patient response need to be made available from clinical trials and disease registries. Until this happens, the conclusions of models may rest on shifts in response estimated from small samples.

The shift toward open-source modeling is an important means of responding to the challenges presented by the US healthcare market. However, many unsolved problems still prevent more models from being developed in an open and flexible way.

Health economics in five words

There was recently a Twitter trend of people trying to describe programming in five words. The responses ranged from funny to puzzling to inspiring.

At our celebration of the end of the academic year, students, post-docs, and faculty of the CHOICE Institute held a similar contest at our weekly seminar, instructing attendees to “describe health economics and outcomes research (HEOR) in five words.” Here are a few of my favorite entries:

  • “How to hurt less, cheap.” (Samantha Clark)
  • “Math predicting value of medicine.” (Blythe Adamson)
  • “Examining well-being trade-offs and technology.” (Doug Barthold)
  • “Yo, treat sick people cost-effectively.” (Shuxian Chen)
  • “To each his own evaluation.” (Nobody claimed this one, unfortunately!)
  • “What is beyond opportunity cost?” (Enrique Saldarriaga)

As researchers in HEOR, the challenge of trying to explain what we do to outsiders is familiar to us all. That’s why it was really fun to try to encapsulate our field into just a few words.

Biosimilar Litigation in the United States: Redefining the Patent Dance

By Simi Grewal, MHS, PhD Student

In recent years, the U.S. pharmaceutical world has been abuzz with the emergence of biosimilars—products that are very similar, but not identical, to a reference biologic product. To date, twelve biosimilars have received FDA approval under the Biologics Price Competition and Innovation Act (BPCIA), which was passed by Congress in 2010. However, among those biosimilars that have been licensed by the FDA, only two have been marketed. While many factors influence the time lag between FDA approval and biosimilar marketing, complex patent litigation may well contribute to delays in market launch. So, let’s explore the intricate information exchange surrounding FDA biosimilar application reviews and key litigation decisions in the biosimilar landscape.

Image credit: Promega

What makes biosimilars different from generics?

First, it’s important to understand what makes biosimilars and their reference biologic products so unique. Unlike a generic for a small-molecule drug, a biologic is manufactured in a living system—a complex process that is extremely challenging to replicate exactly and thus yields products that are similar to, but not exact copies of, the reference biologic. The process is also expensive. On average, estimated R&D costs for a biosimilar range from $40 to $300 million and development can take up to five years, whereas a small-molecule generic costs $2 to $5 million in R&D and can take up to three years.

What is the biosimilar “patent dance?”

With the BPCIA, Congress has made a strong effort to help improve the affordability and accessibility of clinically powerful biologics. In a sense, it has sought to improve upon the Hatch-Waxman Act used for generics by considering the unique issues that may arise as biologics reach the expiration of their 12-year exclusivity period. Part of this intricate vetting structure is termed the “patent dance.” The act specifies several steps to follow: 1) After the FDA accepts an abbreviated Biologics License Application (aBLA), the BPCIA stipulates that the biosimilar maker “shall” provide its aBLA and manufacturing information to the reference biologic maker. 2) The reference biologic maker sends a list of patents that may be infringed by the biosimilar maker. 3) The biosimilar maker provides its responses. The steps continue until contentions are resolved.

Additional components of the dance are also in play. For example, the BPCIA indicates that biosimilar makers must provide 180-day notice prior to marketing their product. This provision may have been intended to help resolve disputes before biosimilar market launch, before the assessment of damages (i.e., losses to either party from potential revenue of marketed products) complicates litigation. However, critics argue that if biosimilar makers may only give notice after FDA approval of their biosimilar, the provision essentially extends the reference biologic’s protection and delays market availability for a competitor. Several biosimilar makers are now providing their notice prior to FDA approval, and the notice has been a component of lawsuits brought against biosimilar makers.

It may seem confusing that a patent dispute surrounding a single biosimilar product can become so complicated. But it’s important to consider how patents function for biologics. The patent on a new chemical entity is well understood, but for a biologic, many other aspects of the manufacturing process and product use can be patented—and ultimately disputed. This expansive landscape can lead to dozens of patents surrounding a single product. AbbVie’s Humira®, for example, has recently received a great deal of attention for being protected by over 100 patents related to the product.

How have information disclosure and patent litigation for biosimilars played out so far?

Experience to date has revealed that while some biologic manufacturers follow the patent dispute guidance, others seem to be setting new steps or circling around those laid out in the BPCIA altogether. In the Amgen v. Sandoz case, which began in 2014 and was ultimately resolved in 2017, Sandoz refused to provide its aBLA for Zarxio®—a biosimilar to Amgen’s Neupogen®. Amgen then sued under both federal and California state law. The case ultimately landed in the Supreme Court and led to a key decision—compliance with the BPCIA’s information disclosure (i.e., the “patent dance”) cannot be enforced under either federal or state law. Instead, if a biosimilar maker does not follow the patent dance, the reference biologic maker can sue the biosimilar maker for patent infringement. One of the more recent cases, Amgen v. Adello, again involves a biosimilar to Amgen’s Neupogen®. The suit by Amgen, filed in March 2018, is essentially blind (i.e., it does not specify all patent infringements by Adello) due to minimal information disclosure by Adello. In addition, the case touches on another flex point of the BPCIA: the 180-day marketing notice. Adello provided this notice prior to the FDA’s approval of its biosimilar, and whether the notice may be provided before FDA approval remains a further point of contention in interpreting the act.

What’s next in biosimilar patent litigation?

With the BPCIA in its nascent stages, we are likely to see its application redefined in years to come, just as Hatch-Waxman evolved in the generics market. Currently, biosimilar and reference biologic makers engage with the act’s provisions after careful consideration of how they will impact their products’ time on market and future products’ regulatory and marketing success. In the meantime, legislators are also assessing whether the structure of the BPCIA adequately provides a framework for achieving Congress’s goal of increasing biologic affordability and accessibility in the U.S.

Looking ahead, at least eight additional potential patent disputes are anticipated in 2018. Actions taken by private parties and stakeholders in the U.S. government will continue to define how the BPCIA is interpreted and applied in the important biologics space.

Test performance estimates without a gold standard: a short tutorial on JAGS

Photo credit: Flickr user mattbuck

One of the more distinctive applications of Bayesian statistics is finding estimates for unknown values that depend upon other unknown values. By taking advantage of the Bayesian ability to integrate prior knowledge into its models, you can develop parameter estimates using priors that are little more than a guess.

This application of Bayesian statistics is commonly seen in diagnostics. When there isn’t a gold standard test that allows simple comparisons, Bayesian models are able to use data on test results to estimate the performance of these tests and the prevalence of the disease. Whether it’s a new test or a new population where the test is unproven, these analyses allow us to glimpse important aspects of diagnostic usage with only scant data.

The pioneering paper that developed these methods is titled “Bayesian estimation of disease prevalence and the parameters of diagnostic tests in the absence of a gold standard” by Lawrence Joseph, Theresa Gyorkos, and Louis Coupal. They collected the results of two tests for the Strongyloides parasite among Cambodian immigrants to Canada in the 70s. Since there was no knowledge of how common the parasite was in this group, they used an uninformative prior for its prevalence, but were able to solicit vague priors about the two tests’ performance from clinical experts. From these priors they built distributions which they then ran through a Gibbs sampler.

A Gibbs sampler is a program that repeatedly draws samples of the parameters – in our case, test performance and prevalence – from their conditional distributions given the data and the other parameters. Because of the way the sampler moves from one parameter value to the next, it devotes most of its samples to high-probability regions of the posterior. The parameter estimates are therefore essentially histograms of the samples the algorithm has drawn for each parameter value.

JAGS, which stands for “Just Another Gibbs Sampler,” is a commonly used Gibbs sampler. It’s not the only one, but it has a convenient R interface and a lot of literature to support its use. I recently wrote a tutorial on its R interface that recreates the Joseph, Gyorkos, and Coupal paper. You don’t need any datasets to run it, as you can easily simulate the inputs of the two Strongyloides tests.

The first part deals with estimating performance from the two parasite tests independently. This means building models of the test results as Bernoulli samples from a distribution that depends on each test’s sensitivity and specificity, as well as the disease prevalence.
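To make the idea concrete, here is a minimal Gibbs sampler for this single-test latent-class model, written in plain Python rather than JAGS. It uses flat Beta(1, 1) priors for simplicity (the paper elicits informative priors from experts, which the model really needs to be well identified), and the test counts below are invented:

```python
import random

random.seed(42)

def gibbs_one_test(n, y, iters=2000, burn=500):
    """Gibbs sampler for prevalence (pi), sensitivity (se), and specificity (sp)
    of one diagnostic test with no gold standard; n subjects, y positives."""
    pi, se, sp = 0.5, 0.8, 0.8                      # starting values
    draws = []
    for it in range(iters):
        # Step 1: impute latent true-disease counts among positives/negatives.
        p_pos = pi * se / (pi * se + (1 - pi) * (1 - sp))
        p_neg = pi * (1 - se) / (pi * (1 - se) + (1 - pi) * sp)
        d_pos = sum(random.random() < p_pos for _ in range(y))
        d_neg = sum(random.random() < p_neg for _ in range(n - y))
        # Step 2: update each parameter from its Beta full conditional.
        pi = random.betavariate(1 + d_pos + d_neg, 1 + n - d_pos - d_neg)
        se = random.betavariate(1 + d_pos, 1 + d_neg)
        sp = random.betavariate(1 + (n - y - d_neg), 1 + (y - d_pos))
        if it >= burn:
            draws.append((pi, se, sp))
    return draws

draws = gibbs_one_test(n=162, y=40)
print(sum(d[0] for d in draws) / len(draws))  # posterior mean prevalence
```

The retained draws after burn-in are exactly the “histograms” described above: summarize them to get posterior means and credible intervals.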

The second half of the tutorial deals with learning to use the data from the two tests together. This is significantly more complex, as we need to model the joint probability of each possible combination of the two tests together. To do this, we’ll need to read in the results of the tests on each patient. However, since we’re reading in the results directly, we can’t assign a distribution to them. Rather, we’ll learn to create a likelihood that is directly observed from the data and to ensure that our new likelihood affects the model.
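A common way to build that joint model (and, as I understand it, the approach in the original paper) is to assume the two tests are conditionally independent given true disease status, so each of the four result combinations gets a probability as sketched below; the parameter values are arbitrary:

```python
def cell_probs(prev, se1, sp1, se2, sp2):
    """Joint probability of each (test1, test2) result pair, assuming the
    tests are conditionally independent given true disease status."""
    probs = {}
    for t1 in (1, 0):
        for t2 in (1, 0):
            p_if_diseased = (se1 if t1 else 1 - se1) * (se2 if t2 else 1 - se2)
            p_if_healthy = ((1 - sp1) if t1 else sp1) * ((1 - sp2) if t2 else sp2)
            probs[(t1, t2)] = prev * p_if_diseased + (1 - prev) * p_if_healthy
    return probs

# Arbitrary parameter values for illustration.
p = cell_probs(prev=0.55, se1=0.30, sp1=0.95, se2=0.90, sp2=0.70)
print(p[(1, 1)], sum(p.values()))  # the four cell probabilities sum to 1
```

The multinomial likelihood over these four cells is what lets the observed data update all five parameters at once, even though no single cell identifies any one of them.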

To learn more and see the full details, go check out the tutorial on my GitHub page and feel free to ask me any questions that come to mind!

Exit interview: CHOICE alumnus Solomon Lubinga

Editor’s note: This is the second in an ongoing series of interviews we’ve planned for the students graduating from the CHOICE Institute where we’ll get their thoughts on their grad school and dissertation experiences.

Solomon Lubinga is a pharmacist and an applied health economist. After graduating with his PhD from the Comparative Health Outcomes, Policy, and Economics (CHOICE) Institute in 2017, he became a senior fellow at the CHOICE Institute at the University of Washington, working with Dr. Josh Carlson in collaboration with the Institute for Clinical and Economic Review (ICER). He is interested in decision modelling, value of information/implementation research, as well as the econometric and health policy applications of discrete choice models.

For more details on Solomon’s work, check out his personal webpage at http://www.jonslo.com

  • What was your dissertation about?

In my dissertation, I drew on well-known decision theories from economics and social psychology to study the incentives that drive the uptake of medical male circumcision (MMC) for HIV prevention in Uganda. My hypothesis was that a model combining factors from both kinds of decision theories would not only predict MMC decisions more accurately, but also be a very powerful tool for predicting the potential impacts of different MMC demand creation strategies.

  • How did you arrive at that topic? When did you know that this is what you wanted to study?

I became interested in the intersection of economics and psychology early on in the PhD program. I suppose this was because of my own proclivity to act irrationally even though I considered myself a well-informed person. This led me to ask why individuals in lower-income countries generally do not value preventive health interventions. This specific topic built on a prior contingent valuation study estimating willingness to pay (WTP) and willingness to accept payment (WTAP) for safe MMC among men in high-HIV-risk fishing communities in Uganda. The results of that analysis indicated low demand (WTP) and high potential incentive value (WTAP) for MMC, suggesting that a high WTAP (a de facto increase in MMC price) may result in an unfavorable incremental cost-effectiveness or benefit-to-cost ratio for MMC. I was therefore interested in studying the relative roles of economic and psychological incentives on demand for MMC.

  • What was your daily schedule like when you were working on your dissertation?

I never had a set schedule while I worked on my dissertation. I was also a teaching assistant (TA) for the online health economics course offered by CHOICE. I spent a lot of time in Uganda collecting my data. I would spend the day in the field (8:00am – 5:00pm) and the evenings (7:00pm – 11:00pm) performing my TA duties. It turned out that this was convenient (but challenging) because of the time difference between Uganda and the west coast. I also travelled to the UK twice for a choice modelling course, which was a great help with my dissertation. When I was in Seattle, I generally combined work on my dissertation with my teaching assistant responsibilities at the UW, with no set schedule. I simply gave what was more urgent the priority.

  • If you are willing to share, what was the timeline for your dissertation? And what determined that timeline?

I submitted my short proposal in April 2015 and defended my dissertation in August 2017. Two major factors determined my timeline. First, the death of a close family member motivated me to take some personal time. Second, although I was fortunate to receive funding for my data collection activities, it took almost 8 months (between December 2015 and October 2016) to receive international clearance for the data collection activities.

  • How did you fund your dissertation?

As I mentioned, I was fortunate to receive funding for my data collection activities through a grant awarded to my dissertation advisor.

  • What will come next for you (or has come next for you)? What have you learned about finding work after school?

I am interested in academic positions in universities in the US, or other quasi-academic institutions (e.g., research institutes or global organizations that conduct academic-style research). As an international student, the main lesson I have learned is “to synchronize the completion of your studies with the job market cycle, especially if you are interested in academic positions in the US”.

Is there still value in the p-value?

Doing science is expensive, so a study that reveals significant results yet cannot be replicated by other investigators represents a lost opportunity to invest those resources elsewhere. At the same time, the pressure on researchers to publish is immense.

These are the tensions that underlie the current debate about how to resolve issues surrounding the use of the p-value and the infamous significance threshold of 0.05. The p-value was adopted in the early 20th century to indicate the probability of obtaining results at least as extreme as those observed if chance variation alone were at work, and the 0.05 threshold has been with it since the beginning, allowing researchers to declare significant any effect that crosses it.

This threshold was selected for convenience at a time when p-values were difficult to calculate. Our modern scientific tools have made calculation so easy, however, that it is hard to defend a 0.05 threshold as anything but arbitrary. A group of statisticians and researchers is trying to rehabilitate the p-value, at least for the time being, so that we can improve the reliability of results with minimal disruption to the scientific production system. They hope to do this by changing the threshold for statistical significance to 0.005.

In a new editorial in JAMA, Stanford researcher John Ioannidis, a famous critic of bias and irreproducibility in research, has come out in favor of this approach. His argument is pragmatic. In it, he acknowledges that misunderstandings of the p-value are common: many people believe that a result is worth acting on if it is supported by a significant p-value, without regard for the size of the effect or the uncertainty surrounding it.

Rather than reeducating everyone who ever needs to interpret scientific research, then, it is preferable to change the threshold signaling statistical significance. Ioannidis also points to the success of genome-wide association studies, which improved in reproducibility after moving to a statistical significance threshold of p < 5 × 10⁻⁸.

As Ioannidis admits, this is an imperfect solution. The proposal has set off substantial debate within the American Statistical Association. Bayesians, for example, see it as perpetuating the same flawed practices that got us into the reproducibility crisis in the first place. In an unpublished but widely circulated article from 2017 entitled Abandon Statistical Significance [pdf warning], Blakely McShane, Andrew Gelman, and others point to several problems with lowering the significance threshold that make it unsuitable for medical research.

First, they point out that the whole idea of the null hypothesis is poorly suited to medical research. Virtually anything ingested by or done to the body has downstream effects on other processes, almost certainly including the ones that any given trial hopes to measure. Therefore, using the null hypothesis as a straw man takes away the focus on what a meaningful effect size might be and how certain we are about the effect size we calculate for a given treatment.

They also argue that the reporting of a single p-value hides important decisions made in the analytic process itself, including all the different ways that the data could have been analyzed. They propose reporting all analyses attempted, in an attempt to capture the “researcher degrees of freedom” – the choices made by the analyst that affect how the results are calculated and interpreted.

Beyond these methodological issues, lowering the significance threshold could increase the costs of clinical trials. If our allowance for Type I error is reduced by an order of magnitude, the required sample size increases by roughly 70%, holding power and all other parameters equal. In a regulatory environment where it costs over a billion dollars to bring a drug to market, this need for increased recruitment could drive up costs (which would need to be passed on to the consumer) and delay the health benefits of market release for good drugs. It is unclear whether these potential cost increases would be offset by the savings from producing more reliable, reproducible studies earlier in the development process.
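Under the usual normal-approximation sample size formula, the inflation factor from tightening a two-sided α from 0.05 to 0.005 can be computed directly. This is a back-of-the-envelope sketch, not a full trial design calculation:

```python
from statistics import NormalDist

def n_inflation(alpha_old=0.05, alpha_new=0.005, power=0.80):
    """Ratio of required sample sizes for a two-sided test at fixed power,
    using n proportional to (z_{1-alpha/2} + z_{power})^2."""
    z = NormalDist().inv_cdf
    old = (z(1 - alpha_old / 2) + z(power)) ** 2
    new = (z(1 - alpha_new / 2) + z(power)) ** 2
    return new / old

print(round(n_inflation(), 2))  # about 1.7 at 80% power
```

So at 80% power the recruitment burden rises by roughly 70%; the exact factor shifts with the power assumed.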

It also remains to be seen whether the lower p-value’s increased sample size requirement might dissuade pharmaceutical companies from bringing products to market that have a low marginal benefit. After all, you need a larger sample size to detect smaller effects, and that would only be amplified under the new significance thresholds. Overall, the newly proposed significance threshold interacts with value considerations in ways that are hard to predict but potentially worth watching.