The Food and Drug Administration (FDA) recently approved lumacaftor 200 mg-ivacaftor 125 mg (Orkambi) to treat patients at least 12 years old who have cystic fibrosis (CF) due to two copies of the F508del (Phe508del) mutation in the cystic fibrosis transmembrane conductance regulator (CFTR) gene.1 The FDA press release noted: “Orkambi was studied in two double-blind, placebo-controlled clinical trials [with a combined total] of 1108 participants. In both studies, participants with CF who took Orkambi, two pills taken every 12 hours, demonstrated improved lung function compared to those who took placebo”. These trials, known to the FDA as 809-103 and 809-104, were also published in the New England Journal of Medicine (NEJM) shortly after being considered at the 12 May 2015 Pulmonary-Allergy Drugs Advisory Committee Meeting of the FDA.2–6
The Phe508del mutation is the most common genetic mutation resulting in CF, accounting for roughly half of all cases of CF.1–3 Since CF is rare, this drug was granted ‘orphan drug’ status, thus conferring certain benefits for the manufacturers, Vertex Pharmaceuticals (Vertex).1 It was also given a ‘breakthrough therapy’ designation.1
Looking at the FDA press release, one sees suggestions of a ‘breakthrough therapy’ with “two double-blind, placebo-controlled clinical trials [demonstrating] improved lung function compared to those who took placebo”.1 What exactly did these trials find, though?
Effect on lung function
Studies 809-103 (TRAFFIC) and 809-104 (TRANSPORT) both ran for 24 weeks, and the primary efficacy outcome for both was absolute change from baseline to week 24 in the percentage of predicted forced expiratory volume in 1 s (ppFEV1) as assessed by a mixed-effects model for repeated measures analysis (with treatment, visit and treatment-by-visit interaction as the fixed effects, and adjustments for sex, age group at baseline (<18 vs ≥18 years old) and ppFEV1 severity group at screening (<70% vs ≥70%)).2–4 Each study compared two doses of lumacaftor-ivacaftor against placebo to assess change beyond placebo, and absolute change in ppFEV1 was used to infer improvement in lung function. Using the effect estimates to derive a range of potential improvement, lumacaftor-ivacaftor improved ppFEV1 anywhere from 2.5–2.6% (lower bound) to 4.0–4.1% (upper bound) depending on the dose used, the study being considered, and if correcting for inconsistencies (note this uses the effect estimates only and not their associated confidence intervals (CIs)).2–4 Pooling the studies yielded an improvement of 2.8–3.3%.2–4 Vertex recommends 400–250 mg q12h, which would mean the estimate of absolute improvement in ppFEV1 would be between 2.6% and 3.0% (TRAFFIC, 2.6%; TRANSPORT, 3.0%; pooled, 2.8%).2–4
How much is that worth to a patient with CF? The undeniably small improvement in ppFEV1 seems unconvincing with respect to providing clinically-meaningful benefit. This is corroborated by the marginal and ultimately non-significant changes in the patient-reported Cystic Fibrosis Questionnaire-Revised (CFQ-R) respiratory domain scores for the recommended dose.2–4 For some, however, perhaps any degree of improvement is ‘worth it.’ So, again, how much is it worth? The answer would have to be $259 000 a year, because Vertex plans to charge this much.7
Although the improvement in ppFEV1 might be of questionable benefit, there were a number of secondary outcomes, and perhaps considering these might change the overall appraisal. A key consideration with secondary outcomes is preventing type I error, and in an effort to do so, the authors prespecified a simple Bonferroni-adjusted p value threshold, yielding α=0.025 (0.05/2, accounting for testing two doses). The authors also prespecified a hierarchical testing paradigm where key secondary outcomes were prioritised and a given secondary outcome was considered significant if its p value was less than 0.025 and all preceding secondary outcomes also had a p value less than 0.025. If a given secondary outcome had a p value greater than 0.025, the testing hierarchy broke at that level, and that outcome and all subsequent outcomes were considered non-significant.2–4
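The gatekeeping rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the text, not the trial statisticians' code; the function name `hierarchical_test` and the example p values are hypothetical.

```python
def hierarchical_test(p_values, alpha=0.05, n_doses=2):
    """Fixed-sequence (hierarchical) testing with a Bonferroni-adjusted threshold.

    p_values must be listed in the prespecified priority order of the outcomes.
    Returns one boolean per outcome: True if it is considered significant.
    """
    threshold = alpha / n_doses  # 0.05 / 2 = 0.025
    results = []
    hierarchy_intact = True
    for p in p_values:
        # An outcome counts only if it and every preceding outcome passed.
        # A p value exactly equal to the threshold is treated as a failure
        # here (an assumption; the text only covers "less" and "greater").
        hierarchy_intact = hierarchy_intact and p < threshold
        results.append(hierarchy_intact)
    return results


# Hypothetical p values for three ordered secondary outcomes.
print(hierarchical_test([0.001, 0.36, 0.01]))
```

In this hypothetical run the third outcome has a small p value but is still counted as non-significant, because the hierarchy broke at the second outcome.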
Cystic Fibrosis Questionnaire-Revised
Change in CFQ-R respiratory domain score was one of the secondary outcomes in TRAFFIC and TRANSPORT. The CFQ-R respiratory domain is a scale from 0 to 100 with higher scores indicating better patient-reported quality of life with respect to respiratory status. The increases in CFQ-R respiratory domain scores for the recommended dose are as follows: TRAFFIC, 1.5 points (95% CI −1.7 to 4.7; p=0.36); TRANSPORT, 2.9 points (95% CI −0.3 to 6.0; p=0.07); pooled data, 2.2 points (95% CI 0.0 to 4.5; p=0.0512).2,4,6 Not only do these effect estimates fail to reach statistical significance, but they also fall below the minimal clinically-important difference (MCID) for the CFQ-R respiratory domain.8 The authors of the NEJM publication point out the MCID study8 had “patients who had markers of advanced disease, which complicates its application to other populations” (ref. 2, p. 10). Indeed, in the portion of the MCID study that considered stable patients (thus coinciding with the patients in TRAFFIC and TRANSPORT), the children (N=14, accounting for 10% of the stable participants) had lower baseline mean ppFEV1 values than are typically seen in that age group. Likewise, considering all the stable participants in the MCID study (including children, adolescents and adults), the baseline mean ppFEV1 values were also somewhat lower than those seen in the participants of TRAFFIC and TRANSPORT (ppFEV1 of 53.1% vs 60.5% and 60.6%, respectively).2,8 Comparison of CFQ-R respiratory domain scores is precluded by lack of baseline CFQ-R respiratory domain scores in TRAFFIC and TRANSPORT.2–4 Disease severity in the population used to establish an MCID matters, because it can skew the MCID threshold and thereby limit its generalisability.
To account for extremes of disease severity, the MCID study also investigated potential ceiling or floor effects by excluding patients with CFQ-R respiratory domain scores less than 10 or greater than 90; in all variants of ceiling and floor analysis of the stable patients, the MCID increased anywhere from 0.6 to 2.3 points.8 Even considering all this, and although none of the CFQ-R respiratory domain measures in TRAFFIC, TRANSPORT or the pooled data reached statistical significance, some might point to a trend in TRANSPORT and the pooled data. Considering ‘trends’ is complex and potentially problematic, but even if disregarding methodological concerns and allowing such consideration for the sake of argument, does a potential increase of 2.2–2.9 points on the 100-point CFQ-R respiratory domain scale (remembering the 95% CIs include no improvement or detriment) warrant a ‘breakthrough’ designation with a price tag of $259 000 per year?
Body mass index
Body mass index (BMI) was also one of the secondary outcomes for which benefit was reported in the NEJM publication, but only for TRANSPORT and the pooled analysis.2–5 TRANSPORT found significant benefit for BMI, with the 400–250 mg q12h dose recommended by Vertex causing an increase in BMI of 0.36 kg/m² (the other dose caused a 0.41 kg/m² increase in BMI). Using median baseline data from table 9 of the clinical briefing document in the FDA documentation,3 one can determine the BMI increase of 0.36 kg/m² would translate to a median weight increase of approximately 0.98–1.02 kg (the 0.41 kg/m² BMI increase would translate to approximately 1.13–1.16 kg).2–4 The pooled data for the 400–250 mg q12h dose yielded a BMI increase of 0.24 kg/m², which translates to approximately 0.65–0.68 kg (the other dose yielded a 0.28 kg/m² BMI increase, translating to approximately 0.77–0.79 kg).2–4 Do these amounts of potential weight gain warrant a ‘breakthrough’ designation with a price tag of $259 000 per year?
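The conversion from a BMI change to a weight change follows directly from the definition of BMI (weight divided by height squared). The short Python sketch below shows the arithmetic only; the height used is a hypothetical value chosen for illustration, not the median baseline figure from the FDA briefing document, which is not reproduced in this article.

```python
def weight_change_kg(delta_bmi, height_m):
    """Weight change implied by a BMI change at a fixed height.

    BMI = weight / height**2, so delta_weight = delta_bmi * height**2.
    """
    return delta_bmi * height_m ** 2


# Hypothetical height for illustration only (an assumption, not trial data).
ASSUMED_HEIGHT_M = 1.66

# BMI changes quoted in the text (kg/m^2).
for delta_bmi in (0.36, 0.41, 0.24, 0.28):
    change = weight_change_kg(delta_bmi, ASSUMED_HEIGHT_M)
    print(f"BMI change {delta_bmi} kg/m^2 -> about {change:.2f} kg")
```

Because the result scales with the square of height, the same BMI change corresponds to less weight in a shorter (for example, younger) patient.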
Pulmonary exacerbations
Also reported was a significant reduction in pulmonary exacerbations, which was another secondary outcome.2,4,6 Under the prespecified hierarchical testing strategy described above, pulmonary exacerbations were not considered statistically significant in either individual study (despite an admitted nominal improvement).2,3 However, the supplemental content accompanying the NEJM publication prespecifies the pooled data as the primary method of analysis for pulmonary exacerbations; this is the only outcome for which pooled analysis was prespecified as primary.2 Zeng's statistical review presentation for the FDA notes the same, quoting part of that supplemental content.5 Intriguingly, however, such intent is not made clear in the actual FDA briefing document or the briefing document supplied by Vertex during the FDA review; pooling is discussed, but not the fact that it was prespecified as the primary analysis for pulmonary exacerbations.3,4 The ClinicalTrials.gov registries (NCT01807923 and NCT01807949) do not address this either, and it is clear the FDA review focused on the individual trials, not the pooled analyses.2,3,5 It remains unclear why the pooled data for pulmonary exacerbations were not considered by the FDA and why Vertex did not push the FDA to do so, since Vertex could have potentially garnered a broader official indication for lumacaftor-ivacaftor aside from modestly improving ppFEV1.3–6
Pooling is certainly reasonable, even for the analyses where pooling was not specified a priori as the primary method for analysis. The trials were quite comparable, and although some results were discrepant between the trials, the pooling simply serves as a summarisation of the individual trials, which can, and should, be considered in the context of the individual trials and the critical appraisal thereof. Based on the prespecified methods, the pooled data show a significant reduction in exacerbations.2,4,6 The researchers doubled the 24-week data and then adjusted via negative binomial regression analysis (with sex, age and baseline ppFEV1 as dichotomous covariates with log of time spent in the study as the offset) to arrive at a final estimate for rate of pulmonary exacerbations per 48 weeks. Using this methodology, the pooled placebo group had a 48-week pulmonary exacerbation rate of 1.14, and the group receiving 400–250 mg q12h had a 48-week rate of 0.70. Readers are not provided with the stratified Wilcoxon rank-sum test the protocol said would be performed as a sensitivity analysis for these results.
If using these results, one must be judicious in applying them to an individual patient. Typical calculations of absolute risk reduction (ARR) and number needed to treat to benefit (NNT, NNTB) are proscribed given the methodology. For instance, it would be erroneous to calculate the NNTB over 48 weeks as 1/(1.14−0.70)=1/0.44=2.27, because the data provided concern cumulative number of exacerbations, not number of people with an exacerbation. In order to provide a given patient insight about potential individual benefit with respect to pulmonary exacerbations, one must know or estimate the patient's baseline rate of pulmonary exacerbations over a 48-week period, which could then be used with the rate ratio to determine potential individual benefit over 48 weeks. For instance, the rate ratio for the above data is 0.61 (≈0.70/1.14); for a patient with a rate of five exacerbations over a 48-week period, his/her rate might be reduced to about three exacerbations over a 48-week period (5×0.61=3.05) by taking lumacaftor-ivacaftor 400–250 mg q12h.
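The rate ratio calculation above is simple enough to express as a small Python sketch. The two 48-week rates are the adjusted pooled figures quoted in the text; the function names and the patient's baseline rate of five exacerbations are illustrative only.

```python
# Adjusted pooled 48-week exacerbation rates quoted in the text.
PLACEBO_RATE_48WK = 1.14
TREATED_RATE_48WK = 0.70  # lumacaftor-ivacaftor 400-250 mg q12h


def rate_ratio(treated_rate, control_rate):
    """Ratio of event rates (treated relative to control)."""
    return treated_rate / control_rate


def expected_rate(baseline_rate, ratio):
    """Apply a rate ratio to an individual patient's baseline rate."""
    return baseline_rate * ratio


# Rounded to two decimals, as in the text.
rr = round(rate_ratio(TREATED_RATE_48WK, PLACEBO_RATE_48WK), 2)
print(f"rate ratio: {rr}")

# Illustrative patient with five exacerbations per 48 weeks at baseline.
print(f"expected exacerbations per 48 weeks: {expected_rate(5, rr):.2f}")
```

Note that this yields an expected rate for a given baseline, not a number needed to treat; the text explains why an NNTB cannot be derived from these rate data.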
Based on the crude 24-week pooled data, there were 251 exacerbations among the 371 placebo recipients and 152 exacerbations among the 369 recipients of lumacaftor-ivacaftor 400–250 mg q12h, giving a difference of 99 exacerbations over 24 weeks (the small difference in the number of participants in each group is immaterial for this consideration).2 Crude data from table 15 of the clinical briefing document in the FDA documentation provide the actual number of patients who had a pulmonary exacerbation over the 24-week period of the study.3 If pooling these raw 24-week data, 161 of 371 patients (43.40%) in the placebo group had an exacerbation, and 109 of 369 patients (29.54%) in the lumacaftor-ivacaftor 400–250 mg q12h group had an exacerbation, yielding a crude ARR of 13.86% and a crude NNTB of 7.22 (conventionally rounded up to 8) over 24 weeks. The Kaplan-Meier curve in figure 2A of the NEJM article corroborates these findings.2
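The crude ARR and NNTB quoted above can be reproduced from the patient counts given in the text. The Python sketch below is only a worked version of that arithmetic; the function name is illustrative.

```python
import math


def arr_and_nntb(events_control, n_control, events_treated, n_treated):
    """Crude absolute risk reduction and number needed to treat to benefit."""
    risk_control = events_control / n_control
    risk_treated = events_treated / n_treated
    arr = risk_control - risk_treated
    return risk_control, risk_treated, arr, 1 / arr


# Patients with at least one exacerbation over 24 weeks (pooled crude counts
# quoted in the text): 161 of 371 on placebo, 109 of 369 on 400-250 mg q12h.
risk_placebo, risk_treated, arr, nntb = arr_and_nntb(161, 371, 109, 369)

print(f"placebo risk: {risk_placebo:.2%}")
print(f"treated risk: {risk_treated:.2%}")
print(f"crude ARR:    {arr:.2%}")
print(f"crude NNTB:   {nntb:.2f} (rounded up to {math.ceil(nntb)})")
```

Unlike the rate data, these counts are numbers of people with an event, which is why an ARR and NNTB are meaningful here.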
The testing hierarchy specified for the individual trials was not applied to the pooled data. There is no evident reason for this approach, and no explanation for the decision is offered.2,4 If maintaining the testing hierarchy paradigm, pulmonary exacerbations would still be considered non-significant for lumacaftor-ivacaftor 400–250 mg q12h due to failure higher in the testing hierarchy at assessment of the pooled CFQ-R data.
It would be ideal for pulmonary exacerbations to be further studied, preferably as a primary outcome, and also including patients with more severe CF. However, if we are not fortunate enough to see further trials, it does still appear lumacaftor-ivacaftor has therapeutic potential in reducing pulmonary exacerbations, especially when one considers that other methods to correct for multiple testing of the pooled data (including the rather conservative Bonferroni method, the Šidák-Bonferroni method, the Hochberg method and the Benjamini-Hochberg method) would all still find the pooled pulmonary exacerbation data for the 400–250 mg q12h dose of lumacaftor-ivacaftor significant. Additionally, in the absence of data on more severe CF, the aforementioned rate ratio application can be reasonably used in patients with more severe CF, as relative metrics are generally considered to be ‘transferable’ (whereas absolute metrics are not). Still, this does not materially alter the question of whether lumacaftor-ivacaftor is a bona fide breakthrough that justifies a price of $259 000 per year.
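For readers unfamiliar with the four correction methods named above, the following Python sketch shows how each decides which of a family of p values remain significant. It is a minimal textbook-style implementation, not the analysis performed for this article, and the p values in the example are hypothetical because the full set of pooled p values is not listed here.

```python
def bonferroni(pvals, alpha=0.05):
    """Reject when p <= alpha / m."""
    m = len(pvals)
    return [p <= alpha / m for p in pvals]


def sidak(pvals, alpha=0.05):
    """Reject when p <= 1 - (1 - alpha)**(1/m)."""
    m = len(pvals)
    threshold = 1 - (1 - alpha) ** (1 / m)
    return [p <= threshold for p in pvals]


def _step_up(pvals, thresholds):
    """Reject the k smallest p values, where k is the largest rank whose
    sorted p value is at or below its rank-specific threshold."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvals[idx] <= thresholds[rank - 1]:
            k_max = rank
    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        reject[idx] = rank <= k_max
    return reject


def hochberg(pvals, alpha=0.05):
    """Hochberg step-up: threshold alpha / (m - rank + 1)."""
    m = len(pvals)
    return _step_up(pvals, [alpha / (m - r + 1) for r in range(1, m + 1)])


def benjamini_hochberg(pvals, alpha=0.05):
    """Benjamini-Hochberg step-up: threshold alpha * rank / m."""
    m = len(pvals)
    return _step_up(pvals, [alpha * r / m for r in range(1, m + 1)])


# Hypothetical family of four p values (illustration only).
example = [0.001, 0.0512, 0.02, 0.0004]
for method in (bonferroni, sidak, hochberg, benjamini_hochberg):
    print(method.__name__, method(example))
```

The first three methods control the family-wise error rate, whereas Benjamini-Hochberg controls the false discovery rate, so it is generally the least conservative of the four.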
Safety
The safety data for lumacaftor-ivacaftor are based on the pooled data. Compared to placebo, more patients receiving the 400–250 mg q12h dose of lumacaftor-ivacaftor discontinued the study medication due to a side effect (absolute risk increase (ARI), 3.0%; number needed to treat to harm one person (NNH, NNTH), 33.33, conventionally rounded down to 33), experienced dyspnoea (ARI, 5.2%; NNTH, 19), developed an upper respiratory tract infection (ARI, 4.6%; NNTH, 21), experienced nausea (ARI, 4.9%; NNTH, 20) and had serious adverse events related to abnormal liver function (ARI, 1.9%; NNTH, 52; liver function test levels returned to normal for all but one patient, who seroconverted for hepatitis E during the study period).2 Chest tightness was also somewhat elevated (ARI, 2.8%), but this adverse effect does not reach conventional statistical significance via χ² analysis or Fisher's exact test.2
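The NNTH values above follow from the reciprocal of each ARI, rounded down by convention. The Python sketch below reproduces that arithmetic from the rounded ARI percentages quoted in the text; the function name and dictionary labels are illustrative.

```python
import math


def nnth(ari_percent):
    """Number needed to treat to harm: 1 / ARI, conventionally rounded down."""
    return math.floor(100 / ari_percent)


# Absolute risk increases (%) quoted in the text for 400-250 mg q12h vs placebo.
ari_by_event = {
    "discontinuation due to a side effect": 3.0,
    "dyspnoea": 5.2,
    "upper respiratory tract infection": 4.6,
    "nausea": 4.9,
    "serious liver-related adverse event": 1.9,
}

for event, ari in ari_by_event.items():
    print(f"{event}: ARI {ari}% -> NNTH {nnth(ari)}")
```

Rounding down for harm (and up for benefit) is the conservative direction in each case, since it avoids understating harm or overstating benefit.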
Other CF therapies and cost of CF therapy
To provide context, other therapies for CF have also provided improvements in mean baseline ppFEV1 that are either comparable, marginally to somewhat better, or notably better than that offered by lumacaftor-ivacaftor (inhaled hypertonic saline, 3.2% improvement; dornase alfa, 5.6–5.8% improvement; azithromycin, 6.2% improvement; inhaled tobramycin, 11.9% improvement).9–12 Likewise, therapies have also shown an ability to reduce pulmonary exacerbations (inhaled hypertonic saline, relative risk of 0.44; dornase alfa, relative risk of 0.72 after a post hoc adjustment for age; azithromycin, hazard ratio of 0.65) or hospitalisations and the need for intravenous antibiotics (inhaled tobramycin, relative risks of 0.74 and 0.64 for hospitalisation and need for intravenous antibiotics, respectively).9–12 This is noteworthy not simply to make note of other available therapies, but also to encourage providers to make note of the utilisation rates for these therapies in TRAFFIC and TRANSPORT.2
A recent review of studies that evaluated various costs associated with CF provides even more context.13 Even though some of the therapies just described are considerably expensive (eg, dornase alfa and tobramycin each cost approximately $10 000 to $11 000 per year13), these costs still pale in comparison to the cost of lumacaftor-ivacaftor. Likewise, to the extent that lumacaftor-ivacaftor reduces pulmonary exacerbations (which can lead to hospitalisations), it helps to know the average cost of an inpatient stay is approximately $979 per day,13 but again, reducing exacerbations and/or hospitalisations is not exclusive to lumacaftor-ivacaftor.9–12 Finally, considering the total annual cost of care for patients with CF, the review derived annual estimates of approximately $16 350 (mean) and $6545 (median), but the review found annual cost varied considerably with disease severity: in the mildest disease category in this review, annual costs were approximately $10 659 (mean) and $4548 (median), whereas in the most severe category, annual costs were approximately $40 262 (mean) and $24 061 (median).13 To consider two additional studies not included in this review, overall annual costs ranged from a mean of $24 668 to $53 264 and a median of $17 408 to $33 785.14,15 For the sake of argument, even the highest reported mean or median total annual cost of care is still only a fraction of the annual cost of lumacaftor-ivacaftor alone. (All costs have been converted to the most recent US$ values available (2012) using the methodology described in van Gool and colleagues’ analysis.13)
Put plainly, the current pricing of lumacaftor-ivacaftor is problematic and off-putting, especially in light of the data behind it and the availability of other therapies, even if other therapies have a less precise mechanism of action. Indeed, others have also noted that the cost seems unreasonably and unbearably high, particularly in light of what the data show.16,17
Desire and excitement for novel therapies with more targeted mechanisms of action must not influence interpretation of research, and other treatments for CF exist that are far less costly. Cautious and objective appraisal and translation of research is always needed; however, since patients and their families are likely to be rather eager about lumacaftor-ivacaftor (especially with media coverage sometimes giving exaggerated accounts7,18), this need is even greater, particularly with a price of $259 000 per year. Vertex officials suggest this price is necessary due to Vertex's reported research and development expenditures and the small number of patients for whom Vertex provides medications; at the same time, however, Vertex's market value is approximately $30 billion, and lumacaftor-ivacaftor is slated to help 12 Vertex senior executives secure over $53 million in one-time bonuses if Vertex is profitable over the next four quarters.7
CF is undoubtedly a sad and trying condition, as patients, patients’ friends and families, and providers can readily attest. Thus, therapeutic advancements are always of interest, especially ones that more precisely target the underlying pathophysiology of CF. However, developing a therapy with a more precise mechanism of action is only the first step; clinical trials provide crucial information about the actual clinical effects of any therapy (regardless of any reasonably hypothesised therapeutic potential). Based on the clinical trial data, whether lumacaftor-ivacaftor is truly a ‘breakthrough’ worth $259 000 a year remains rather debatable, and eligible patients deserve to have a transparent and understandable discussion about lumacaftor-ivacaftor before considering this new therapeutic option.
Correction notice This article has been corrected since it was published Online First. In sentence “(eg, dornase alpha and tobramycin each cost approximately $10 000 to $11 000 per year13)”, “alpha” should read “alfa”.
Competing interests None declared.
Provenance and peer review Not commissioned; externally peer reviewed.