## Standfirst

Healthcare practitioners and patients are often required to make choices between two treatment options which may have never been compared directly. When performing standard ‘pairwise’ meta-analysis, it is common to find trials comparing one of two experimental treatments against a common control (standard of care or placebo) but no head-to-head trials. Statistical methods are available to perform indirect comparisons between two treatments when direct evidence is lacking. Here, we provide a simple, easy-to-use, Excel spreadsheet capable of performing these calculations.

## Introduction

The statistical foundation for indirect comparisons in the setting of meta-analysis was first described by Bucher *et al*.1 They discussed the common situation where two treatments have been compared with the same control treatment but have not been compared directly with each other. Figure 1 displays this simple type of ‘network’, where a common control (treatment A) is compared with treatments B and C. The key to Bucher’s technique is that the variance (the square root of which is used to calculate the confidence interval) of the indirect comparison B versus C is equal to the variance of the direct comparison A versus B plus the variance of the direct comparison A versus C. For ratio effect estimates, such as the OR, risk ratio (relative risk) and HR, calculations are performed on a logarithmic scale.

The direct comparisons can be results from a single clinical trial or, ideally, the results of a systematic review and meta-analysis if multiple trials have compared the same two interventions in a similar clinical context.
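The Bucher calculation itself is short enough to sketch in code. The sketch below (function names and any numbers passed to it are our own, purely illustrative) works on the log scale: it recovers standard errors from the 95% CIs of the two direct ratio estimates, subtracts the log effects, and sums the variances.

```python
import math

Z_95 = 1.959964  # normal quantile for a 95% CI


def se_from_ci(lower, upper, z=Z_95):
    """Standard error of a log ratio estimate, recovered from its 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def bucher_indirect(ratio_ba, ci_ba, ratio_ca, ci_ca, z=Z_95):
    """Bucher indirect comparison of B versus C from B-vs-A and C-vs-A ratios.

    Returns (estimate, lower, upper, p) on the original ratio scale.
    """
    log_bc = math.log(ratio_ba) - math.log(ratio_ca)  # log effects subtract
    # Variances of the two direct comparisons add
    se_bc = math.sqrt(se_from_ci(*ci_ba) ** 2 + se_from_ci(*ci_ca) ** 2)
    # Two-sided p value from the normal approximation
    p = math.erfc(abs(log_bc) / se_bc / math.sqrt(2))
    return (math.exp(log_bc),
            math.exp(log_bc - z * se_bc),
            math.exp(log_bc + z * se_bc),
            p)
```

With illustrative inputs, `bucher_indirect(0.5, (0.3, 0.8), 1.0, (0.6, 1.6))` returns an indirect ratio of 0.5 with a 95% CI of roughly 0.25 to 1.00, visibly wider than either input CI.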

## Key assumptions

When performing indirect comparisons, the assumption of transitivity must be met.2 For the indirect comparison to be valid, the A versus B and A versus C trials should not differ with respect to potential effect modifiers, for example, the age of the participants. Their eligibility criteria should be comparable, with the treatment regimens in the shared arm (A), and arms B and C, being comparable across all studies.

The statistical counterpart of the transitivity assumption (consistency) can be tested when conducting larger network meta-analyses (NMAs); however, this is not possible in a small network containing three treatments and only two direct comparisons.3 Instead, we can assess for violations of the transitivity assumption by checking that the study characteristics of the trials are broadly similar.2 If the direct comparisons come from pairwise meta-analyses, heterogeneity in those meta-analyses might be caused by violation of the transitivity assumption. Conversely, if there is no statistical heterogeneity in a meta-analysis with a non-negligible sample size from a properly conducted systematic review, it is reasonable to conclude that all the trials were attempting to measure the same effect. One should also check that the mean outcomes for a specific intervention are similar across all trials reporting on that intervention, and that adherence to assigned treatments is similar across studies (differences in either suggest that the assumption of transitivity is violated). However, differences in baseline risk do not necessarily mean that there will be differences in the effect size.

When performing any meta-analysis or indirect comparison, it is critical to assess the certainty of the evidence.4 There are some specific points to highlight for indirect comparisons. First, the indirect comparison is subject to all of the internal biases of the individual trials, which should be formally assessed with risk of bias tools, and these must be considered when assessing the certainty of the indirect effect estimate.5 If the outputs of pairwise meta-analyses are used as the direct effect estimates, it is important to assess heterogeneity between the individual studies contributing to these meta-analyses. Heterogeneity reduces the certainty of evidence from pairwise meta-analyses and will therefore reduce the certainty of evidence for an indirect estimate which uses these as ‘inputs’.6 Where there is heterogeneity, a random effects model should be used to pool treatment effects to avoid underestimating the uncertainty in the direct estimates.7

## Case study

This example is taken from a Cochrane review on techniques to preserve donated livers for transplantation.8 Both cold and warm machine perfusion have been compared with standard of care (‘ice-box storage’) in several randomised trials. However, there are no trials directly comparing cold and warm machine perfusion.

The authors of this review deemed there to be no important differences between trials with respect to potential effect modifiers (specifically study and participant characteristics were sufficiently similar) and there was no evidence of statistical heterogeneity in either of the pairwise meta-analyses. This suggests that all of the trials were asking the same or very similar questions with respect to the HRs they set out to estimate. As there was no evidence of violation of the transitivity assumption, the Excel tool (online supplemental materials) can be used to generate an indirect comparison.

Figure 2 displays how the outputs of the standard pairwise meta-analyses for graft survival are entered into the tool. The resulting indirect HR for cold versus warm machine perfusion is 0.38 (95% CI 0.11 to 1.25, p=0.11). As discussed above, it is important to assess the certainty of evidence for this result, which cannot be higher than the certainty of evidence for either of the two direct comparisons. The referenced review deemed this indirect comparison low-certainty evidence.

In this example, both direct comparisons are the output of a meta-analysis. However, if one of the direct comparisons had only a single trial, then the effect estimate from that single trial (rather than a meta-analysis) would be used. Many trials report a p value instead of a 95% CI, so we have included a calculator to derive CIs from p values (online supplemental material, statistics worksheet) along with other resources for calculating effect estimates and CIs from other data provided in trial reports.9 10
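The p-value-to-CI conversion can be sketched as follows (a minimal illustration with our own function name, assuming the reported p value comes from a Wald, ie, normal, test on the log scale; the spreadsheet's own worksheet implements the same idea):

```python
import math
from statistics import NormalDist


def ci_from_p(estimate, p, level=0.95):
    """Recover a CI for a ratio estimate from its two-sided p value.

    Assumes a Wald (normal) test on the log scale.
    """
    nd = NormalDist()
    z_p = nd.inv_cdf(1 - p / 2)         # z statistic implied by the p value
    se = abs(math.log(estimate)) / z_p  # standard error of log(estimate)
    z_ci = nd.inv_cdf(0.5 + level / 2)  # e.g. 1.96 for a 95% CI
    return (math.exp(math.log(estimate) - z_ci * se),
            math.exp(math.log(estimate) + z_ci * se))
```

As a sanity check, `ci_from_p(0.5, 0.05)` gives approximately (0.25, 1.00): a two-sided p value of exactly 0.05 corresponds to a 95% CI that just touches the null value of 1.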

If the data are taken from different sources, it is critical to ensure that the direction of the comparison is the same for both reported comparisons. Continuous effect measures are typically calculated as intervention minus control, and ratio statistics as intervention divided by control, but these may be calculated the other way around. For continuous statistics, the effect estimate for B versus A is the negative of the effect of A versus B. For ratio statistics, the effect estimate for B versus A is the inverse of the effect of A versus B. A worked example where the two treatment effects are calculated in different directions is provided within the Excel tool (online supplemental material; worked example 2 ‘inverted’).
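These direction-reversal rules can be expressed as one-line helpers (hypothetical names, for illustration): for a ratio statistic the estimate and CI limits are inverted and the limits swapped; for a continuous statistic they are negated and the limits swapped.

```python
def invert_ratio(estimate, lower, upper):
    """Reverse the direction of a ratio statistic (e.g. B vs A -> A vs B).

    Invert the estimate; invert and swap the CI limits.
    """
    return 1 / estimate, 1 / upper, 1 / lower


def invert_continuous(estimate, lower, upper):
    """Reverse the direction of a continuous effect (e.g. mean difference).

    Negate the estimate; negate and swap the CI limits.
    """
    return -estimate, -upper, -lower
```

For example, an HR of 0.5 (95% CI 0.25 to 1.0) for B versus A becomes 2.0 (95% CI 1.0 to 4.0) for A versus B.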

Note that the level of precision of the indirect comparison (represented by the width of the CI) is lower than that of the direct comparisons. Despite high-certainty evidence that cold machine perfusion is superior to standard ice-box storage, while warm machine perfusion appears no better than standard ice-box storage, we still cannot be certain that cold machine perfusion is superior to warm.

Statistically, the Bucher method is equivalent to a test for interaction, often used as a test for subgroup differences in meta-analysis. This can be seen from the identical p values reported in the spreadsheet for the indirect comparison and the test for interaction reported on the forest plot. The test for interaction has very low power compared with tests of main effects and so requires much larger sample sizes to draw reliable conclusions.2 11 Effect estimates from meta-analyses in systematic reviews will have larger sample sizes than individual trials and provide a less biased summary of the evidence than any single trial, and so should always be preferred if available.

## More complex networks and NMA

For the simple network displayed in figure 1, the techniques described above are effective and are recommended in the Cochrane Handbook (chapter 11).12 However, for more complex networks of treatments (those with more than two comparisons with direct evidence), more complicated statistical methods, collectively referred to as NMA, should be applied.

Such methods are seeing a surge in use, with implementations available in Stata, in a number of R packages (such as GeMTC, BUGSnet and netmeta), and in user-friendly open-source applications such as MetaInsight.13–17 These techniques are recommended for more complex networks.15 For the simple network in figure 1, frequentist NMA will generate identical results to the Bucher technique we describe (the techniques are equivalent for this network). This is demonstrated for our case study in online supplemental figure 1. We also performed Bayesian NMA (BUGSnet) on the case study example, with the default uninformative priors. As expected, the central estimate was identical to our analysis and the credible intervals were slightly wider than the frequentist CIs.

## Conclusions

In contrast to NMA methods, the techniques of Bucher *et al* are less frequently used. However, for the network shown in figure 1, Bucher indirect comparisons are recommended by Cochrane and generate identical results to frequentist NMA.12 Such networks occur frequently, where two new treatments or procedures are developed and assessed against placebo or standard of care, rather than each other, or where two different treatments are considered suitable options by clinicians and patients but the potential benefits and harms need to be weighed up carefully. Previous reports of Bucher’s methods have been written for an audience of statisticians; our aim here is to make the methods more accessible.

The Excel spreadsheet which we provide here is easy to use and should facilitate the practical application of these techniques. We envisage this tool being used by researchers performing meta-analyses where a network analogous to figure 1 is found. This should be performed under the guidance of someone with meta-analysis expertise, preferably a statistician.

## Ethics statements

### Patient consent for publication

### Ethics approval

Not applicable.

## Acknowledgments

Thank you to Kim Pearce (Newcastle University) for their statistical support which laid the initial foundations for this project. Without this, the published tool would likely never have been created.


## Footnotes

X @SamJTingle, @RacheRichardson

Contributors SJT conceived the paper. SJT and JS created the first version of the Excel tool, which was then adapted following testing and suggestions from GK, FL, CW and RR. SJT and JS drafted the first version of the manuscript, which was changed following critical review from all other authors. All authors reviewed and approved the final version. JS is the guarantor of this work.

Funding SJT worked on this project during an MRC Clinical Research Training Fellowship (MR/Y000676/1) at Newcastle University. This study was supported by the National Institute for Health and Care Research (NIHR) Blood and Transplant Research Unit in Organ Donation and Transplantation (NIHR203332), a partnership between NHS Blood and Transplant, University of Cambridge and Newcastle University. The views expressed are those of the author(s) and not necessarily those of the NIHR, NHS Blood and Transplant or the Department of Health and Social Care.

Competing interests All authors have completed the ICMJE uniform disclosure form at http://www.icmje.org/disclosure-of-interest/ and declare: no support from any organisation for the submitted work; no financial relationships with any organisations that might have an interest in the submitted work in the previous 3 years; no other relationships or activities that could appear to have influenced the submitted work.

Provenance and peer review Not commissioned; internally peer reviewed.

Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.