Elsevier

Journal of Clinical Epidemiology

Volume 141, January 2022, Pages 99-105
Original Article
Reliability of the revised Cochrane risk-of-bias tool for randomised trials (RoB2) improved with the use of implementation instruction

https://doi.org/10.1016/j.jclinepi.2021.09.021

Abstract

Objective

To assess the inter-rater reliability (IRR) of the revised Cochrane risk-of-bias tool for randomised trials (RoB2).

Methods

Four raters independently applied RoB2 to the critical and important outcomes of individually randomized parallel-group trials (RCTs) included in the Cochrane Review “Cannabis and cannabinoids for people with multiple sclerosis.” We calculated Fleiss’ Kappa for multiple raters and the time needed to complete the tool. We performed a calibration exercise on five studies and then developed an implementation document (ID), specific to the condition and the intervention addressed by the review, with instructions on how to answer the signalling questions of the RoB2 tool. We measured IRR before and after adoption of the ID.
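As an illustration of the statistic named above, the following is a minimal sketch of Fleiss' Kappa for a fixed number of raters. It is not the authors' analysis code: the function name `fleiss_kappa`, the input layout (one list of labels per assessed result) and the toy data are assumptions made for this example. The three category labels follow the RoB2 judgments "low", "some concerns" and "high".

```python
from collections import Counter

# RoB2 judgment categories (the order is irrelevant for Fleiss' Kappa).
CATEGORIES = ["low", "some concerns", "high"]


def fleiss_kappa(ratings, categories=CATEGORIES):
    """Fleiss' Kappa for items each rated by the same number of raters.

    ratings: list of items; each item is a list with one label per rater.
    This layout is an assumption made for the sketch.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if any(len(item) != n_raters for item in ratings):
        raise ValueError("every item needs the same number of ratings")

    # counts[i][c] = number of raters who assigned item i to category c
    counts = [Counter(item) for item in ratings]

    # Observed agreement: mean over items of the proportion of agreeing rater pairs
    p_items = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_bar = sum(p_items) / n_items

    # Chance agreement from the overall share of ratings in each category
    total = n_items * n_raters
    p_cat = [sum(c[cat] for c in counts) / total for cat in categories]
    p_e = sum(p * p for p in p_cat)

    if p_e == 1.0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return (p_bar - p_e) / (1 - p_e)


# Toy data (hypothetical): four raters split 2-2 on each of two results.
split = [["low", "low", "high", "high"], ["low", "low", "high", "high"]]
print(round(fleiss_kappa(split), 2))  # -0.33
```

A negative value, as in the toy data, means the raters agreed less often than would be expected by chance, which is the situation reported here as "no agreement".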

Results

Eighty results related to seven outcomes from 16 RCTs were assessed. During the calibration exercise, we reached no agreement for the overall judgment (IRR -0.15); IRR for individual domains ranged from no agreement to fair. Mean time to apply the tool was 168.5 minutes per study. Time to complete the calibration exercise and develop the ID was about 40 hours. After adoption of the ID, overall agreement increased to slight (IRR 0.11) for the first five studies and to moderate (IRR 0.42) for the remaining 11. IRR for individual domains ranged from no agreement to almost perfect. Mean time to apply the tool decreased to 41 minutes.
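The verbal labels attached to the IRR values above are consistent with the commonly used Landis and Koch bands for Kappa. This excerpt does not state which scale was applied, so the cut-offs in the sketch below are an assumption, and the function name `agreement_label` is hypothetical.

```python
def agreement_label(kappa):
    """Map a Kappa value to a verbal label (assumed Landis and Koch bands)."""
    if kappa < 0:
        return "no agreement"
    # (upper bound, label) pairs, checked in ascending order
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"


# Overall-judgment values reported in the Results above
for value in (-0.15, 0.11, 0.42):
    print(value, agreement_label(value))
```

Under these assumed bands, the three overall-judgment values map to "no agreement", "slight" and "moderate", matching the wording of the Results.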

Conclusion

The RoB2 tool is comprehensive but complex, even for highly experienced raters. Developing an ID specific to the review may improve reliability substantially.

Introduction

Evaluating the risk of bias (RoB) of studies included in systematic reviews (SRs) is essential to document potential flaws and the internal validity of the review's results, and it is necessary to assess the certainty of the evidence [1]. In February 2008, the Cochrane Collaboration published a new tool [2,3] to assess the RoB of randomised trials (RCTs), which has been widely used, even in non-Cochrane SRs [4].

In 2019 the Revised Cochrane risk-of-bias tool for randomised trials (RoB2) was published [5], with the aim of overcoming some limitations of the original version, such as the inconsistent use of the tool, the overuse of the “unclear” judgment, and the lack of an overall judgment domain. Moreover, the new tool addressed advances in the theoretical discussion around bias (e.g., open-label studies should not automatically be judged at high risk of bias; selective reporting of specific measures of a given outcome should be assessed separately from selective non-reporting of outcomes; RoB should be assessed at the level of results) [6]. Finally, the new tool has specific versions for cross-over [7] and cluster randomised trials [8].

In 2020 a study assessed the inter-rater reliability (IRR) of RoB2 applied to a random sample of RCTs from PubMed and showed low reliability and challenges in its application [9]. The study authors commented that “understanding and applying the tool properly is complex and demanding and may hamper its wide adoption and correct application”; they also reported that “the review team must have proven knowledge of the subject matter, clinical epidemiology, and statistics to ensure proper application.” However, the study authors underlined that their results could have poor generalizability, as RoB assessment is normally performed within a systematic review of studies in the same clinical area and review authors usually have proven knowledge of the subject matter; therefore, their results might not be fully applicable to the real context in which the tool is used.

The aim of this study was to assess the IRR of the RoB2 tool between reviewers when it is used to assess the RoB of the studies included in a systematic review, and to explore the feasibility and utility of developing an implementation document (ID) specific to the review context. Moreover, we explored the review burden in terms of the time needed to apply the tool and to develop the ID.

Section snippets

Risk of bias assessment with RoB2

We applied RoB2 to the outcome results defined as critical or important in the protocol of the Cochrane Review “Cannabis and cannabinoids for people with multiple sclerosis” [10] and reported in the individually randomized parallel-group trials that were included in the review (Table 1). A RoB2 pilot project was undertaken by Cochrane including reviews that were using RoB2 for the first time. The pilot involved web clinics where authors could ask questions and bring their concerns to the RoB2

Characteristics of the studies

The four raters independently applied RoB2 to 80 results related to seven outcomes reported in 16 RCTs. Table 2 reports the main features of the studies (see Appendix B for references to the included studies).

Inter-rater reliability of the calibration exercise and time needed

Agreement was fair for the “randomisation process” domain (IRR 0.30); slight for the domain “missing outcome data” (IRR 0.08) and for “selection of reported results” (IRR 0.12); no agreement was reached for “measurement of the outcome” (IRR -0.24), “deviations from the intended

Main findings

Four raters applied RoB2 to 80 results related to seven review outcomes from 16 parallel RCTs. During the calibration exercise, no agreement was reached for the overall judgment; for single domains, agreement ranged from none to fair. The calibration exercise, which consisted of the evaluation of the first five studies and resolution of disagreement by discussion, together with the subsequent development of an ID with detailed instructions on how to answer the SQs in the context of the

Funding sources

This study did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Authors’ contributions

Silvia Minozzi and Graziella Filippini conceived the study and planned data collection. Silvia Minozzi, Graziella Filippini, Francesca Borrelli and Kerry Dwan assessed the RCTs and applied RoB2. Kerry Dwan planned and performed the statistical analyses. All authors contributed to the development of the implementation document and to data interpretation. Silvia Minozzi and Graziella Filippini drafted the first version of the manuscript. All authors contributed to and approved the final version.

Conflicts of interest: SM, GF and FB have no competing interests to declare. KD is employed by Cochrane.

1

Kerry Dwan, Methods Support Unit, Editorial & Methods Department, Cochrane Central Executive, Cochrane, St Albans House, 57-59 Haymarket, London

2

Francesca Borrelli, Department of Pharmacy, School of Medicine and Surgery, University of Naples Federico II, Naples, Italy.

3

Graziella Filippini MD, Cochrane Review Group on Multiple Sclerosis and Rare Diseases of the CNS, Carlo Besta Foundation and Neurological Institute, Milan, Italy