Table 4

Rating of concepts by Delphi respondents in rounds 1 and 2

| Domain | Concept related to risk of bias in network meta-analysis | Round | Included (70% consensus) | Strongly disagree | Disagree | Neither agree nor disagree | Agree | Strongly agree | Unable to score | Group median (Q1, Q3) |
|---|---|---|---|---|---|---|---|---|---|---|
| Network characteristics/geometry | 1: Whether all interventions in the network (including comparators) were potentially suitable for all eligible study respondents | 1 | Y | 0/27 | 0/27 | 2/27 (7%) | 10/27 (37%) | 15/27 (56%) | 1 | 5 (4, 5) |
| | 2: Whether any interventions were inappropriately excluded from the network (eg, through eligibility criteria or after seeing the results) | 1 | Y | 0/27 | 1/27 (4%) | 3/27 (11%) | 11/27 (41%) | 12/27 (44%) | 1 | 4 (4, 5) |
| | 3: Whether importantly different intervention strategies were kept as distinct nodes in the network (ie, whether appropriate groupings were made of interventions—lumping vs splitting) | 1 | Y | 0/28 | 2/28 (7%) | 5/28 (18%) | 11/28 (39%) | 9/28 (32%) | 0 | 4 (3, 5) |
| Effect modifiers | 4: Whether effect-modifying participant characteristics are sufficiently similar across the whole network | 1 | Y | 0/28 | 0/28 | 2/28 (7%) | 11/28 (39%) | 15/28 (53%) | 0 | 4.5 (4, 5) |
| | 5: Whether outcomes and time points are sufficiently similar across the whole network | 1 | Y | 0/28 | 0/28 | 4/28 (14%) | 14/28 (50%) | 10/28 (36%) | 0 | 4 (4, 5) |
| | 6: Whether study-level risks of bias are sufficiently similar across the whole network | 1 | N | 0/28 | 9/28 (32%) | 5/28 (18%) | 8/28 (29%) | 5/28 (18%) | 0 | 3 (2, 4) |
| | | 2 | N | 2/22 (9%) | 8/22 (36%) | 5/22 (23%) | 5/22 (23%) | 2/22 (9%) | 0 | 3 (2, 4) |
| | 7: Whether other trial characteristics are sufficiently similar across the whole network | 1 | N | 0/24 | 4/24 (17%) | 9/24 (38%) | 7/24 (29%) | 4/24 (17%) | 4 | 3 (3, 4) |
| | | 2 | N | 0/21 | 4/21 (19%) | 8/21 (38%) | 8/21 (38%) | 1/21 (5%) | 1 | 3 (3, 4) |
| Statistical synthesis | 8: Whether an appropriate prespecified approach was used in node making | 1 | Y | 1/25 (4%) | 1/25 (4%) | 5/25 (20%) | 13/25 (52%) | 5/25 (20%) | 3 | 4 (3, 4) |
| | 9: Whether a process was used to define nodes in the network (eg, undertaken independently by two reviewers, following a preplanned node-making process) | 1 | N | 0/25 | 2/25 (8%) | 11/25 (44%) | 9/25 (36%) | 3/25 (12%) | 3 | 3 (3, 4) |
| | | 2 | N | 1/20 (5%) | 1/20 (5%) | 5/20 (25%) | 11/20 (55%) | 2/20 (10%) | 2 | 4 (3, 4) |
| | 10: Whether effect metric(s) for each outcome (eg, ORs, risk difference) in the network were presented with CIs/credible intervals | 1 | N | 3/28 (11%) | 4/28 (14%) | 2/28 (7%) | 6/28 (21%) | 13/28 (46%) | 0 | 4 (2.25, 5) |
| | | 2 | N | 3/22 (14%) | 7/22 (32%) | 2/22 (9%) | 2/22 (9%) | 8/22 (36%) | 0 | 3 (2, 5) |
| | 11: If disconnected networks were connected to perform the analysis, whether methods to do this were appropriate | 1 | Y | 0/25 | 2/25 (8%) | 4/25 (16%) | 13/25 (52%) | 6/25 (24%) | 3 | 4 (3.5, 4) |
| | 12: Whether methods used to represent multi-arm studies in the dataset and in the analysis are appropriate | 1 | Y | 0/28 | 4/28 (14%) | 3/28 (11%) | 10/28 (36%) | 11/28 (39%) | 0 | 4 (3.25, 5) |
| | 13: Whether assumptions across the network about homogeneity/heterogeneity of effects within comparisons are appropriate | 1 | Y | 0/27 | 1/27 (4%) | 6/27 (22%) | 13/27 (48%) | 7/27 (26%) | 1 | 4 (3, 4) |
| | 14: Whether a valid approach was used to determine whether there was conflict between direct and indirect sources of evidence on the same comparisons (often called inconsistency or incoherence) | 1 | Y | 0/28 | 2/28 (7%) | 0/28 | 8/28 (29%) | 18/28 (64%) | 0 | 5 (4, 5) |
| | 15: If inconsistency detected, then whether methods such as re-evaluation of the choice of scale, effect modification and similarity of the contributing randomised controlled trials were investigated | 1 | N | 0/28 | 4/28 (14%) | 5/28 (18%) | 14/28 (50%) | 5/28 (18%) | 0 | 4 (3, 4) |
| | | 2 | N | 0/22 | 4/22 (18%) | 4/22 (18%) | 9/22 (41%) | 5/22 (23%) | 0 | 4 (3, 4) |
| | 16: If a Bayesian analysis was conducted, whether the selection of prior distributions was justified | 1 | Y | 0/28 | 3/28 (11%) | 2/28 (7%) | 13/28 (46%) | 10/28 (36%) | 0 | 4 (4, 5) |
| | 17: Whether the analysis appropriately addressed any differences in effect modifiers across different parts of the network | 1 | Y | 0/28 | 2/28 (7%) | 3/28 (11%) | 18/28 (64%) | 5/28 (18%) | 0 | 4 (4, 4) |
| | 18: Whether there was evidence of conflicting results between direct and indirect evidence (often called inconsistency and incoherence in results) | 1 | Y | 0/28 | 3/28 (11%) | 0/28 | 10/28 (36%) | 14/28 (50%) | 0 | 4 (4, 5) |
| | 19: If there were conflicting results between direct and indirect evidence, was this addressed appropriately (eg, meta-regression, data extraction errors, redefining the network) | 1 | Y | 0/27 | 2/27 (7%) | 1/27 (4%) | 13/27 (48%) | 11/27 (41%) | 1 | 4 (4, 5) |
| | 20: Evidence that the statistical model, as it was used to get the key results, was not suitable for the data (eg, from analysis of residuals or information criteria such as DIC) | 1 | N | 0/28 | 7/28 (25%) | 6/28 (21%) | 9/28 (32%) | 6/28 (21%) | 0 | 3.5 (2.25, 4) |
| | | 2 | N | 0/21 | 1/21 (5%) | 9/21 (43%) | 7/21 (33%) | 4/21 (19%) | 1 | 4 (3, 4) |
| | 21: Whether sensitivity analyses demonstrate that findings were robust to the statistical model and estimation methods (including prior distributions if Bayesian methods were used) | 1 | N | 0/28 | 2/28 (7%) | 9/28 (32%) | 8/28 (29%) | 9/28 (32%) | 0 | 4 (3, 4.75) |
| | | 2 | N | 0/22 | 1/22 (5%) | 8/22 (36%) | 7/22 (32%) | 6/22 (27%) | 0 | 4 (3, 4.75) |
| | 22: Whether limitations at study and outcome level (eg, risk of bias), and at review level (eg, incomplete retrieval of identified research, reporting bias) were discussed | 1 | Y | 0/27 | 1/27 (4%) | 4/27 (15%) | 8/27 (30%) | 13/27 (48%) | 1 | 4 (4, 5) |
  • *Blank responses and ‘unable to score’ responses were not counted in the denominator.

  • DIC, deviance information criterion; N, no; Y, yes.
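The summary statistics in the table can be reproduced from the raw Likert scores. As a minimal sketch, assuming the 70% consensus rule means at least 70% of scored responses were "agree" (4) or "strongly agree" (5), and that blanks and "unable to score" responses are dropped before computing percentages and quartiles (per the table footnote); `summarise` is a hypothetical helper, not from the source:

```python
from statistics import quantiles

def summarise(ratings):
    """Summarise scored Likert responses (1-5).

    ratings: list of scores with blanks and 'unable to score'
    already excluded, per the table footnote.
    Returns (% agree or strongly agree, median, (Q1, Q3)).
    """
    # Proportion of respondents scoring 4 (agree) or 5 (strongly agree);
    # the 70% inclusion threshold is an assumption about the consensus rule.
    agree_pct = 100 * sum(r >= 4 for r in ratings) / len(ratings)
    # Inclusive quartiles match the medians and (Q1, Q3) shown in the table.
    q1, med, q3 = quantiles(ratings, n=4, method="inclusive")
    return round(agree_pct), med, (q1, q3)

# Concept 1, round 1: 2 "neither", 10 "agree", 15 "strongly agree" (n=27)
pct, med, iqr = summarise([3] * 2 + [4] * 10 + [5] * 15)
```

For concept 1 this yields 93% agreement with a median of 5 (Q1 4, Q3 5), consistent with its "Y" inclusion and the reported median (4, 5).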