The performance of tests of publication bias and other sample size effects in systematic reviews of diagnostic test accuracy was assessed
Introduction
The validity of a systematic review depends on minimizing bias in the identification of studies. If the results of the studies included in a review systematically differ from those of relevant studies that were missed, then the review's findings will be compromised by publication bias [1], [2]. Systematic reviewers are therefore advised to use comprehensive searches in an attempt to locate all relevant studies [3], [4], [5].
In stark contrast to the substantial literature and empirical evidence available for randomized controlled trials [1], [6], [7], [8], [9], [10], [11], there has been little research into the determinants, magnitude, and impact of publication bias for studies of diagnostic test accuracy. Recently, funnel plot analyses developed for investigating publication bias in randomized trials have been recommended [12] and used for reviews of test accuracy [13], [14]. Evidence that the performance of these tests deteriorates as odds ratios increase raises concern that they may not be appropriate [15], [16], [17].
Determinants of publication bias are likely to be different for investigations of test accuracy. The analysis of a study of test accuracy typically involves computation of estimates of sensitivity and specificity (or possibly likelihood ratios), together with 95% confidence intervals [18]. In contrast to reporting of randomized trials, there is no stated null hypothesis or computation of an associated P-value. Thus, publication bias is unlikely to be associated with statistical nonsignificance.
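As a concrete illustration of the analysis described above, the following is a minimal sketch of how sensitivity and specificity with 95% confidence intervals might be computed from a single study's 2×2 table. The function name is hypothetical, and simple Wald intervals are assumed here for brevity (published reviews often use exact or Wilson intervals instead):

```python
import math

def accuracy_with_ci(tp, fn, fp, tn, z=1.96):
    """Sensitivity and specificity from a 2x2 table, each with a
    Wald-type 95% confidence interval (estimate, lower, upper).
    tp/fn are diseased subjects; fp/tn are nondiseased subjects."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    se_sens = math.sqrt(sens * (1 - sens) / (tp + fn))
    se_spec = math.sqrt(spec * (1 - spec) / (tn + fp))
    return {
        "sensitivity": (sens, sens - z * se_sens, sens + z * se_sens),
        "specificity": (spec, spec - z * se_spec, spec + z * se_spec),
    }
```

Note that, consistent with the point made above, no null hypothesis is tested and no P-value is produced; the output is purely a pair of interval estimates.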
Funnel plots can detect any effect that is related to sample size. Publication bias is the most commonly cited sample-size-related effect, but other factors such as study quality or the type of population may also be related to sample size. Here we explore theoretical issues that underpin the investigation of any sample size effect for diagnostic tests and develop funnel plots that are appropriate for reviews of test accuracy. Section 2 reviews existing tests for funnel plot asymmetry and considers how their performance is likely to be affected by characteristics typical of studies of test accuracy. Section 3 introduces a new funnel plot and tests for asymmetry that we apply, together with existing tests, to a case study in section 4. Through simulation, described in sections 5 and 6, we evaluate the performance of new and existing funnel plot–based tests for detecting publication bias, and estimate the impact of publication bias on estimates of diagnostic accuracy. We base our investigations on the assumption that the probability of publication decreases with lower values of diagnostic accuracy, and investigate the impact of four possible selective publication mechanisms.
Detection of publication bias and other sample size effects using funnel plots
The funnel plot has been recommended as a graphical device for investigating the possibility of publication bias or other sample size effects in reviews of randomized controlled trials [19]. By plotting estimates of study findings, usually the log odds ratio (lnOR), against their sample size or precision (estimated by the reciprocal of the standard error), indirect evidence of bias can be discerned from the shape of the plot. In the absence of a sample size effect, the points will form a symmetrical, inverted funnel.
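The coordinates of one point on such a funnel plot can be sketched as follows. This is a minimal illustration, not the paper's code: the function name is hypothetical, and the 0.5 continuity correction for zero cells is one common convention, assumed here:

```python
import math

def funnel_point(a, b, c, d, corr=0.5):
    """Coordinates of one study on an lnOR funnel plot:
    (log odds ratio, precision = 1/SE) from the 2x2 table a, b, c, d.
    A continuity correction is added to every cell if any cell is zero."""
    if 0 in (a, b, c, d):
        a, b, c, d = (x + corr for x in (a, b, c, d))
    lnor = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return lnor, 1 / se
```

Plotting these pairs for all studies in a review, with precision on the vertical axis, yields the inverted-funnel shape described above when no sample size effect is present.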
A robust funnel plot and test for asymmetry suitable for use with meta-analyses of diagnostic test accuracy
A funnel plot for studies of diagnostic test accuracy should not display asymmetry if variation in the magnitude of the DOR is due solely to sampling error and/or there is variation in test thresholds. In Appendix A, we show that the SE of the lnDOR does not fulfill these criteria. The only term to behave appropriately was the sample-size-dependent term (1/n1 + 1/n2)^1/2, which is equal to 2/ESS^1/2, where ESS = 4n1n2/(n1 + n2) is the effective sample size. Consequently, we propose that funnel plots for diagnostic test accuracy plot the lnDOR against 1/ESS^1/2.
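The proposed plot, and an accompanying regression-based asymmetry test, can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: lnDOR is regressed on 1/ESS^1/2 by weighted least squares with weights equal to ESS, and a slope far from zero suggests funnel asymmetry (significance testing of the slope is omitted for brevity):

```python
import math

def ess(n1, n2):
    """Effective sample size: ESS = 4*n1*n2/(n1 + n2),
    so that (1/n1 + 1/n2)**0.5 == 2/ESS**0.5."""
    return 4 * n1 * n2 / (n1 + n2)

def deeks_regression(lndors, n1s, n2s):
    """Weighted least squares of lnDOR on 1/sqrt(ESS), weights = ESS.
    Returns (intercept, slope); a nonzero slope indicates asymmetry."""
    xs = [1 / math.sqrt(ess(n1, n2)) for n1, n2 in zip(n1s, n2s)]
    ws = [ess(n1, n2) for n1, n2 in zip(n1s, n2s)]
    sw = sum(ws)
    xbar = sum(w * x for w, x in zip(ws, xs)) / sw
    ybar = sum(w * y for w, y in zip(ws, lndors)) / sw
    sxx = sum(w * (x - xbar) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - xbar) * (y - ybar)
              for w, x, y in zip(ws, xs, lndors))
    slope = sxy / sxx
    return ybar - slope * xbar, slope
```

The intercept estimates the lnDOR of a hypothetical study of infinite size, and the slope captures any trend in accuracy with decreasing effective sample size.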
Case study
Kearon et al. [28] reviewed the diagnostic accuracy of noninvasive tests for detecting deep vein thrombosis. They located 14 suitable studies comparing venous ultrasonography in asymptomatic patients with venography (the reference standard).
Three alternative funnel plots are presented in Fig. 1 plotting lnDOR against (a) the standard error of the lnDOR, (b) the total sample size, and (c) the inverse of the square root of the effective sample size. For computation of the standard error, addition
Evaluation by simulation
Because publication bias is one of the best-known sample size effects, we evaluated the performance of the existing tests and the proposed new tests for sample-size-related effects by simulating meta-analyses of diagnostic tests with and without publication bias.
Simulations were undertaken in Stata version 8 (StataCorp, College Station, TX, USA). Each data set contained results from 20 studies (k = 20). Sample sizes were redefined for each simulation and varied between n = 20 and n = 2,000 (randomly
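The simulations were run in Stata, but the basic design can be sketched in Python as follows. This is a simplified illustration, not a reproduction of the paper's simulation code: the function names, the fixed sensitivity/specificity values, the equal split of diseased and nondiseased, and the form of the selection rule are all assumptions made for the sketch (the paper studies four selective publication mechanisms, here abstracted into a single probability-of-publication function of the observed lnDOR):

```python
import math
import random

def simulate_study(n_diseased, n_nondiseased, sens=0.8, spec=0.8, corr=0.5):
    """Simulate one 2x2 diagnostic study and return its observed lnDOR.
    A 0.5 continuity correction handles zero cells."""
    tp = sum(random.random() < sens for _ in range(n_diseased))
    tn = sum(random.random() < spec for _ in range(n_nondiseased))
    fn, fp = n_diseased - tp, n_nondiseased - tn
    if 0 in (tp, fn, fp, tn):
        tp, fn, fp, tn = (x + corr for x in (tp, fn, fp, tn))
    return math.log((tp * tn) / (fn * fp))

def simulate_meta(k=20, select=None, n_range=(20, 2000)):
    """Simulate a meta-analysis of k *published* studies. `select`, if
    given, maps an observed lnDOR to a probability of publication, so
    studies with lower accuracy can be suppressed."""
    results = []
    while len(results) < k:
        n = random.randint(*n_range)
        n1 = n // 2  # equal diseased/nondiseased (assumption of this sketch)
        lndor = simulate_study(n1, n - n1)
        if select is None or random.random() < select(lndor):
            results.append(lndor)
    return results
```

Repeating `simulate_meta` many times, with and without a `select` rule, and applying each asymmetry test to every simulated meta-analysis gives empirical type I error rates (no selection) and power (with selection).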
Type I error rates
Empirical type I error rates for the base scenario and a selection of parameter combinations are shown in Fig. 4. In the base scenario with a DOR of one, a diagnostic threshold set so that sensitivity = specificity, and with equal numbers of diseased and nondiseased (Fig. 4, row 1, column 1), all tests achieve empirical type I error rates close to the nominal 2.5% and 5% values in both tails, although rates for the rank correlation tests B(SE) and B/D(ESS) are a little low. The percentage
Discussion
We found that a funnel plot can be used to identify a sample size related effect, such as that caused by publication bias, in reviews of diagnostic test accuracy. The Begg, Egger, and Macaskill tests of funnel plot asymmetry used for RCTs are, however, likely to be seriously misleading if applied in typical diagnostic test scenarios: DORs usually take values well above one, test thresholds often preferentially favor sensitivity over specificity (or vice versa), and there are usually fewer diseased than nondiseased participants.
Acknowledgments
We are grateful to Fujian Song for providing datasets from his study [12], and to Patrick Bossuyt for comments on a previous draft. The work was supported by National Health and Medical Research Council (NHMRC) grant 211205 to the Screening and Test Evaluation Program. Jon Deeks is supported by a U.K. Department of Health Senior Research Fellowship in Evidence Synthesis.
References (32)
- Accuracy of outpatient endometrial biopsy in the diagnosis of endometrial cancer: a systematic quantitative review. BJOG (2002)
- Publication and related bias in meta-analysis: power of statistical tests and prevalence in the literature. J Clin Epidemiol (2000)
- Funnel plots for detecting bias in meta-analysis: guidelines on choice of axis. J Clin Epidemiol (2001)
- Misleading funnel plot for detection of bias in meta-analysis. J Clin Epidemiol (2000)
- Publication bias: a problem in interpreting medical data. J R Stat Soc A (1988)
- Publication bias
- Identifying relevant studies for systematic reviews. BMJ (1994)
- How important are comprehensive literature searches and the assessment of trial quality in systematic reviews? Empirical study. Health Technol Assess (2003)
- Publication and related biases. Health Technol Assess
- How important is publication bias? A synthesis of available data. AIDS Educ Prev
- The existence of publication bias and risk factors for its occurrence. JAMA
- Effect of the statistical significance of results on the time to completion and publication of randomized efficacy trials. JAMA
- Publication bias: evidence of delayed publication in a cohort study of clinical research projects. BMJ
- Asymmetric funnel plots and publication bias in meta-analyses of diagnostic accuracy. Int J Epidemiol
1. Present address: Centre for Statistics in Medicine, Wolfson College Annex, Linton Road, Oxford OX2 6UD, UK.