Objectives We undertook a rapid systematic review with the aim of identifying evidence that could be used to answer the following research questions: (1) What is the clinical effectiveness of tests that detect the presence of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) to inform COVID-19 diagnosis? (2) What is the clinical effectiveness of tests that detect the presence of antibodies to the SARS-CoV-2 virus to inform COVID-19 diagnosis?
Design and setting Systematic review and meta-analysis of studies of diagnostic test accuracy. We systematically searched for all published evidence on the effectiveness of tests for the presence of SARS-CoV-2 virus, or antibodies to SARS-CoV-2, up to 4 May 2020, and assessed relevant studies for risks of bias using the QUADAS-2 framework.
Main outcome measures Measures of diagnostic accuracy (sensitivity, specificity, positive/negative predictive value) were the main outcomes of interest. We also included studies that reported influence of testing on subsequent patient management, and that reported virus/antibody detection rates where these facilitated comparisons of testing in different settings, different populations or using different sampling methods.
Results 38 studies on SARS-CoV-2 virus testing and 25 studies on SARS-CoV-2 antibody testing were identified. We identified high or unclear risks of bias in the majority of studies, most commonly as a result of unclear methods of patient selection and test conduct, or because of the use of a reference standard that may not definitively diagnose COVID-19. The majority were in hospital settings, in patients with confirmed or suspected COVID-19 infection. Pooled analysis of 16 studies (3818 patients) estimated a sensitivity of 87.8% (95% CI 81.5% to 92.2%) for an initial reverse-transcriptase PCR test. For antibody tests, 10 studies reported diagnostic accuracy outcomes: sensitivity ranged from 18.4% to 96.1% and specificity 88.9% to 100%. However, the lack of a true reference standard for SARS-CoV-2 diagnosis makes it challenging to assess the true diagnostic accuracy of these tests. Eighteen studies reporting different sampling methods suggest that for virus tests, the type of sample obtained/type of tissue sampled could influence test accuracy. Finally, we searched for, but did not identify, any evidence on how any test influences subsequent patient management.
Conclusions Evidence is rapidly emerging on the effectiveness of tests for COVID-19 diagnosis and management, but important uncertainties about their effectiveness and most appropriate application remain. Estimates of diagnostic accuracy should be interpreted bearing in mind the absence of a definitive reference standard to diagnose or rule out COVID-19 infection. More evidence is needed about the effectiveness of testing outside of hospital settings and in mild or asymptomatic cases. Implementation of public health strategies centred on COVID-19 testing provides opportunities to explore these important areas of research.
- evidence-based practice
- health services research
- infectious disease medicine
- respiratory tract infections
- public health
Data availability statement
Data sharing not applicable as no datasets generated and/or analysed for this study.
This article is made freely available for personal use in accordance with BMJ’s website terms and conditions for the duration of the covid-19 pandemic or until otherwise determined by BMJ. You may use, download and print the article for any lawful, non-commercial purpose (including text and data mining) provided that all copyright notices and trade marks are retained.https://bmj.com/coronavirus/usage
Statistics from Altmetric.com
If you wish to reuse any or all of this article please use the link below which will take you to the Copyright Clearance Center’s RightsLink service. You will be able to get a quick price and instant permission to reuse the content in many different ways.
- evidence-based practice
- health services research
- infectious disease medicine
- respiratory tract infections
- public health
What is already known about this subject?
Tests for the presence of the SARS-CoV-2 virus, and antibodies to the virus, are being deployed rapidly and at scale as part of the global response to COVID-19.
At the outset of this work (March 2020), no high-quality evidence reviews on the effectiveness of SARS-CoV-2 virus or antibody tests were available.
High-quality evidence reviews are required to help decision-makers deploy and interpret these tests effectively.
What are the new findings?
Here, we synthesise evidence on the diagnostic accuracy of all known tests for SARS-CoV-2, as well as tests for antibodies to SARS-CoV-2.
We also systematically summarise evidence on the influence of tissue sample site on virus test detection rates and the influence of test timing relative to disease course on antibody detection. The results suggest that both these factors could influence test results.
We conclude that evidence on SARS-CoV-2 virus and antibody tests is nascent and significant uncertainties remain in the evidence base regarding their clinical and public health application. We also note that potential risks of bias exist within many of the available studies.
How might it impact on clinical practice in the foreseeable future?
In a rapidly developing pandemic, the widespread use of testing is an essential element in the development of effective public health strategies, but it is important to acknowledge the gaps and limitations that exist in the current evidence base and that, where possible, these should be addressed in future studies.
In particular, more evidence is needed on the performance of point-of-care or near-patient tests compared with their laboratory equivalents, and results of testing in people with no or minimal symptoms in community-based settings need further analysis.
In December 2019, a novel coronavirus was discovered in Wuhan, China, which has since spread rapidly across the world. This virus was named severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), and the disease that it causes, COVID-19. Early on in the pandemic, the World Health Organization (WHO stated that testing for the virus should be considered for symptomatic patients on the basis of the suspicion and likelihood of COVID-19, as well as in those who are asymptomatic or minimally symptomatic but who have been in contact with confirmed cases.1 More recently, WHO highlighted the importance of testing for disease surveillance, to limit the spread of the disease and to manage COVID-19 risk during attempts to restore normal economic and social functioning.2 Furthermore, the Organization for Economic Co-operation and Development have identified the potential importance of testing when combined with effective contact tracing in suppressing local outbreaks of COVID-19 as well as in determining individuals who have been previously infected who may safely re-integrate into work and healthcare environments.3
Tests for COVID-19 fall into two broad groups: tests that detect the presence of SARS-CoV-2 virus and tests that detect the presence of antibodies to SARS-CoV-2. Tests for the presence of virus usually use methods that recognise and amplify SARS-CoV-2 viral nucleic acid, such as reverse-transcriptase polymerase chain reaction (RT-PCR) or isothermal amplification. SARS CoV-2 virus testing is usually done in a specialised laboratory setting using respiratory samples, such as nasopharyngeal swabs, but near-patient tests have also been developed. SARS CoV-2 antibody testing (also called serology testing) is done on blood or serum samples and tests have been developed both for analysis in a laboratory and a near-patient setting. Since antibodies are produced as part of the body’s immune response to infection, serology tests may be useful to identify ongoing, recovering (convalescent) or previous SARS-CoV-2 infection.
The validation and application of the different tests for COVID-19, whether for individual clinical decision-making or population-based public health strategies, is dependent on the accuracy and performance of these tests. The purpose of this review is to identify, appraise and summarise the published evidence on the diagnostic performance and effectiveness of SARS-CoV-2 virus and antibody tests in the diagnosis and management of current or previous COVID-19. The review also explores the influence of a range of factors on test outcomes, such as the timing of testing relative to first diagnosis/symptom onset, sampling methods, and whether testing is laboratory based or done at point of care.
We systematically searched for evidence to answer the following questions:
What is the clinical effectiveness of tests that detect the presence of the SARS-CoV-2 virus to inform COVID-19 diagnosis?
What is the clinical effectiveness of tests that detect the presence of antibodies to the SARS-CoV-2 virus to inform COVID-19 diagnosis?
Searching and screening for both questions was undertaken based on one search strategy. Initial scoping-level evidence searches were conducted using online databases set up to aggregate COVID-19–specific evidence.4–6
Based on the results of these, a specific search strategy (online supplementary appendix 1; developed and run by JW) was used to capture published evidence on SARS-CoV-2 diagnostics. The databases searched were Medline, Embase, Cochrane Library, International Network of Agencies for Health Technology Assessment (INAHTA) database and Open Grey, to include all evidence published up to 4 May 2020. The sources included in the Health Technology Wales COVID-19 Evidence Digest 7 were hand-searched for relevant evidence and key stakeholders in Wales contacted for any published or unpublished data of relevance to this review. Because this was a rapid review, the protocol was not prospectively published.
Articles were included that studied any test to detect the presence of SARS-CoV-2, or antibodies to SARS-CoV-2, in people suspected of having recent or ongoing infection, and reported detection rates, influence of test result on changes in patient management or diagnostic accuracy. For the latter outcome, we included studies that used any suitable reference standard method of diagnosis (we excluded studies that used CT scan results alone as a reference standard). The detailed criteria used to select evidence are provided in online supplemental appendix 1. The following data were extracted from all studies deemed relevant: study design; number of centres and their location(s); dates of enrolment; inclusion/exclusion criteria; number of patients included; age and sex of included patients; test type; test target; test supplier or manufacturer; reference standard; outcome data for each relevant outcome reported. We used the QUADAS-2 tool to assess risk of bias and applicability of relevant articles.8 Two authors (DJ and LE) screened studies, extracted data and carried out QUADAS-2 assessments; results were checked by a third author (KC) and any disagreements were resolved by consensus.
Meta-analysis of diagnostic accuracy outcomes (sensitivity and/or specificity) was conducted only for suitable studies that reported numbers of true and false-positive and false-negative results validated against a suitable reference standard. Pooled estimates were calculated for diagnostic accuracy outcomes using a random-effects bivariate binomial model in MetaDTA V.1.25.9
Figure 1 summarises articles included and excluded at each stage, and reasons that studies were excluded. A total of 13 677 unique articles were screened for eligibility, of which 13 285 were excluded after reading the title and abstract because they did not meet our inclusion criteria. The full text of the remaining 392 articles were read and checked for eligibility, and a further 329 were excluded. Of the remaining 63 relevant articles, 38 studied virus tests and 25 studied antibody tests. Tables 1 and 2 summarise the design and characteristics of studies reporting diagnostic accuracy outcomes for virus and antibody tests, respectively. Characteristics of studies that reported other outcomes are reported in online supplemental appendix 2.
All the articles that reported on virus detection were based on the detection of amplified viral SARS-CoV-2 nucleic acid sequences. Most studies used laboratory-based RT-PCR tests conducted using standard in-house or commercially available reagents and equipment, although in some cases, assay details were not reported. The RT-PCR primer used (ie, which part of the viral RNA is targeted and amplified) varied between studies, although again in some cases, primer details were not reported. In addition to RT-PCR, we identified five studies reporting the diagnostic performance of isothermal amplification assays.
The antibody tests studied used a range of different assay methods to detect one or more antibody type (different immunoglobulin classes and/or antibody targeted). In seven of the studies, tests were laboratory based (ELISA).10–16 We identified 17 studies using assays (lateral-flow immunoassay (LFIA); chemiluminescent immunoassay (CLIA); colloidal gold immunochromatographic assay (GICA)) that could be suitable for point-of-care use,13 17–32 but the tests were not applied at point of care, or it was not clearly reported that the test had been applied at point of care, in 14 of these studies. In two studies, the type of assay was unclear.33 34
The reliability and applicability of each study’s conduct and reporting was assessed using the QUADAS-2 tool.8 Virus tests and antibody tests were assessed separately and summary judgements are shown in figure 2; signalling questions used and judgements per study are shown in online supplemental appendix 3.
For virus tests, most studies were judged to be of high or unclear risk of bias regarding patient selection, either because patients were selected for the study in a way that could have introduced bias (11% of studies) or because the method of patient selection was unclear (56% of studies). Risk of bias regarding how the index test was conducted or interpreted was judged to be high or unclear for 14% and 44% of studies, respectively, either because aspects of how the tests were conducted were unclear or because tests were not conducted in a uniform manner. For the 12 studies that included a reference standard, we judged the risk of bias to be unclear in 42% and to be high in 8%, largely because not all tests were compared against a uniform reference standard or some details of the reference standard were uncertain.
For antibody tests, the method of patient selection was judged to be unclear for 36% of studies and high for a further 52%. There was an unclear and high risk of bias regarding how the index test was conducted or interpreted in 72% and 12% of studies, respectively. For the 19 studies that included a reference standard, 42% were judged to have an unclear risk of bias and 16% a high risk of bias.
We identified one existing published meta-analysis estimating the sensitivity of an initial RT-PCR test, using the results of repeated RT-PCR tests as the reference standard.35 This included studies published up to 3 April 2020, and aligned closely with our inclusion criteria regarding virus tests, although the authors included studies of any population size, whereas we elected to exclude studies that included less than 10 patients, meaning we omitted seven studies (total 46 patients) that were included in the earlier meta-analysis. We used data from this analysis and studies published subsequently to determine that the overall sensitivity of RT-PCR is 87.8% (95% CI 81.5% to 92.2%), based on 16 studies of 3818 patients. Because these studies only included cases where COVID-19 was confirmed to be present, specificity cannot be legitimately estimated.
Five studies (972 patients or samples in total) reported the diagnostic accuracy of isothermal amplification assays in the diagnosis of patients with suspected COVID-19, using test results from RT-PCR as a reference standard.36–40 Because the use of a single RT-PCR test as a reference standard may not be representative of true disease presence, we deemed it inappropriate to use the results of these studies to derive a single pooled estimate of sensitivity and specificity. Reported diagnostic sensitivity and specificity estimates range from 74.7% to 100% and 87.7% to 100%, respectively. Table 3 provides a detailed breakdown of results.
Ten studies on antibody testing (757 participants included; number not clear for two studies) reported sensitivity and specificity11 12 17 18 20–24 33 or sufficient information to allow these to be calculated. Two additional studies reported specificity only.10 19 Where a reference standard was included, this was usually RT-PCR (initial and repeats until a positive confirmation); one study that used either RT-PCR or clinical diagnosis to determine final disease status. Furthermore, studies used a range of different antibody types and targets. Because of these limitations, we concluded that pooling data across studies was not appropriate.
The reported sensitivity in these studies ranged from 18.4% to 96.1%. Notably, the lowest reported sensitivity was using a point-of-care test,17 although sensitivity figures below 50% were also reported for one laboratory test.20 Specificity was reported in 12 studies (682 participants included; number not clear for two studies) and ranged from 88.9% to 100%. Full outcomes from these studies are shown in table 4.
The positive predictive value (PPV) and negative predictive value (NPV) of RT-PCR was estimated at different prevalence levels. We used our pooled sensitivity estimate of 87.8%, and because we were unable to calculate specificity using the evidence found, we used a previously published estimate of 98.0% for specificity.41 Prevalence estimates were based on data from Public Health England (PHE).42 A prevalence rate of 3.0% was estimated based on PHE data up to 6 August 2020, which showed that there have been 308 134 confirmed cases of COVID-19 from 10 236 970 tests. At this prevalence level, RT-PCR testing was estimated to have a PPV of 57.7% and NPV of 99.6%. To estimate the utility of the test at times of high prevalence, PPV and NPV were also estimated using PHE data up to 1 May 2020 (the date at which the daily number of cases was at its highest point). On this date, a prevalence rate of 24.6% was estimated based on 177 454 confirmed cases of COVID-19 from 721 124 tests. At this prevalence level, RT-PCR testing was estimated to have a PPV of 93.5% and NPV of 96.1%.
We identified 18 studies30 43–59 that compared RT-PCR for SARS-CoV-2 results from samples taken from different parts of the body. These results are presented in table 5. Most samples were taken from the upper respiratory tract (online supplemental appendix 4 summarises detection rates for individual sites in the upper respiratory tract, where reported). Other sample sites were saliva, sputum and stool/rectal swab. The detection rates varied across sample sites but the heterogeneous nature of the studies makes meaningful comparison difficult. Detection rates were consistently low with urine or tears/conjunctiva sampling. Detection rates in blood samples were mixed, with some studies reporting very low detection rates, while others reported rates that were comparable with samples from other sites in the same population.
The majority of studies tested people with relatively severe disease and a high suspicion of COVID-19 infection. Considering other populations, two studies60 61 tested UK healthcare workers and three studies62–64 tested people outside of hospital (or as outpatients). One further study65 routinely tested pregnant women. All these studies only reported detection rates: results are summarised in online supplemental appendix 4.
Ten studies provided data on antibody detection (seroprevalence) at different points in time after the onset of confirmed COVID-19 disease.10 12–15 26–28 31 34 Detailed results are shown in online supplemental appendix 4.
This review summarises the available published evidence of the effectiveness of tests that are used in the diagnosis of current or previous COVID-19 infection up to 4 May 2020. Despite this work taking place relatively early in the COVID-19 pandemic, 38 published studies were identified that reported on the effectiveness of tests for detecting the presence of SARS CoV-2 virus and 25 studies were identified that reported on testing for the presence of antibodies. Analysis of these studies using the QUADAS-2 framework revealed high or unclear risks of bias in the majority, most commonly as a result of unclear methods of patient selection and test conduct, or because of the use of a reference standard that may not definitively diagnose COVID-19. Nonetheless, the available evidence provides information on which to begin to judge the possible clinical effectiveness of COVID-19 testing, although significant uncertainties remain in the evidence base regarding their clinical and public health application.
In the course of our work, the first meta-analysis of diagnostic accuracy for SARS-CoV-2 virus tests was published by Kim et al.35 This included pooled analysis of 19 studies (1502 patients) and used the results of repeated laboratory-based RT-PCR as the reference standard. This aligned closely with our own inclusion criteria for virus tests, although Kim et al included studies of any population size, we excluded studies with less than 10 patients, meaning we omitted 7 studies (46 patients). However, by including more recently published studies in our analysis the number of patients more than doubles from 1502 to 3818 patients. The analysis by Kim et al estimated that the sensitivity of an initial RT-PCR test is 89% (95% CI 81% to 94%). Our addition of data from more recent studies leads us to conclude a sensitivity of 87.8% (95% CI 81.5% to 92.2%). An analysis estimating the PPV and NPV of RT-PCR showed that the NPV is likely to be high while PPV may be low at times where the prevalence in the tested population is low. The likely prevalence in the tested population should therefore be a key consideration for decision-makers when interpreting test results and deciding on testing strategies. Despite our finding of a high NPV for RT-PCR, uncertainty may remain with a negative test result, especially in the context of high clinical suspicion, and the possibility of a false-negative result also needs to be considered. Possible causes for false-negative tests include laboratory error, sampling error, and variability in viral shedding with the lack or negligible presence of virus nucleic acid in the tissue sampled at the time of sampling. Determining the specificity of SARS-CoV-2 nucleic acid testing is particularly challenging because of the inclusion in the published studies of patients considered to be suffering from COVID-19 as well as the lack of a reference standard that validates the absence of disease. The assessment of overall diagnostic accuracy in laboratory testing for the presence of the SARS-CoV-2 virus is hampered by the absence of a definitive reference standard and by a wide range of target primers, methods and types of sampling used in the published studies. In addition, there is very limited published information on the diagnostic accuracy of point-of-care or near-patient tests.
Of the 25 studies that assessed antibody tests, 10 reported diagnostic accuracy in terms of both sensitivity and specificity, almost all using RT-PCR (initial or repeat testing) as the reference standard.11 12 17 18 20–22 24 33 62 Accepting the limitations already discussed around the absence of a diagnostic reference standard, the overall sensitivity reported in these studies varied widely, from 18.4% to 96.1%, although the specificity was more consistent and ranged from 88.9% to 100%. The clinical implications of these data are that considerable uncertainty remains about the implications of a negative antibody test with a significant possibility of false negativity, while the presence of a positive antibody test carries with it a high likelihood of previous COVID-19 infection. There is very limited information available on the accuracy of point-of-care antibody tests.
Our study has some limitations, primarily due to the nature of the evidence found by our searches. The rapid nature of this work (to help inform decision-makers at the outset of the COVID-19 pandemic in the UK) meant some steps in a full systematic review were not completed: there was minimal consultation with decision-makers on the inclusion and exclusion criteria for the review, and we did not publish our protocol in advance of commencing the review. Other limitations relate to the nature of the evidence we found, and that this work was completed during the early stages of the COVID-19 pandemic. The lack of a recognised reference standard meant we considered studies for inclusion that used any appropriate method to verify test results. While initial suspicion of COVID-19 may be based on clinical assessment combined with radiological results, WHO advice is that laboratory-based nucleic acid testing (such as RT-PCR) should be used to confirm cases with further confirmation by nucleic acid sequencing when necessary or feasible.66 The only suitable studies that allowed the diagnostic accuracy of RT-PCR to be assessed compared initial test results with repeated RT-PCR testing in the same individuals: this allowed us to estimate the sensitivity of an initial RT-PCR test, using final (positive) results of the repeated test as the reference standard. Use of this reference standard, which only validates the presence of disease and not its absence, means specificity cannot be determined. We estimated the PPV and NPV of RT-PCR and at different prevalence rates; estimates from PHE were judged to provide the best current evidence for prevalence. However, it should be noted that there are limitations with this approach. Most notably, it is based on the total number of tests rather than the number of people tested. As such, the estimates may underestimate prevalence as many people will have been tested more than once. Crucially, because we could not calculate specificity from the evidence found by our own systematic review, we relied on a previously published estimate of 98.0%. PPV is highly sensitive to this estimate, emphasising the need for further reliable published estimates of the sensitivity of RT-PCR to the interpretation of this test, particularly in low prevalence populations. Furthermore, the evidence included in this pooled analysis and other individual studies we identified used a range of target primers, methods and type of sampling.
We observed similar limitations with evidence on other tests. We found studies reporting diagnostic accuracy of antibody tests and of loop-mediated isothermal amplification (LAMP) as a method of virus detection. However, the reference standard used was RT-PCR (initial and repeat tests), except for one study that used either RT-PCR or clinical diagnosis to determine final disease status. As already concluded, a true assessment of the accuracy of RT-PCR test results is very challenging, and using RT-PCR for validation means the same limitations apply to the results of any antibody or LAMP tests studied in this way. These tests also varied considerably in their conduct and protocols used. These limitations led us to conclude that it was inappropriate to conduct pooled analysis of diagnostic accuracy, meaning limited conclusions about antibody and LAMP tests can be drawn based on the data currently available. Lastly, for all types of test, there are a wide range of different commercially available testing products and kits, as well as some that use protocols developed in-house by academic and public health testing laboratories. Where available, we have detailed the exact test used for each data source (tables 1 and 2, and online supplemental appendix 2), but our evidence synthesis does not take into account similarities or differences between specific test kits or protocols, and the results should be interpreted with this in mind.
Alternative approaches to validating COVID-19 test results could use genomic sequencing, testing for multiple primer targets, confirmatory testing for other respiratory viruses, long-term follow-up, or clinical signs and symptoms. Each would have potential advantages and disadvantages. For example, genomic sequencing would determine the exact strain of virus present and could detect cases of infection where primer target regions were not conserved, resulting in a false-negative RT-PCR result. However, sequencing could only be used to ‘rule in’ the presence of the virus and not to rule it out. Furthermore, it is time and resource intensive and so would be highly unlikely to be undertaken for all samples in routine practice. Future studies might need to use a combination of these factors as a composite reference standard that could validate results from both positive and negative diagnoses, allowing sensitivity and specificity of the test to estimated with greater certainty.
In applying the results of the published studies on testing for COVID-19 to influencing the development of evidence-based testing strategies, it should be noted that the majority of published studies reported on the results of COVID-19 virus or antibody testing were done in a hospital setting and in symptomatic patients with confirmed or suspected COVID-19 infection. Data on testing in other settings are comparatively limited. Only three studies62–64 were identified that used RT-PCR to detect SARS-CoV-2 in the general population in the context of mild influenza-like symptoms while only two studies were found60 61 that reported on the testing of UK healthcare workers. Furthermore, only one study18 was identified that reported on the results of antibody testing outside of a hospital setting. In a rapidly developing pandemic, the widespread use of testing is an essential element in the development of effective public health strategies, but it is important to acknowledge the gaps that exist in the current evidence base and that, where possible, these should be addressed in future studies.
In regard to future research, more data are required to substantiate the effectiveness of tests to detect the presence of SARS-CoV-2 virus or antibodies to SAS-CoV-2 in different populations, and more evidence is needed to compare the effectiveness of laboratory-based testing and point-of-care testing strategies. Further clarity is required about the optimal timing of tests relative to symptom onset. The results of testing in people with no or minimal symptoms in a community-based setting needs further analysis and the impact of these data on public health measures needs to be fully analysed. Evidence should be prospectively collected during the implementation of public health strategies that combine testing with tracing and isolating individuals who have been in contact with COVID-19 sufferers. The UK National Institute for Health and Care Excellence have recently published an evidence standards framework which describes a three-stage approach to collecting the best possible data and evidence in the short and long term which is applicable both to established and developing COVID-19 tests.67 The framework assumes that the tests’ analytical performance is already established, that developers are complying with existing quality systems for manufacturers (ISO 13485) and laboratories (ISO 15198 or 17025), and recommends that a good diagnostic accuracy study should be followed by demonstrating the clinical significance as well as the economic impact of applying the index test.
Data availability statement
Data sharing not applicable as no datasets generated and/or analysed for this study.
Patient consent for publication
Contributors DJ, MP, SM and PG conceived the project and prepared the review inclusion and exclusion criteria. JW prepared and ran search strategies with input from DJ. DJ and LE screened evidence, extracted data from relevant studies and carried out risk of bias assessments; data were independently checked by KC. DJ and MP analysed data to generate outcomes. DJ, LE, JW and PG wrote the manuscript with input and editing from all other authors.
Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.
Competing interests None declared.
Provenance and peer review Not commissioned; externally peer reviewed.