EBM round-up: February 2011
Richard Saitz
Boston Medical Center, Boston University Schools of Medicine and Public Health, Boston, Massachusetts, USA
Correspondence to Richard Saitz, Section of General Internal Medicine, Boston Medical Center, Boston University Schools of Medicine and Public Health, 801 Massachusetts Avenue, 2nd floor (Crosstown 2), Boston, MA 02118–2335, USA; rsaitz{at}

Noun (\ˈraund-ˌəp\) 1) the act or process of collecting animals by riding around them and driving them in; 2) a summary of information.

Transitive verb 1) to collect (as cattle) by means of a roundup; 2) to gather in or bring together from various quarters.1

In this occasional feature we gather and summarise articles relevant to the practice, research and teaching of evidence-based medicine (EBM) that have been published elsewhere in the peer-reviewed medical literature. The primary criterion for selection is relevance to EBM. Articles are chosen by scanning the tables of contents of selected journals for reports of potential relevance. The round-up writer (an EBM editor) comments on each summary.

Trials stopped early for benefit overestimate treatment effects

Commentary on:

Data safety monitoring committees are often charged with stopping controlled studies when data suggest a benefit is very likely. Such committees often base decisions on statistical significance using calculations designed specifically for this purpose. But the results of studies stopped early might not be what they would have been had the trial continued. To determine whether treatment effects differed, investigators systematically reviewed the literature, comparing results of treatment studies stopped earlier than initially planned because interim results favoured the intervention with results of studies in meta-analyses that addressed the same question and were not truncated.

Investigators included 91 truncated randomised controlled trials (RCTs) and 424 non-truncated trials of the same question identified in systematic reviews addressing 63 questions. They calculated the ratio of relative risks (RRs), that is, the RR of a bad outcome in the treatment group compared with control in truncated trials divided by the same RR in non-truncated trials. The ratio was <1 in 55 of 63 comparisons, and the (weighted) average of the ratios was 0.71 (95% CI 0.65 to 0.77). Truncated RCTs were more likely than non-truncated RCTs to be published in high-impact journals (68% vs 30%). Differences between truncated and non-truncated study results were larger when truncated studies had fewer than 500 outcome events.
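The ratio of RRs described above can be illustrated with a small calculation. This is only a sketch: the event counts are hypothetical, the function names are invented for this example, and the review's weighting across questions is not reproduced.

```python
def relative_risk(events_treated, n_treated, events_control, n_control):
    """Risk of the bad outcome in the treatment group divided by the risk in control."""
    return (events_treated / n_treated) / (events_control / n_control)


def ratio_of_relative_risks(rr_truncated, rr_non_truncated):
    """RR from truncated trials divided by RR from non-truncated trials.

    A value below 1 means the truncated trials show the larger apparent benefit.
    """
    return rr_truncated / rr_non_truncated


# Hypothetical counts, chosen only for illustration (not data from the review).
rr_truncated = relative_risk(20, 200, 40, 200)          # risks 0.10 vs 0.20
rr_non_truncated = relative_risk(80, 1000, 100, 1000)   # risks 0.08 vs 0.10

ratio = ratio_of_relative_risks(rr_truncated, rr_non_truncated)
print(f"RR truncated: {rr_truncated:.2f}")
print(f"RR non-truncated: {rr_non_truncated:.2f}")
print(f"Ratio of RRs: {ratio:.3f}")
```

In this made-up case the truncated trial halves the risk while the non-truncated trial reduces it by a fifth, so the ratio falls below 1, the pattern seen in most comparisons in the review.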


Whether studies missed by this systematic review would change the conclusions is not clear. But it does appear that studies stopped early overestimate treatment effects. That is important information for clinicians. The authors suggest that, in addition to statistical significance, researchers may want to take the number of events into account when deciding whether to stop a trial. Ethicists will no doubt need to wrestle with the implications.

Individual participant data meta-analysis: more powerful, but more difficult, if not sometimes impossible to do

Commentary on:

Usually, meta-analyses examine aggregate data, or effects averaged across individuals in studies. However, analytic combination of individual participant data – the data recorded for each subject in a study – can have some advantages.

One of the main advantages is simply the ability to obtain all of the aggregate statistics necessary for meta-analysis, which are often not included in original single study publications. However, the approach is resource intensive: it requires contacting original study authors for data, time from those investigators, and complex statistical analyses. Bias can creep in if original data are not available. Furthermore, the individual data approach does not fix problems with the original studies.


The authors provide another innovative idea – prospectively planned meta-analyses, where investigators achieve consistency in interventions, outcomes, variable definition and data collection. In general, however, although individual participant meta-analyses can have advantages, aggregate meta-analyses can provide valid answers much of the time when reporting of individual studies is of high quality.

Selective reporting of outcomes in randomised trials can bias conclusions of systematic reviews

Commentary on:

Systematic reviews are generally viewed as the highest level of evidence (aside from n-of-1 trials). But inadequate reporting could bias results. Investigators in England examined Cochrane Collaboration systematic reviews to determine whether outcomes of interest were reported fully, partially or not at all in the included trials, after interviews with trial investigators and review of original articles.

Of 283 reviews, 55% did not include primary outcome data from all trials. Of 712 individual trials in these reviews, 359 may have had reporting bias: they stated that the primary outcome results were not significant (without providing actual numerical data), they stated that the outcome was analysed but did not display results, or the outcome was measured or likely measured but not reported and was likely not significant. Among the 42 systematic reviews that reported statistically significant results, 8 became non-significant after adjustment for outcome reporting bias, and 11 overestimated the treatment effect by 20% or more.


Concerns about inadequate reporting are very serious. Such reporting can be biased, due to non-significance or conflict of interest. The message for trialists is to report prespecified outcomes. The message for meta-analysts is to include studies even if they don't report primary outcomes, and to seek unreported results from authors. Clinical trial registries can help assure complete reporting, and journal editors can also help by requesting original study protocols. But we will all need to use caution even in interpreting systematic reviews. Unfortunately, and unsatisfyingly, incomplete reporting is likely to impair conclusions from systematic reviews (no more than it impairs conclusions from the original trials, but importantly nonetheless) for the foreseeable future.

Prognosis research: standards need to be improved

Commentary on:

Not all important questions can be answered by randomised trials. Prognosis questions are among those that require other methods, and results, like those of clinical trials, can be summarised in systematic reviews. But as with systematic reviews of trials, the validity of such synthesis research depends on the quality of individual studies.

Several reports in the literature find that not all systematic reviews of prognostic markers address the quality of primary studies, and many, despite including large numbers of studies, are inconclusive. For example, after 168 reports including over 10 000 subjects, evidence was insufficient to determine the effect of a bladder cancer prognostic factor. In response, investigators have recommended 10 steps towards improving prognosis research:

  1. Clarify the goals and objectives of prognostic studies

  2. Give priority to prognosis studies of adequate sample size, to registries and to meta-analyses of individual participant data

  3. Publish study protocols (similar to those required for randomised trials)

  4. Clarify the strength of evidence required for a marker to be useful clinically and study a wide range of markers including those from routinely available data

  5. Define primary outcomes and include and clearly define patient-reported outcomes

  6. In general, improve the design, conduct and reporting of prognostic studies

  7. Identify publication bias, prevent it, encourage study registration, include appropriate sample sizes

  8. Develop and adhere to reporting standards

  9. Improve identification, methods and reporting of systematic reviews of prognostic studies

  10. Study the effectiveness and cost-effectiveness of prognosis research results for improving clinical decisions and patient outcomes


Prognosis research is important for individual prediction and for identifying candidates for intervention studies. Prognosis research appears to lag behind intervention and even diagnostic research in quality, in terms of both validity and applicability. The same level of attention that has been given to intervention studies should be given to prognosis research.

Random measurement error can lead to missing true associations, a methodological error that has been underappreciated

Commentary on:

Many research methods attend to minimising systematic errors. Such errors are consistently wrong in one direction. Researchers pay less attention to random measurement error, which averages out to zero. Although such error seems to cancel out, experts point out that it can introduce important bias.

Results of tests of association can be biased towards the null (no effect) when there is random error in the exposure variable. Random error in the outcome variable can decrease precision, making it less likely that the result will be statistically significant.
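A small simulation can make the two effects concrete. This is a minimal sketch under assumed conditions that do not come from the commentary: a linear association with a true slope of 1, normally distributed exposure, and random measurement error with the same variance as the exposure. All names are invented for the example.

```python
import random


def ols_slope(x, y):
    """Ordinary least squares slope of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20000
true_slope = 1.0  # assumed true association

# True exposure and an outcome that depends on it.
exposure = [rng.gauss(0, 1) for _ in range(n)]
outcome = [true_slope * x + rng.gauss(0, 1) for x in exposure]

# Add purely random (mean-zero) measurement error to each variable in turn.
noisy_exposure = [x + rng.gauss(0, 1) for x in exposure]
noisy_outcome = [y + rng.gauss(0, 1) for y in outcome]

slope_clean = ols_slope(exposure, outcome)
slope_noisy_exposure = ols_slope(noisy_exposure, outcome)
slope_noisy_outcome = ols_slope(exposure, noisy_outcome)

print(f"No measurement error:     {slope_clean:.2f}")
print(f"Random error in exposure: {slope_noisy_exposure:.2f}")
print(f"Random error in outcome:  {slope_noisy_outcome:.2f}")
```

Under these assumptions the slope estimated from the noisy exposure is pulled towards zero, by a factor of about var(exposure) / (var(exposure) + var(error)) = 0.5, and a larger sample does not remove that bias. Error in the outcome leaves the slope centred on the true value but adds scatter, so the estimate is less precise.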


One way to address random error in outcome variables is to increase the sample size and the number of measurements. But when the error is in the exposure variable, as the authors point out, “increasing the sample size will only make the estimates more precisely wrong.” The real solution lies in the design phase of the study, when researchers should choose instruments capable of precise measurement or obtain frequent measurements.

Adjusting randomised trial results for when subjects do not follow the study treatment protocol

Commentary on:

Subjects in randomised trials may receive treatments to which they were not assigned, or may not receive treatments to which they were assigned. Analysing the results according to the treatment subjects actually receive, or omitting those who do not follow the protocol, removes the benefit of randomisation and can introduce error. Analysing according to assigned treatment can underestimate the value of the treatment, though it accurately estimates the effect of being assigned to a treatment.

Investigators propose a solution: the contamination-adjusted intention-to-treat analysis (CA-ITT). In CA-ITT, an instrumental variable is used to adjust for receipt of treatment. An instrumental variable, in this case the randomised assignment, is associated with treatment receipt but affects the outcome only through the treatment received. The CA-ITT is an ITT analysis adjusted for the percentage of subjects assigned to treatment who actually receive it.
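The adjustment can be sketched with the simplest instrumental variable estimator, which divides the ITT effect by the difference between arms in the proportion who received treatment. This is an illustration under stated assumptions, not necessarily the authors' exact procedure: the outcome is expressed as a risk difference, the counts are hypothetical, and the function name is invented for this example.

```python
def contamination_adjusted_itt(events_tx_arm, n_tx_arm,
                               events_ctl_arm, n_ctl_arm,
                               received_tx_arm, received_ctl_arm):
    """Return (ITT risk difference, contamination-adjusted risk difference).

    Subjects are analysed by assigned arm. The adjustment divides the ITT
    effect by the difference in the proportion actually receiving treatment.
    """
    itt = events_tx_arm / n_tx_arm - events_ctl_arm / n_ctl_arm
    uptake_difference = received_tx_arm / n_tx_arm - received_ctl_arm / n_ctl_arm
    if uptake_difference == 0:
        raise ValueError("assignment did not change treatment receipt")
    return itt, itt / uptake_difference


# Hypothetical trial with 1000 subjects per arm (illustration only):
# 80 vs 100 bad outcomes; 700 in the treatment arm and 100 in the
# control arm actually received the treatment.
itt, adjusted = contamination_adjusted_itt(80, 1000, 100, 1000, 700, 100)
print(f"ITT risk difference:      {itt:.3f}")
print(f"CA-ITT risk difference:   {adjusted:.3f}")
```

In this made-up example the ITT analysis shows a 2 percentage point reduction in risk. Because assignment changed treatment receipt by only 60 percentage points, the adjusted estimate is larger in magnitude, at about 3.3 percentage points.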


We often simply want to know the results of a study: did the treatment work or not, and how well did it work? A study in which all subjects take their assigned treatment can give us a simple answer. But results will usually be more complicated and not so simply summarised. If some subjects do not take their assigned treatment, then the result of an ITT analysis remains valuable, but an ITT analysis adjusted for the proportion who receive treatment will likely better approximate the effects one could expect in individual patients who take it.

What is in the placebo, and should we care?

Commentary on:

Because the content of a placebo could affect outcomes and interpretations of randomised trials, researchers reviewed study reports to determine how often the content was specified. They reviewed randomised, placebo-controlled trials published in the New England Journal of Medicine, JAMA, The Lancet and Annals of Internal Medicine during the 2 years ending in 2009; 86 studies were of pills, 65 of injections and 25 of other methods. The researchers excluded studies that cited prior methods or results papers (though when these were included in a secondary analysis, results were similar).

Most studies did not report the content of the placebo. Fewer pill studies (8%) disclosed placebo contents than injection (26%) studies or studies of other methods (eg, inhalers) (27%).


From this study, we don't know how serious the problem is – perhaps unreported details wouldn't have influenced study results. But placebos can go wrong in a number of ways, as the authors point out. First, they may contain an ingredient that is active for the condition under study (eg, olive oil in a study of a statin drug). Second, they may contain an ingredient that causes symptoms itself (eg, lactose, causing lactose intolerance symptoms, in a study of a medication being tested for its ability to reduce gastrointestinal symptoms). Third, a distinctive flavour, smell or texture could alert subjects to the fact that the placebo is not the active medication or is at least different, which could eliminate blinding and introduce bias. The authors recommend that both the active agent and the placebo be described in detail in reports of controlled trials, and that trial reporting guidelines be modified to include this recommendation.

