Produced by panels of renowned experts according to formal processes and rules, evidence-based guidelines are considered unbiased and valid, with the same level of certainty as the conventional scientific method.1 However, despite the efforts made to produce reliable guidelines, several concerns about their trustworthiness have recently been raised.2 Although the exact magnitude of this phenomenon is still unknown, it is essential to establish the degree and impact of the unintended and harmful clinical effects triggered by the adoption of flawed guidelines and, moreover, the implications of the significant waste of resources and of the generalised damage to the evidence-based ‘quality mark’. Understanding why and how often guideline errors occur will encourage users to handle clinical guideline recommendations cautiously and will promote the use of different strategies to tackle this challenge successfully.
When is a clinical guideline wrong?
Formulating a judgement on the validity of a guideline is not straightforward, since producing a guideline is a very complex process involving technical skills (searching for primary evidence efficiently), value judgements (rating that evidence) and social aspects (managing discussion and achieving consensus within the guideline panel group).3 Broadly speaking, any guideline failing to offer the right advice should be considered erroneous; conversely, guidelines are correct ‘if, when followed, they lead to the health and cost outcomes projected for them, with other things being equal’.4 However, judging guidelines only once the effects derived from their adoption are known is rarely possible. More often, we consider to what extent ‘the projected health outcomes and costs of alternative courses of action, the relationship between the evidence and recommendations, the substance and quality of the scientific and clinical evidence cited, and the means used to evaluate the evidence’4 are convincing. That is, we measure the reliability of guidelines by assessing the methods followed for producing them (methodological trustworthiness) and/or their content, namely whether primary evidence was correctly searched, evaluated, synthesised and translated into a given recommendation (content trustworthiness).
Epidemiology of untrustworthy guidelines
Irrespective of how we define their reliability, an ‘epidemiology’ of wrong guidelines still needs to be written (see online supplementary file). Interestingly, claims of methodological untrustworthiness have been raised ever since guidelines first appeared. In 2000, only 22 of 431 (5%) guidelines screened by Grilli et al5 fulfilled 3 basic quality criteria, whereas 221 (54%) of them did not meet any quality criterion. Similarly, the mean overall adherence to a more complex quality checklist was 47% among a set of 279 guidelines in another study published in 1999.6 Quality did not subsequently improve, with little or no progress found over the following two decades: in 2012, fewer than half of 130 guidelines met more than 50% of the Institute of Medicine (IOM) standards,2 a finding independently confirmed. Content trustworthiness has not been assessed to the same extent, but substandard results have frequently been reported.
Overall, a conservative estimate is that 50% of current evidence-based guidelines have methodological flaws, questionable content with respect to the primary evidence to which they refer, or documented outcomes diverging from those expected. On average, guidelines sponsored by medical specialty societies were, and continue to be, of lower quality than those endorsed by national health agencies.
Why do errors occur in evidence-based guidelines?
Early consensus-based guidelines considered evidence in a variable and unpredictable way and were particularly at risk of errors, whereas more recent evidence-based guidelines should ensure more balanced and reliable recommendations (figure 1). However, despite the desirable features of these newer guidelines produced since the early 1990s,7 their quality has remained largely unsatisfactory, with the occurrence of one or more of the following factors related to the guideline-making process: (1) limited and unbalanced panel composition, with an excess of specialists and content experts who disproportionately favour new treatments and interventions,8 (2) stacking of panels with experts with (known) prejudices about what was to be evaluated,9 (3) lack of formal consensus management methods within the panel groups, with a prevalence of dysfunctional decision paths, (4) oversimplified, opaque and inconsistent methods for rating evidence and making consistent, clear and useable recommendations, (5) failure to capture the impact of differing patients' values and perspectives, multiple morbidities, and chronicity, (6) publication bias, (7) temporal gap between current ‘best’ evidence and that from out-dated studies, (8) conflict of interests10 and (9) the absence of peer-review procedures like those applied to other papers published in biomedical journals, of pilot testing of draft versions of guidelines by users, and of external review by independent experts.
Additional reasons, usually neglected and extraneous to the guideline-making process, need to be taken into consideration when judging a guideline's reliability. First of all, the elusive nature of evidence is barely acknowledged. Colloquially defined as ‘anything that establishes a fact or gives reason for believing something’,11 ‘evidence’ has in fact different meanings for researchers, clinicians or policymakers. These varying forms of evidence will not combine by themselves to produce health system guidance; combining and interpreting them requires a deliberative process.12 Indeed, guidelines evaluating the same body of evidence have not infrequently produced differing and even conflicting ‘evidence-based’ recommendations.
Furthermore, the quality of evidence has been weakened by the ‘avoidable waste’ of biomedical research: focusing on low priority questions, neglecting to address important outcomes, using inappropriate design and study methods, under-reporting studies with disappointing results, bias, and assessing incomplete and misleading reports of the outcomes.13 Most new research is not produced and interpreted in the context of the existing evidence. The pervasive influence of undue interests,14 disease mongering, overdiagnosis and overtreatment15 underlies the entire research and development process behind the trial system, which is broken according to some16 and which inevitably leads to the corruption of the evidence-based medicine movement itself17 and of its most typical product: the guideline.
Evidence-based guidelines should be considered a valuable support tool for practitioners searching for answers to clinical questions. By using guidelines, practitioners can review the best acknowledged summary of information to date and evaluate whether the selected recommendations are adequate for the specific clinical situation they are facing. However, guideline reliability is largely overstated, and guidelines still suffer from methodological flaws, limited panel composition and conflicts of interests, making their conclusions often untrustworthy. Even when evidence-based methodology is claimed, it is often not fully adopted and the ‘evidence-based quality mark’ gets misappropriated by vested interests.18 The drug industry controls and funds most research, and this big ‘commissioning bias’ explains how so much flawed evidence has been produced, incorporated into guidelines and used to fuel, not to combat, overdiagnosis and overtreatment,19 given the strong influence that the magic term ‘evidence’ has on the prescribing habits of so many physicians. Thus, reliable and trustworthy guideline production is undermined by low-quality biomedical research, whatever the level of accuracy and methodology reached by even the most honest and rigorous guideline producers. The crisis of guideline errors is part of the wider issue the evidence-based medicine movement is facing with the contamination of the scope, ethical integrity and relevance of biomedical research, which should produce only evidence that matters.
A ‘public marketplace’ of evidence-based recommendations where guideline users can gravitate towards the most highly rated and reliable of them has been evoked.20 Unfortunately, such virtual or physical places remain in their infancy to date. Furthermore, no official, publicly accountable, reliable, independent and unconflicted rating agency of published guidelines exists. Therefore, average guideline users are left unsupported when judging the trustworthiness of guidelines. In the next paper, we will address how readers can become more informed about the value of guidelines they are considering and what to do in the case of flawed recommendations.
Twitter Follow Primiano Iannone at @primianoiannone
Contributors PI is responsible for conception and design of the work, stewardship of the discussion among the authors, in-depth analysis of all of the literature considered and of the contributions of the authors, and drafting of the manuscript. NM, MM and AC are responsible for substantial contributions to the acquisition and analysis of the relevant literature, interpretation of data and critical appraisal of the literature; revision of the manuscript for intellectual content; and final approval of the version submitted. JD and PC are responsible for substantial contributions to the analysis and interpretation of the relevant literature considered, and focused assessment of factors influencing guideline trustworthiness; critical revision of the manuscript for important intellectual content; and final approval of the version submitted. All of the authors agree to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
Competing interests None declared.
Provenance and peer review Not commissioned; internally peer reviewed.