Grading the quality of evidence in complex interventions: a guide for evidence-based practitioners
  1. M Hassan Murad,
  2. Jehad Almasri,
  3. Mouaz Alsawas,
  4. Wigdan Farah
  1. Evidence-based Practice Center, Mayo Clinic, Rochester, Minnesota, USA
  1. Correspondence to: Dr M Hassan Murad, Evidence-based Practice Center, Mayo Clinic, 200 First Street SW, Rochester, MN 55905, USA; murad.mohammad{at}


Evidence-based practitioners who want to apply evidence from complex interventions to the care of their patients are often challenged by the difficulty of grading the quality of this evidence. Using the GRADE (Grading of Recommendations, Assessment, Development and Evaluation) approach and an illustrative example, we propose a framework for evaluating the quality of evidence that depends on obtaining feedback from the evidence user (eg, guideline panel) to inform: (1) proper framing of the question, (2) judgements about directness and consistency of evidence and (3) the need for additional contextual and qualitative evidence. Using this framework, different evidence users and based on their needs would consider the same evidence as high, moderate, low or very low.

Complex interventions in medicine are defined as interventions that contain several interacting components within the experimental and control interventions, multiple behaviours required by those delivering or receiving the intervention, multiple groups or organisational levels targeted by the intervention, and that permit variable flexibility and tailoring of the intervention.1 An example is care management programmes for type 2 diabetes. In one systematic review, programmes varied across studies in terms of team composition (physician, nurse, case manager, diabetes educator), delivery method (face-to-face, telephone, online, outpatient, inpatient), intensity and frequency of the intervention, and other factors.2 Complex interventions are increasingly used in studies of behavioural change, psychotherapy, education, public health, health services, quality improvement, social policy and many other fields.

The challenge

Just as with any other intervention, evidence-based practitioners who want to apply evidence from complex interventions to the care of their patients need to select the evidence (hopefully through a systematic review process) and appraise it (ie, identify the extent to which this evidence is trustworthy, also called rating the quality of evidence). The GRADE (Grading of Recommendations, Assessment, Development and Evaluation) approach is a modern method for rating the quality of evidence with good transparency and reliability.3 A study of Cochrane systematic reviews showed that the outcomes of complex interventions were more likely to be rated as ‘very low’ quality of evidence compared with those of simple interventions (37.5% vs 9.1%). None of the outcomes of complex intervention reviews were rated as ‘high’.4 We believe these low ratings are inconsistent with the definition of quality of evidence in GRADE, which is a construct that reflects the trustworthiness of evidence, and we attribute this phenomenon to improper framing of the clinical question for which the quality of evidence is being rated.5


Complex interventions are inherently heterogeneous; thus, we may be tempted to lower the quality of evidence for heterogeneity (inconsistency across studies). Complex interventions are also likely to have components that differ from the ones we can or want to implement in our setting; thus, we may be tempted to lower the quality of evidence for indirectness (lack of applicability). By rating down for both heterogeneity and indirectness, most complex intervention studies will turn out to provide low-quality evidence, which is frustrating to evidence users.4

Proposed approach

We present a framework for evaluating the quality of evidence using GRADE. The framework hinges on obtaining feedback from the end user of evidence (eg, a guideline panel) during the process of evidence synthesis (eg, conducting a systematic review to support a guideline). This feedback seeks to inform: (1) proper framing of the question, (2) judgements about directness and consistency of evidence (the two domains that are highly relevant to complex interventions) and (3) the need for additional contextual and qualitative evidence to provide information about the circumstances under which the intervention works best (figure 1). This qualitative evidence can explain barriers and facilitators of implementation, as well as the cultural and social factors that can modify the effect of the intervention.
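The rating logic the framework implies can be sketched in code: start from the level implied by study design, rate down one level per concerning GRADE domain, but let the evidence user's stated perspective determine whether inconsistency or indirectness actually matter for their question. This sketch is purely illustrative; the function and parameter names are our own and are not part of GRADE itself.

```python
# Illustrative sketch (hypothetical names, not part of GRADE): how feedback
# from the evidence user can change judgements about two domains.

GRADE_LEVELS = ["very low", "low", "moderate", "high"]

def rate_quality(start_level, concerning_domains,
                 user_accepts_heterogeneity=False,
                 user_accepts_indirectness=False):
    """Rate down one level per concerning domain, unless the evidence
    user's perspective makes heterogeneity or indirectness acceptable
    for the question as they have framed it."""
    level = GRADE_LEVELS.index(start_level)
    for domain in concerning_domains:
        if domain == "inconsistency" and user_accepts_heterogeneity:
            continue  # eg, a payer interested in any effective programme
        if domain == "indirectness" and user_accepts_indirectness:
            continue  # eg, the user expects to tailor components anyway
        level = max(level - 1, 0)
    return GRADE_LEVELS[level]
```

For example, the same body of randomised evidence with concerns about inconsistency and indirectness would be rated `rate_quality("high", ["inconsistency", "indirectness"])` → "low" from a clinician's perspective, but "high" from a policymaker's perspective once both domains are judged acceptable (`user_accepts_heterogeneity=True, user_accepts_indirectness=True`).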

Figure 1

The traditional process of rating the quality of evidence is in white colour. The additional proposed steps that address complex interventions are in grey colour.

Although engaging patients and evidence users in the process of evidence synthesis is not new6 and is routinely carried out in some systematic reviews (eg, stakeholder engagement is mandated in reviews conducted by Evidence-based Practice Centers funded by the Agency for Healthcare Research and Quality),6 the role of this engagement has been to develop the scope and key questions. Here, we advocate using their feedback to inform rating the quality of evidence of complex interventions. Finally, evidence synthesis methods other than traditional quantitative meta-analysis will likely be needed to address long causal chains in complex interventions, such as model-driven and realist systematic reviews and meta-narrative evidence synthesis.


The framework is applied through an illustrative example that demonstrates how the perspective of the evidence user affects rating of quality of evidence and shows how the same evidence can warrant different levels of trustworthiness for different users (table 1). For example, policymakers or payers (health insurance carriers) may not be concerned about clinical and statistical heterogeneity because they are interested in any management programme that improves diabetes control. They realise that different health systems under their jurisdiction may develop different programmes with variable components based on each system's setting and resources (some delivered face-to-face, some delivered remotely, some led by nurses, etc). Therefore, from a policymaker perspective, there may be little interest in knowing more about the individual components of the intervention. This differs substantially from the patient and clinician perspectives, because they may be interested in choosing only the most effective components of diabetes management that can fit the patient's capacity, schedule and daily routine. The third perspective presented in the table is that of a diabetes educator responsible for implementing the programme, which requires knowing many more details about implementation and contextual factors. Adopting this perspective leads to considering the evidence of effectiveness to be insufficient for the question at hand. Two qualitative studies7 8 were identified and provided some of the needed details. The studies shed light on the psychological factors that should be addressed when delivering diabetes management programmes and provided insight on the factors that affect adherence to such programmes (table 1). With these contextual factors known, the persons providing the programme will have increased confidence in their ability to deliver the programme, and they may rate the quality of the evidence as high.

Table 1

Care management interventions for type 2 diabetes


We believe this approach will hopefully lead to more appropriate and more consistent judgements about studies of complex interventions and better application of GRADE. This approach, however, will not solve other challenges of complex interventions. Examples include lack of blinding of outcome assessors and inadequate allocation concealment (both of which are feasible even in unblinded, open studies), and poor reporting of the details of the intervention (which makes it hard to replicate in practice). These challenges can be solved only at the primary research level, not at the level of evidence synthesis, appraisal or application.



  • Contributors MHM conceived the idea and drafted the manuscript. JA, WF and MA critically revised the manuscript. All authors approved the submitted version.

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; internally peer reviewed.
