Catalogue of bias: observer bias
Kamal Mahtani1, Elizabeth A Spencer1, Jon Brassey2, Carl Heneghan1
1 Nuffield Department of Primary Care Health Sciences, University of Oxford, Oxford, UK
2 Trip Database Ltd
Correspondence to Dr Elizabeth A Spencer, Nuffield Department of Primary Care Health Sciences, University of Oxford, Oxford OX2 6GG, UK; elizabeth.spencer{at}

Background to observer bias

Observer bias is any kind of systematic discrepancy from the truth during the process of observing and recording information for a study. Observer bias is a type of detection bias and can affect assessment in many kinds of study including observational studies and intervention studies such as randomised trials. Observer bias also relates to other biases, including experimenter bias and ascertainment bias, which will be explored in further articles.

The Dictionary of Epidemiology edited by Porta1 gives the following definition: ‘Systematic difference between a true value and the value actually observed due to observer variation’ and continues on to describe observer variation.

Many healthcare observations are open to systematic variation. For example, medical images reviewed by eye might lead one observer to tend to record an abnormality but another to tend to record no abnormality. Different observers might tend to round up or round down from a measurement scale. Colour change tests can be interpreted differently by different observers. Where subjective judgement is integral to the observation, such as recording behaviour among free-living individuals, there is great potential for variability between observers, and some of these differences might be systematic (ie, comprise bias). Observation of entirely objective data, such as number of deaths, is at much lower risk of observer bias.

Biases in recording objective data may result from poor training in the use of measurement devices or data sources, or unchecked bad habits. In recording subjective data, predispositions of the observer are likely to underpin observer biases.2 Observers might be somewhat conscious of their own biases in relation to a study or may be unaware of factors influencing their decisions when recording study information.

Randomised controlled trials (RCTs) are designed to provide the fairest test of an intervention. However, when any part of the data collection process involves observers, observer bias can affect these studies. The James Lind Library, an online resource of information about fair tests of treatments in healthcare, states: ‘Biased treatment outcome assessment can result if people receiving or providing care, or others assessing treatment outcomes, know which participants have received which treatments. It is sometimes possible to conceal which treatments have been received by using placebos and in other ways’.3 Methods for minimising observer bias are discussed below (see section ‘Preventive steps’).


Observer bias has been repeatedly documented in studies of blood pressure. Clinicians measuring participants’ blood pressure using analogue mercury sphygmomanometers have been found to round readings up or down to the nearest whole number.4 Observer bias may also occur if the researcher has a preconceived idea of what the blood pressure ought to be, leading to arbitrary adjustments of the readings.5
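One way such rounding habits might be looked for in recorded data is to tabulate the final digit of each reading. The sketch below is illustrative only: the readings are hypothetical, the function names are invented for this example, and the comparison against equal counts per digit is an assumption about how rounding would show up rather than a method taken from the cited studies.

```python
from collections import Counter

def terminal_digit_counts(readings):
    """Count how often each final digit (0-9) occurs in whole-number readings."""
    counts = Counter(int(r) % 10 for r in readings)
    return [counts.get(d, 0) for d in range(10)]

def chi_square_uniform(counts):
    """Pearson chi-square statistic against equal expected counts per digit."""
    total = sum(counts)
    expected = total / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)

# Hypothetical systolic readings (mmHg) from a single observer.
readings = [120, 130, 140, 120, 110, 150, 130, 125, 120, 140,
            130, 118, 120, 160, 130, 140, 120, 135, 130, 120]

counts = terminal_digit_counts(readings)
share_zero = counts[0] / len(readings)   # proportion of readings ending in 0
stat = chi_square_uniform(counts)        # large values suggest digit preference

print(f"Readings ending in 0: {share_zero:.0%}")
print(f"Chi-square statistic: {stat:.1f}")
```

With only 20 readings the expected count per digit is too small for a formal test, so a real check would need many more measurements per observer. The point of the sketch is simply that an excess of one final digit is visible in the data once it is counted.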

Hróbjartsson and colleagues produced three systematic reviews estimating the size of the impact of observer bias by comparing estimates from studies in which outcome assessors were blinded to the intervention with those in which outcome assessors were not blinded to the intervention. Three types of RCTs were investigated: those with binary outcomes2; RCTs with measurement scale outcomes6 and RCTs with time-to-event outcomes.7 The included studies investigated interventions for a range of conditions from angina to wound treatment to psychiatric disorders. For RCTs with binary outcomes, non-blinded outcome assessors generated odds ratios that, on average, were exaggerated by 36%.2 For clinical trials that used measurement scale outcomes, non-blinded outcome assessment exaggerated the pooled effect size by 68%.6 For RCTs using time-to-event outcomes, non-blinded assessment exaggerated the hazard ratio by approximately 27%.7
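To make the arithmetic behind a figure such as ‘exaggerated by 36%’ concrete, the sketch below compares an odds ratio from non-blinded assessment with one from blinded assessment of the same comparison. It assumes that exaggeration is expressed through a ratio of odds ratios and that odds ratios below 1 favour the intervention; the trial counts and function names are hypothetical and are not data from the reviews.

```python
def odds_ratio(events_treat, n_treat, events_ctrl, n_ctrl):
    """Odds ratio for an unfavourable event, treatment versus control."""
    odds_treat = events_treat / (n_treat - events_treat)
    odds_ctrl = events_ctrl / (n_ctrl - events_ctrl)
    return odds_treat / odds_ctrl

def exaggeration_percent(or_nonblinded, or_blinded):
    """Percentage by which non-blinded assessment exaggerates the effect.

    Assumes odds ratios below 1 favour the intervention, so a ratio of
    odds ratios below 1 means the non-blinded estimate is more optimistic.
    """
    ratio_of_odds_ratios = or_nonblinded / or_blinded
    return (1 - ratio_of_odds_ratios) * 100

# Hypothetical counts: the same comparison assessed with and without blinding.
or_blinded = odds_ratio(40, 100, 50, 100)      # 40/60 divided by 50/50
or_nonblinded = odds_ratio(30, 100, 50, 100)   # 30/70 divided by 50/50

exaggeration = exaggeration_percent(or_nonblinded, or_blinded)
print(f"Blinded OR: {or_blinded:.2f}, non-blinded OR: {or_nonblinded:.2f}")
print(f"Exaggeration: {exaggeration:.1f}%")
```

In this made-up example the non-blinded assessors record fewer unfavourable events in the treatment group, which moves the odds ratio further from 1 and yields an exaggeration of roughly 36%.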

Preventive steps

A key method is to ensure that outcome assessors are masked to the exposure status of study participants. This can apply to intervention studies such as RCTs, in which an individual has been allocated a particular intervention, and also to observational studies, which track the progress of study participants with different exposures. Achieving this masking might mean separating access for data on exposures from data on outcomes; in a blinded RCT, the allocation should remain unknown throughout the study (unless it must be revealed for safety reasons). The use of matched placebos in RCTs aims to remove bias from knowing which study participant is allocated to which intervention.

To complement this, strategies can include adequate training for observers in how to record their findings, identifying any potential conflicts of interest among observers before recordings commence and clearly defining the methods, tools and time frames for collecting their findings.

Another aspect of training can include trying to help study observers to become aware of their prejudices and habits in order to improve accuracy of the data. In the area of blood pressure measurement, Bruce and colleagues investigated the pattern of observer bias, and how well the effects of training procedures designed to reduce observer bias lasted over a period of months.8 Their study used sphygmomanometers (inflatable cuff and pressure reading devices) for blood pressure measurement, which depend on the observer to listen for sounds and decide at which moment to record systolic and diastolic pressures. There is evident potential for observer variation in recording blood pressure and for systematic variation leading to bias. Over the 30 months of this study, nurses were trained to measure blood pressure: prior to the study, and again after 8, 21 and 28 months. The training used recordings of various examples of blood flow sounds in order to practise identifying the correct point to record systolic and diastolic pressures. The study showed that individual nurses had specific tendencies (biases) in either under-reporting or over-reporting blood pressure; training did somewhat reduce the between-nurse variation, but differences did remain and those tendencies resisted alteration by the training at various time points. The authors pointed out that biases could arise from the nurses’ interpretation of the blood pressure device, but also from interactions with the study participants that may actually have influenced blood pressure. This illustrates some of the difficulties in dealing with observer bias.
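The distinction between an individual observer's systematic tendency and the variation between observers can be illustrated with a small calculation. The sketch below is not the analysis used by Bruce and colleagues: it assumes each nurse records the same set of test cases with known reference values, and the nurse labels, readings and function name are all hypothetical.

```python
from statistics import mean, stdev

# Hypothetical reference systolic values (mmHg) for five test recordings.
reference = [120, 135, 150, 110, 128]

# Hypothetical values recorded by three nurses for the same five recordings.
recorded = {
    "nurse_a": [124, 138, 154, 113, 131],
    "nurse_b": [118, 133, 147, 109, 126],
    "nurse_c": [121, 134, 151, 109, 128],
}

def mean_offset(values, reference_values):
    """Average signed difference from the reference: positive means over-reporting."""
    return mean(v - r for v, r in zip(values, reference_values))

# Each nurse's systematic tendency (bias) relative to the reference.
offsets = {name: mean_offset(vals, reference) for name, vals in recorded.items()}

# Spread of those tendencies across nurses (between-observer variation).
between_observer_sd = stdev(list(offsets.values()))

for name, offset in offsets.items():
    print(f"{name}: mean offset {offset:+.1f} mmHg")
print(f"Between-observer SD of offsets: {between_observer_sd:.1f} mmHg")
```

Here one nurse tends to over-report, one tends to under-report and one shows no average offset. Repeating the calculation after each training session would show whether the offsets shrink and whether each nurse's direction of bias persists.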

As this study shows, while observer bias can be reduced, it is likely that some observer bias will always remain, and researchers should be aware of this when analysing and evaluating data.


Observer bias is an important consideration in any study where observers are required to record outcomes or exposure data. Logically, this is a particular hazard in studies recording subjective factors for which the observer must use individual judgement to decide what to record. Even in RCTs, the fairest tests of treatments, observer bias can affect the results.




  • Contributors All authors contributed to literature searches, discussions and the writing of this article.

  • Competing interests None declared.

  • Provenance and peer review Commissioned; internally peer reviewed.