In October 2001, the Nordic Cochrane Centre published a Cochrane review of mammography screening, which questioned whether screening reduces breast cancer mortality.1 Within the same month, the Centre published a more comprehensive review in Lancet that also reported on the harms of screening and found considerable overdiagnosis and overtreatment (a 30% increase in the number of mastectomies and tumourectomies).2 This resulted in a heated debate, which is still ongoing.3 The Cochrane review was updated in 2006,4 to include overdiagnosis, and again in 2009.5
Recently, several studies have questioned whether screening is as beneficial as originally claimed,6–8 and confirmed that overdiagnosis is a major harm of breast cancer screening.9–11 The US Preventive Services Task Force published updated screening recommendations in November 2009 and asserted that the benefit is smaller than previously thought and that the harms include overdiagnosis and overtreatment, but it did not quantify these harms.12 The task force changed its previous recommendations and now recommends that women aged 40–49 years discuss with their physician whether breast screening is right for them, and it further recommends biennial screening instead of annual screening for all age groups.12 These recommendations were repeated in the 2011 Canadian guidelines for breast screening.13
Screening is likely to miss aggressive cancers because they grow fast, leaving little time to detect them in their preclinical phases.6 Further, the basic assumption that finding and treating early-stage disease will prevent late stage or metastatic disease may not be correct, as breast cancer screening has not reduced the occurrence of large breast cancers14 or late-stage breast cancers,11 despite the large and sustained increases in early invasive cancers and ductal carcinoma in situ with screening.
A systematic review from 2009 showed that the rate of overdiagnosis in organised breast screening programmes was 52%, which means that one in three cancers diagnosed in a screened population is overdiagnosed.9 It is quite likely that many screen-detected cancers would have regressed spontaneously in the absence of screening.15 ,16
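The step from a 52% rate of overdiagnosis to "one in three" can be made explicit. The sketch below assumes that the 52% is expressed relative to the number of cancers expected without screening, so that 100 expected cancers become 152 diagnosed cancers in a screened population. That denominator is our reading of the sentence above, and the function name is purely illustrative.

```python
from fractions import Fraction


def overdiagnosed_share(excess_percent):
    """Share of all diagnosed cancers that are overdiagnosed.

    Assumption: excess_percent is the excess over the cancers expected
    without screening (set to 100), so diagnosed = 100 + excess_percent.
    """
    expected_without_screening = 100
    diagnosed_with_screening = expected_without_screening + excess_percent
    return Fraction(excess_percent, diagnosed_with_screening)


share = overdiagnosed_share(52)
# 52 overdiagnosed cancers out of 152 diagnosed, that is roughly one in three
print(share, round(float(share), 3))
```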
We explored how the first comprehensive systematic review on mammography screening ever performed, the one from 2001 published in Lancet,2 and the subsequent systematic Cochrane reviews from 20064 and 20095 have been cited from 2001 to April 2012. We investigated whether there were differences between general medical journals and specialty journals regarding which results were mentioned and how overdiagnosis, overtreatment, breast cancer mortality, total mortality, and the methods of the reviews were described. Vested interests on behalf of both journals and contributing authors may be more pronounced in specialty journals, and this may influence views on specific interventions, such as mammography screening.
We searched for articles quoting one of the three versions of the review2 ,4 ,5 (date of last search 20 April 2012). We used the ‘source titles function’ in the Institute for Scientific Information (ISI) Web of Knowledge to count the number of times each review had been cited in individual journals. We only included journals in which four or more articles had cited one of the three versions of the review. This criterion led to the exclusion of specialty journals of little relevance for our study, for example, Nephrology and Research in Gerontological Nursing. Articles written by authors affiliated with the Nordic Cochrane Centre were also excluded.
We could not include the 2001 Cochrane review1 because it was not indexed by the ISI Web of Knowledge. Furthermore, even if it had been indexed, we would have excluded it. This version of the review1 is not comparable to the other three versions,2–5 as the editors of the Cochrane Breast Cancer Group had refused to publish the data on overdiagnosis and overtreatment.
A journal was classified as a general medical journal if it did not preferentially publish papers from a particular medical specialty. A journal was classified as a specialty journal if it preferentially published articles from a particular medical specialty or topic.
When we rated how the papers cited the review, we looked for statements applicable to the following categories:
Overdiagnosis
Overtreatment
Breast cancer mortality
Total mortality
Methods used in the review
We rated the quoting articles’ general opinions about the results and methods of the review using the labels accept, neutral, reject, unclear or not applicable, defined as follows:
Accept: the authors explicitly agreed with the results or methods, or quoted the numerical results without comments.
Neutral: the results or methods were mentioned and the author presented arguments both for and against them.
Reject: the authors explicitly stated that the results or methods were flawed, wrong, or false, or only presented arguments against them. Only reporting a result from a favourable subgroup analysis was also classified as rejected.
Unclear: the results or methods were mentioned, but it was not possible to tell if the authors agreed with them or not, or the results were only mentioned qualitatively. If several conflicting opinions were presented, it would also be classified as unclear.
Not applicable: the review was quoted for something other than its results or methods.
The articles quoting the review were assessed in relation to the five categories (overdiagnosis, overtreatment, breast cancer mortality, total mortality and methods) separately, and no overall assessment of the articles’ general opinion about the review was made.
Texts classified as not applicable regarding any of the five categories were reread to determine and note which topics were discussed.
Two researchers (KR, Andreas Brønden Petersen, see Acknowledgements) assessed the text independently. Disagreements were settled by discussion.
In order to ensure blinded data extraction, an assistant (Mads Clausen, see Acknowledgements) not involved with data extraction identified the text sections citing one of the three review versions and copied them into a Microsoft Word document. Only this text was copied, and the two data extractors were therefore unaware of the author and journal names, time of publication and the title of the article. The fonts of the copied text were converted into Times New Roman, saved in a new document and the text labelled with a random number using the ‘Rand function’ in Microsoft Excel. The key to matching the text with the articles was not available to data extractors until the assessments had been completed. The person responsible for copying the text made sure it did not contain any information that might reveal which of the three versions of the review had been cited. When there was more than one reference within the copied text, the reference to the review was highlighted to make it clear which statements referred to the review.
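The labelling step can be illustrated in code. This is a minimal sketch only, not the procedure actually used (which relied on Microsoft Word and the Excel ‘Rand function’): it assumes that each copied excerpt is held as a string keyed by a hypothetical article identifier, and it substitutes unique random four-digit integers for Excel random numbers. The function name `blind_excerpts` is likewise an assumption.

```python
import random


def blind_excerpts(excerpts, seed=None):
    """Label excerpts with random numbers and keep the key separate.

    excerpts: dict mapping an article identifier to the copied text.
    Returns (blinded, key): blinded maps label -> text for the data
    extractors; key maps label -> article identifier and is withheld
    until the assessments are complete.
    """
    rng = random.Random(seed)
    # Unique four-digit labels, drawn without replacement
    labels = rng.sample(range(1000, 10000), k=len(excerpts))
    blinded, key = {}, {}
    for label, (article_id, text) in zip(labels, excerpts.items()):
        blinded[label] = text
        key[label] = article_id
    # Sort by label so the presentation order does not reveal the input order
    return dict(sorted(blinded.items())), key


example = {"article-A": "…cited text one…", "article-B": "…cited text two…"}
blinded, key = blind_excerpts(example, seed=2012)
for label, text in blinded.items():
    print(label, text)
```

Keeping the key in a separate object mirrors the requirement that the match between text and article is unavailable to the data extractors until rating is finished.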
All article types, as well as letters to the editor, were included and were classified as research papers, systematic reviews, editorials, letters, guidelines and narratives.
P values were calculated using Fisher's exact test (two-tailed; http://www.swogstat.org/stat/public/fisher.htm).
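For readers who wish to reproduce this kind of comparison, a two-tailed Fisher's exact test for a 2×2 table can be sketched with the Python standard library alone. The sketch assumes the common definition of the two-tailed p value, namely the sum of the probabilities of all tables with the same margins that are no more probable than the observed table. We have not verified that the online calculator cited above uses the same definition, so small differences are possible.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-tailed Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows could be, for example, journal type, and columns
    'accepted' versus 'not accepted'.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    lowest = max(0, col1 - row2)   # smallest possible top-left cell
    highest = min(row1, col1)      # largest possible top-left cell

    def weight(x):
        # Unnormalised hypergeometric probability of top-left cell == x
        return comb(row1, x) * comb(row2, col1 - x)

    observed = weight(a)
    total = comb(row1 + row2, col1)
    # Integer weights keep the "as or less probable" comparison exact
    extreme = sum(w for w in map(weight, range(lowest, highest + 1))
                  if w <= observed)
    return extreme / total


# Textbook-style example table, not data from this study
p_value = fisher_exact_two_sided(8, 2, 1, 5)
print(round(p_value, 4))
```

Comparing integer weights rather than floating-point probabilities avoids rounding problems when two tables are exactly as probable as each other.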
In total, 523 articles cited one of the three versions of the review: 360 cited the 2001 Lancet review,2 123 the 2006 Cochrane review4 and 40 the 2009 Cochrane review.5 Three articles cited both the 2001 and the 2006 versions of the review; for these, we only used information related to the 2001 citation.
Including only journals that had published at least four articles, which cited one or more of the three versions of the review, the search identified 151, 27 and 15 articles, respectively (193 in total, or 37% of the total of 523 articles). A flow chart is shown in figure 1.
We excluded 22 additional articles, two because there was no reference to the review in the text, even though the review was listed as a reference,17 ,18 and 20 (10, 5 and 5 citing the 2001, 2006 and 2009 versions, respectively) because they had one or more authors affiliated with the Nordic Cochrane Centre.
Thus, 171 articles were included for assessment. In total, 63 articles (37%) were from general medical journals and 108 (63%) from specialty journals. A total of 80 (47%) were from European journals and 91 (53%) from North American journals. No journals from other regions contained at least four articles citing the review.
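The article counts reported above can be cross-checked with simple arithmetic. The sketch below only restates numbers given in the text; the variable names are ours.

```python
# Articles citing each version of the review (2001, 2006, 2009)
citing = {"2001": 360, "2006": 123, "2009": 40}
# Of these, articles in journals with at least four citing articles
in_eligible_journals = {"2001": 151, "2006": 27, "2009": 15}
# Exclusions: review listed but not cited in the text, and Nordic Cochrane authors
no_reference_in_text = 2
nordic_cochrane_authors = {"2001": 10, "2006": 5, "2009": 5}

total_citing = sum(citing.values())
eligible = sum(in_eligible_journals.values())
included = eligible - no_reference_in_text - sum(nordic_cochrane_authors.values())

general, specialty = 63, 108
european, north_american = 80, 91


def percent(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


print(total_citing, eligible, included)
print(percent(eligible, total_citing), percent(general, included),
      percent(specialty, included))
```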
The general medical journals included were Lancet (21 articles), BMJ (13 articles), Annals of Internal Medicine (13 articles), Journal of the American Medical Association (7 articles), New England Journal of Medicine (5 articles) and International Journal of Epidemiology (4 articles). The specialty journals included Journal of the National Cancer Institute (13 articles), Cancer (13 articles), American Journal of Roentgenology (7 articles) and 15 others (see box 1). Most of the included articles were either research papers (n=63, 37%) or narrative articles (n=44, 26%; table 1).
Box 1 The specialty journals included in this study (number of included articles in parentheses)
Journal of the National Cancer Institute (13)
European Journal of Cancer (7)
British Journal of Cancer (7)
American Journal of Roentgenology (7)
Cancer Causes and Control (6)
Annals of Oncology (6)
European Journal of Surgical Oncology (6)
Journal of Medical Screening (5)
Cancer Epidemiology, Biomarkers and Prevention (5)
CA: a Cancer Journal for Clinicians (5)
Journal of Clinical Oncology (5)
Radiologic Clinics of North America (5)
Breast Cancer Research and Treatment (4)
Journal of Surgical Oncology (3)
The text of 32 of the 171 included articles (19%) was rated as not applicable for all the five categories (overdiagnosis, overtreatment, breast cancer mortality, total mortality and methods). In total, 15 of these 32 articles discussed the controversy when the first review was published, without specifically mentioning any of the categories. Other subjects discussed were screening of women under the age of 50 (two articles), and benefits of breast cancer screening other than those in our categories (two articles; see online supplementary appendix 1 for a full list of topics).
The review’s conclusions regarding overdiagnosis were not quoted in 87% (149/171) of the included articles and the results for breast cancer mortality were not quoted in 53% (91/171) of the included articles.
Articles in general medical journals were more likely than those in specialty journals to accept the results or methods of the systematic review. For example, overdiagnosis was classified as accepted in 11% (7/63) of articles in general medical journals but in only 3% (3/108) of articles in specialty journals (p=0.05), and the methods were accepted in 14% (9/63) of articles in general medical journals but in only 1% (1/108) of articles in specialty journals (p=0.001). Specialty journals were also more likely to reject the results for breast cancer mortality: 26% (28/108) of articles compared with 8% (5/63) in general medical journals (p=0.02). The differences between general medical and specialty journals in relation to rejecting the categories overdiagnosis, overtreatment, total mortality and methods were small (table 2).
The European and North American journals were equally likely to reject or accept the review's methods or results (data not shown).
The number of citations of the three versions of the review varied considerably over time (see table 3). Some years had very few citations, the lowest being 2012 and 2006, when the review was cited only 1 and 6 times, respectively. The highest number of citations was in 2002 (42 citations). There were no clear trends over time in the number of articles accepting or rejecting the methods and conclusions of the reviews, although the breast cancer mortality results may have received greater acceptance in recent years. For example, in 2002, no articles accepted the breast cancer mortality results (0 of 42), whereas 19% (3/16) explicitly accepted them in 2010 (p=0.02; data not shown).
The 2001 version of the review had more categories rejected and fewer categories accepted than the 2006 and 2009 versions. For example, 30% (3/10) of articles accepted the results for breast cancer mortality presented in the 2009 version of the review, compared with 0% (0/140) for the 2001 version (p=0.0002; see table 4).
Although we deliberately reduced the sample size by requiring at least four citations for each included journal, we had enough articles that quoted the review for our comparisons.
Specialty journals were more likely to reject the estimate of the effect of screening on breast cancer mortality than the six general medical journals we included.
Articles in general medical journals were also more approving of four of the five individual categories we assessed (overdiagnosis, overtreatment, total mortality and methods) than articles in specialty journals, and the difference was statistically significant for all the categories except breast cancer mortality.
We have previously found that scientific articles on breast screening tend to emphasise the major benefits of mammography screening over its major harms and that overdiagnosis was more often downplayed or rejected in articles written by authors affiliated with screening by specialty or funding than in articles by authors unrelated to screening.19 Recommendations in guidelines for breast screening are also influenced by the authors’ medical specialty.20
The difference we found between the general medical and specialty journals could be explained by conflicts of interest, which are likely to be more prevalent in specialty journals owned by political interest groups such as the American Cancer Society or by medical societies with members whose income may depend on the intervention. All the six general medical journals, but only 22% (4/18) of the specialty journals follow the International Committee of Medical Journal Editors’ (ICMJE) Uniform Requirements for Manuscripts Submitted to Biomedical Journals.21 Even though journals have conflict of interest reporting policies, the conflicts of interest reported are not always reliable.22
All the general medical journals included are members of the World Association of Medical Editors (WAME); however, this is only the case for 22% (4/18) of the specialty journals included. WAME aims to improve the editorial standards and, among other things, to ensure a balanced debate on controversial issues.23 Being a member of WAME helps with transparency in terms of their guidelines for conflicts of interest, but it also reminds editors to ensure that their journals are covering both sides of a debate.
Development over time
The results and conclusions on breast cancer mortality and overdiagnosis were more often accepted in 2010 than in any other year (data not shown). This may reflect that the criticism of breast screening is becoming more widespread. The ongoing independent review of the National Health Service (NHS) Breast Screening Programme announced by Mike Richards, the UK National Clinical Director for Cancer and End of Life Care, Department of Health, in October 2011 is a further indication of this development.24 Also, the US Preventive Services Task Force changed its recommendations for breast screening in 2009.12 Though our data did not show strong time trends, we believe that these developments demonstrate a growing acceptance of the results and conclusions of our systematic review. In support of this, the 2009 version of the Cochrane review has received more approval than disapproval. For example, 30% (3/10) of articles accepted the results for breast cancer mortality presented in the 2009 version of the review, compared with 0% (0/140) for the 2001 version.25–31 The US Preventive Services Task Force was heavily criticised after the publication of its new recommendations in 2009,29 ,32 but the criticism came from people with vested interests, and the independent Canadian Task Force supported the conclusions of the US Preventive Services Task Force and the 2009 Cochrane review5 in 2011.13
The 2001 review published in Lancet was by far the most cited of the three reviews. It was 5 years older than the Cochrane review from 2006, but the vast majority of the citations came within the first year of publication. It was unique at the time, as it questioned whether mammography screening was effective, based on a thorough quality assessment of all the randomised controlled trials, and also was the first systematic review to quantify overdiagnosis.
A minority of the included articles (19%, 32/171) did not refer to any of our five specified categories. In nearly half of these cases (47%), this was because the article referred only to the debate that followed the first review,33 and not to its results or methods. Other texts dealt with topics such as false positives or screening women under the age of 50 years, or simply stated that mammography screening was beneficial without further specification. The most frequently used classification for each of our specified categories was not applicable. This was the case for articles in both the general medical and specialty journals, and for articles in the European and North American journals. The text typically dealt with only one or two of our categories, for example, overdiagnosis, and did not mention overtreatment or any other categories.
None of the articles rejected overdiagnosis (0 of 171 articles), which could be because they did not mention the issue at all. This was the case in 76% of scientific articles on breast screening in a previous study by Jørgensen et al.19
Our definition of rejection was that the author should explicitly state that the review's estimate was flawed, wrong or false, or that they should in some way argue against it. With this strict definition, we did not capture authors who have consistently stated over the years in other articles than those we included that they do not believe that overdiagnosis is a problem, and we also did not present their views on the subject.
Numerous articles were classified as unclear for one or more of our categories. The texts in question did not allow an interpretation in any direction and we did not rate the articles as accepting or rejecting the review’s results and methods unless it was perfectly clear what the authors meant. This reflects that authors often do not present clear opinions of the intervention which they discuss. An additional explanation for the many articles found to be unclear could be that we did not assess the entire article, and arguments could have been presented elsewhere in the text.
Letters were included in this study, which could explain why some of the articles were classified as not applicable in all the five categories. The specialists who read and respond to letters in their own journals might be more likely to react negatively towards the review because of conflicts of interest.19 Specialists with a connection to mammography screening also reply to articles in general medical journals when they concern mammography screening. Therefore, it is quite likely that there is a greater difference between the specialists involved with the screening programmes and the doctors not involved in breast cancer screening, in terms of accepting and rejecting the results and methods, than we have found in this study.
Articles in specialty journals were less approving of the results and methods of the systematic review of breast screening than those in general medical journals. This may be explained by conflicts of interest, as several specialty journals were published by groups with vested interests in breast screening, and several articles had authors with vested interests.
We would like to thank Andreas Brønden Petersen and Mads Clausen for assisting us in preparing the text and extracting data.
Contributors KR participated in the design of the study, carried out data analysis, performed statistical analysis and drafted the manuscript. KJJ and PCG both participated in the design of the study and helped to draft the manuscript. All authors read and approved the final manuscript.
Competing interests None.
Open Access This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 3.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/3.0/