- Open Access
- Open Peer Review
Conclusions in systematic reviews of mammography for breast cancer screening and associations with review design and author characteristics
Systematic Reviews volume 6, Article number: 105 (2017)
Debates about the benefits and harms of mammography continue despite the accumulation of evidence. We sought to quantify the disagreement across systematic reviews of mammography and determine whether author or design characteristics were associated with conclusions that were favourable to the use of mammography for routine breast cancer screening.
We identified systematic reviews of mammography published between January 2000 and November 2015, and extracted information about the selection of evidence, age groups, the use of meta-analysis, and authors’ professions and financial competing interest disclosures. Conclusions about specific age groups were graded as favourable if they stated that there were meaningful benefits, that benefits of mammography outweighed harms, or that harms were inconsequential. The main outcome measures were the proportions of favourable conclusions relative to review design and author characteristics.
From 59 conclusions identified in 50 reviews, 42% (25/59) were graded as favourable by two investigators. Among the conclusions produced by clinicians, 63% (12/19) were graded as favourable compared to 32% (13/40) from other authors. In the 50–69 age group where the largest proportion of systematic reviews were focused, conclusions drawn by authors without financial competing interests (odds ratio 0.06; 95% CI 0.07–0.56) and non-clinicians (odds ratio 0.11; 95% CI 0.01–0.84) were less likely to be graded as favourable. There was no trend in the proportion of favourable conclusions over the period, and we found no significant association between review design characteristics and favourable conclusions.
Differences in the conclusions of systematic reviews of the evidence for mammography have persisted for 15 years. We found no strong evidence that design characteristics were associated with greater support for the benefits of mammography in routine breast cancer screening. Instead, the results suggested that the specific expertise and competing interests of the authors influenced the conclusions of systematic reviews.
Mammography is the most widely used screening technology for detecting breast cancers in asymptomatic women. Since its introduction, the relative harms and benefits of mammography have been the subject of ongoing debate. Both the age at which to begin breast cancer screening and the frequency of screening have been disputed. Conflicting recommendations persist despite decades of interventional and observational studies that are used as the basis for making recommendations [1,2,3,4], and cancer screening guidelines generally fail to quantify benefits and harms in a balanced way . For mammography, the debate was renewed in 2009 when the US Preventive Services Task Force (USPSTF) revised their guidelines to initiate biennial screening at 50 years of age instead of 40 . In 2012, the National Health Services (NHS) in the UK recommended mammography once in 3 years to women aged 47–73 years . In late 2015, the American Cancer Society (ACS) updated their guidelines to initiate annual screening at 45 and reduce the frequency to biennial screening at 55 . In early 2016, the USPSTF again examined the evidence and recommended biennial screening for women between the ages of 50 and 74 . Conflicting recommendations about breast cancer screening make it difficult for clinicians and patients to make informed choices about when to start and how often to repeat mammography for women at average risk.
Inconsistencies in the conclusions produced across systematic reviews may reflect the manner in which the reviews were designed and undertaken, or may be related to the expertise and competing interests held by the reviewers. Analyses in other domains have found that financial conflicts of interest can be associated with differences in the interpretation of the results when drawing conclusions and making recommendations [10,11,12]. Specific to mammography, an analysis of mostly primary studies showed that authors who worked directly in mammography screening were more likely to downplay or reject over-diagnosis than other authors . Another study examining 12 clinical practice guidelines showed that guidelines authored by radiologists or where lead authors had recent publications on diagnosis and treatment were more likely to recommend routine screening .
By measuring review design and author characteristics associated with certain conclusions in systematic reviews of mammography, we may be able to identify factors that are associated with inconsistent conclusions. Our aim was to quantify the degree of disagreement in the conclusions of systematic reviews of mammography for breast cancer screening and then measure associations between the conclusions, design characteristics of the reviews, and the professions and financial competing interests of the authors.
We examined systematic reviews of mammography for breast cancer screening published between January 2000 and November 2015. The PubMed database was searched to identify relevant studies using the following search strategy: “(((((guideline*[Text Word]) OR recommend*[Text Word]) OR review[Publication Type])) AND (((((“mass screening”[MeSH Terms]) OR (“early detection of cancer” [MeSH Terms]) OR screen*[Text Word])))) AND (((mammogr*[Text Word]) OR (breast cancer screening[MeSH terms]) OR (mammography[MeSH Terms])))”. We additionally hand-searched the citations of included reviews after screening to identify any other relevant literature.
Selection of reviews
Review articles were included in the analysis if they met all four of the following inclusion criteria: (a) the reviewers specified a search criteria and the databases in which the search was conducted; (b) the review was focused on mammography for breast cancer screening; (c) at least two primary studies addressing the harms or benefits of mammography were cited; and (d) the reviewers made conclusions about the harms or benefits of mammography for breast cancer screening in relation to the evidence. Outcomes related to benefits included breast cancer survival (mortality reduction), and cost-effectiveness of screening for quality-of-life. Outcomes related to harms included over-diagnosis, false positives, unnecessary treatments, radiation cancers, anxiety or worry, and pain or discomfort. Reviews that examined only diagnosis endpoints without considering survival or harms were excluded, as were reviews that only considered high-risk populations or populations of women who had previously been diagnosed with breast cancer. Articles were also excluded if they were guidelines, no longer archived or accessible online, not peer reviewed, or were in a language other than English.
Screening and data extraction
Two investigators independently screened article titles and abstracts against the inclusion criteria, and then examined the full text of articles against the inclusion and exclusion criteria. Discrepancies were resolved by discussion at both stages.
The review design characteristics extracted included patient age ranges covered in the evidence, the types of primary studies analysed, the set of outcome measures examined, the presence or absence of a meta-analysis, and the year the systematic review was published. Patient age ranges were assigned to one of four categories: 49 or under, 50 to 69 years, 70 years or older, and one other group for reviews that considered all ages or did not specify an age range. These age groups were selected to correspond with the common age ranges used in the most recent guidelines. The age group for women aged between 50 and 69 in particular is where there has been substantial disagreement about how often women should undergo mammography, and this was a focus of our study. Where age ranges differed from the four groups, we identified the group with the largest overlap (see Additional file 1). The types of studies included in the review were classified into controlled trials, observational studies, both forms of primary studies, or cost-effectiveness analyses.
We recorded the professional role or specialty of all individual authors and categorised them as clinicians or non-clinicians. Professional role was determined by the affiliation listed on the article, employment history, qualifications, and listed research interests. These elements were identified and interpreted from the systematic review and biographies on institutional webpages where available. Qualifications for clinicians included MD or equivalent degrees, and non-clinical qualifications included PhD and MPH. Where corresponding authors had both clinical and non-clinical qualifications, we assigned them to the clinician group if we identified a clinical affiliation on institutional webpages or recent publications, and to the non-clinician group if we could find no such evidence. Where authors were all clinicians or all non-clinicians, we labelled the review as such, and where the authors had a mix of the two types of professional roles, we labelled the review according to the professional role of the corresponding author, under the assumption that the corresponding author takes primary responsibility for the conclusions drawn in the review (Additional file 2).
A financial competing interest disclosure may have described research funding, ownership, or fees from a developer of mammography systems or software. Financial competing interests were determined from disclosure statements in the article. If a competing interest was identified and was financial in nature, we assumed it to be relevant and labelled all conclusions in the systematic review as associated with a financial competing interest. We also noted the presence or absence of a disclosure statement in the systematic reviews and labelled systematic reviews without a disclosure separately.
Two investigators read each included systematic review in its entirety to evaluate the mammography recommendations contained in the conclusions. Each conclusion was judged as favourable or non-favourable by assigning it to one of eight types. Four types were considered to be favourable (evidence of benefits, benefits outweigh harms, the practice is cost-effective, no evidence of harms), and four were labelled as non-favourable (evidence of harms, harms outweigh benefits, the practice is not cost-effective, no evidence of benefits). A third investigator read any reviews for which there was a disagreement to produce a final grading. For each conclusion, we also extracted supporting conclusion statements from the systematic reviews, as well as any statements that made recommendations about who should undergo screening by mammography and how often it should be done.
Some of systematic reviews examined evidence for different age ranges or frequency of screening separately, producing conclusions for each. These conclusions were considered separately. This means that systematic reviews may be represented in the analysis with more than one conclusion and those conclusions may differ.
A linear regression was used to check whether favourable conclusions became more or less common over the period of study, testing whether changes in the primary evidence influenced the likelihood of a favourable conclusion in systematic reviews. Where appropriate, we performed chi-square tests to test the association between favourable conclusions and each of the review design (evidence selection, age groups, outcomes, or the use of meta-analysis) and author (professional roles and financial competing interests) characteristics extracted from the systematic reviews.
The search returned 2726 publications, from which 50 systematic reviews met the inclusion criteria (Fig. 1). Five systematic reviews included separate conclusions for two or more age groups, yielding a set of 59 conclusions and 42 corresponding authors.
Summary systematic review characteristics
The greatest number of conclusions were for women aged between 50 and 69 (22 conclusions), and for which there was no restriction on age range in the evidence cited or where no age range was specified (also 22 conclusions). Of the remainder, 17% (10/59) considered women under 49, and 8% (5/59) considered women aged 70 and older (see Additional file 5).
Evidence selection also varied across the conclusions. Conclusions based entirely on randomised controlled trials comprised 20% (12/59). Another 41% (24/59) considered both randomised controlled trials and observational studies, 34% (20/59) considered only observational studies, and 5% (3/59) were based on cost-effectiveness studies. Mortality outcomes were used in 64% (38/59) of the conclusions. Meta-analyses supported 27% (16/59) of the conclusions.
Most conclusions were associated with author groups that had a public health, epidemiology, or biostatistics (68%; 40/59) background. Among the 32% (19/59) of corresponding authors from clinical specialties, 6 were from oncology, 4 from radiology, and 10 from other medical specialties (see Additional file 6). Disclosures of relevant financial competing interests were present in 14% (8/59) of conclusions.
The number of systematic review conclusions produced over the period peaked between 2004 and 2007, and increased again from 2010 (Fig. 2). Fitting a linear regression to the proportion of favourable conclusions per year revealed no clear increase or decrease in the proportion of favourable conclusions over time (r 2 = 9.73 × 10−3; p = 0.716).
Associations between systematic review characteristics and conclusions
Of the conclusions written by corresponding authors from clinical specialties, 63% (12/19) were graded as favourable (Table 1). A greater proportion of the conclusions made in systematic reviews written by clinicians were favourable compared to conclusions by corresponding authors from a public health and epidemiology profession, for which 32% (13/40) of conclusions were graded as favourable. Among the conclusions associated with a disclosed financial competing interest, 75% (6/8) were favourable. In comparison, 47% (9/19) were graded as favourable when there was no disclosure, and 31% (10/32) were graded as favourable when there was an explicit declaration of no financial competing interests. The proportions of favourable conclusions relative to combinations of clinical profession and financial competing interest disclosures revealed a consistent pattern in which corresponding authors who were clinicians or had financial competing interests more often produced favourable conclusions (Fig. 2).
Among the conclusions drawn from reviews of randomised controlled trials, 42% (5/12) were graded as favourable; compared to 29% (7/24) of conclusions based on studies of both controlled trials and observational studies; 55% (11/20) of conclusions based only on observational studies; and 67% (2/3) of conclusions based on cost-effectiveness studies (Table 1). For conclusions about women under 50, 40% (4/10) were favourable, 45% (10/22) were graded as favourable for women aged 50–69, 80% (4/5) for women aged 70 and over, and for women of all ages (or where age was not specified), 32% (7/22) were graded as favourable. Among the conclusions drawn in systematic reviews that used a meta-analysis, 25% (4/16) were graded as favourable, compared to 49% (21/43) of those with no meta-analysis (Fig. 3).
For the subset of conclusions relating to women aged 50–69 (corresponding to the age group for which mammography is recommended by several current guidelines), the 20 conclusions were produced by different authors and we were able to more reliably test the association between the observed characteristics and favourable conclusions (Table 2). In this subset, 27% (3/9) of conclusions from non-clinicians were favourable compared to 78% (7/9) from clinicians (OR 0.11; 95% CI 0.01–0.84; p = 0.025; χ 2 = 5.05). When corresponding authors declared no financial competing interests, 20% (2/10) of the conclusions were graded as favourable compared to 80% (8/10) when authors declared a financial competing interest or did not include a disclosure statement (OR 0.06; 95% CI 0.07–0.56; p = 0.007; χ 2 = 7.20). Analyses in other age groups did not indicate a significant difference (see Additional files 5, 7, and 8). We found no clear evidence that the types of primary studies included in the analysis, choice of outcomes, or the use of meta-analysis were associated with favourable conclusions.
As evidence accumulates, systematic reviews addressing the same clinical question should converge toward a common conclusion. In the last 15 years, 50 systematic reviews on the use of routine mammography for breast cancer screening in asymptomatic women have been published but a consistent conclusion has not emerged. While we did not find strong evidence to suggest that these differences were associated with the types of studies or outcomes included, or the use of meta-analysis, we did find that authors from certain professions were more likely to produce favourable conclusions for women aged between 50 and 69. There were too few conclusions published in systematic reviews for women under 50 and women 70 or older to reliably identify factors associated with favourable conclusions. The results suggest that the expertise and experience of the authors of systematic reviews may have influenced the conclusions in ways that could not be easily identified as differences in the designs of the reviews.
Author professions or specialties have also been associated with conclusions in primary studies and clinical practice guidelines for mammography. For clinical practice guidelines, both author specialty and competing interests were associated with a recommendation of routine screening . In an analysis of mostly primary studies, authors who worked in screening less often included over-diagnosis as an outcome or minimised it as a harm . We found similar results for systematic reviews on the same topic. Together, the results from these studies suggest that one of the reasons why we continue to see ongoing debate about the harms and benefits of mammography may be because the expertise and experiences of the researchers who report the evidence have influenced both the manner in which it is reported and the conclusions and recommendations that have been made.
Previous studies examining differences in systematic reviews on other topics have demonstrated associations between the characteristics of authors and the conclusions they have drawn. These studies have often examined financial and non-financial competing interests [10, 11, 15]. In each case, the analyses demonstrated that authors of systematic reviews may introduce biases in the design and reporting of systematic review to produce conclusions that are aligned with their interests. The introduction of biases into the design and reporting of systematic reviews may be subtle [16, 17], so it may be difficult to quantify where biases are introduced in the systematic review process.
There were several limitations to the analysis we performed. Non-systematic reviews were not considered in the analysis, and it is likely that reviewers with affiliations other than public health and epidemiology would be more commonly represented in non-systematic reviews. Similarly, we did not consider systematic reviews that were focused on the performance of mammography for diagnosis, and practitioners may have been more commonly represented in these systematic reviews. We did not use any formal tools—such as AMSTAR or the PRISMA checklist [18, 19]—to assess the quality of the reviews or the reporting of the reviews on mammography screening, which may have also been associated with differences in the conclusions. However, prior research has found that there were differences in quality among clinical practice guidelines for mammography . We did not investigate financial competing interests beyond what was disclosed by the authors—for some journals, the authors may not have been required to disclose their financial competing interests, which means that we may have under-reported the true rate of financial competing interests. Finally, because some authors were present in more than one review and some reviews included more than one conclusion, certain characteristics may have been over-represented in the sample, so we were limited in what we could conclude from the statistical analyses.
Analysing 15 years of systematic reviews examining the benefits and harms of mammography for routine breast cancer screening in asymptomatic women, we found that there is still no consensus across systematic reviews about when and how often mammography should be used. Despite the volume of evidence available for women aged 50 to 69, disagreements between guidelines about frequency are common. For this age group, the results suggest that favourable conclusions were more common in systematic reviews written by authors who were clinicians. The results were generally consistent with studies examining primary studies and clinical practice guidelines about mammography for routine breast cancer screening. The results provide further evidence to suggest that conclusions in systematic reviews may be influenced by the expertise and experiences of the authors who write them rather than when they were published, the evidence included, or the designs of the reviews.
Assessment of Multiple Systematic Reviews
National Health Service
Preferred Reporting Items for Systematic Reviews and Meta-Analyses
US Preventive Services Task Force
Perry N, Broeders M, de Wolf C, Tornberg S, Holland R, von Karsa L. European guidelines for quality assurance in breast cancer screening and diagnosis. Fourth edition—summary document. Ann Oncol. 2008;19(4):614–22. http://dx.doi.org/10.1093/annonc/mdm481.
World Health Organisation (WHO). WHO position paper on mammography screening. Switzerland: WHO; 2014.
The American College of Obstetricians and Gynecologists (ACOG). Practice bulletin no. 122: breast cancer screening. Obstet Gynecol. 2011;118(2):372–82.
Nattinger AB, Mitchell JL. Breast cancer screening and prevention. Ann Intern Med. 2016;164(11):81–96. http://dx.doi.org/10.1097/AOG.0b013e31822c98e5.
Caverly TJ, Hayward RA, Reamer E, Zikmund-Fisher BJ, Connochie D, Heisler M, et al. Presentation of benefits and harms in US cancer screening and prevention guidelines: systematic review. J Natl Cancer Inst. 2016;108(6):djv436. doi:10.1093/jnci/djv436.
U.S. Preventive Services Task Force. Screening for breast cancer: U.S. Preventive Services Task Force recommendation statement. Ann Intern Med. 2009;151(10):716–26. doi:10.7326/0003-4819-151-10-200911170-00008.
The Independent UK Panel on Breast Cancer Screening. The benefits and harms of breast cancer screening: an independent review. Lancet. 2012;380(9855):1778–86. doi:10.1016/S0140-6736(12)61611-0.
Oeffinger KC, Fontham ET, Etzioni R, Herzig A, Michaelson JS, Shih YC, et al. Breast cancer screening for women at average risk: 2015 guideline update from the American Cancer Society. JAMA. 2015;314(15):1599–614. http://dx.doi.org/10.1001/jama.2015.12783.
Siu AL. Screening for breast cancer: U.S. Preventive Services Task Force recommendation statement. Ann Intern Med. 2016;164(4):279–96. http://dx.doi.org/10.7326/M15-2886.
Dunn AG, Arachi D, Hudgins J, Tsafnat G, Coiera E, Bourgeois FT. Financial conflicts of interest and conclusions about neuraminidase inhibitors for influenza: an analysis of systematic reviews. Ann Intern Med. 2014;161(7):513–18. http://dx.doi.org/10.7326/M14-0933.
Bes-Rastrollo M, Schulze MB, Ruiz-Canela M, Martinez-Gonzalez MA. Financial conflicts of interest and reporting bias regarding the association between sugar-sweetened beverages and weight gain: a systematic review of systematic reviews. PLoS Med. 2013;10(12):e1001578. http://dx.doi.org/10.1371/journal.pmed.1001578.
Yank V, Rennie D, Bero LA. Financial ties and concordance between results and conclusions in meta-analyses: retrospective cohort study. BMJ. 2007;335(7631):1202–05. http://dx.doi.org/10.1136/bmj.39376.447211.BE.
Jorgensen KJ, Klahn A, Gotzsche PC. Are benefits and harms in mammography screening given equal attention in scientific articles? A cross-sectional study. BMC Med. 2007;5:12. doi:http://dx.doi.org/10.1186/1741-7015-5-12.
Norris SL, Burda BU, Holmer HK, Ogden LA, Fu R, Bero L, et al. Author’s specialty and conflicts of interest contribute to conflicting guidelines for screening mammography. J Clin Epidemiol. 2012;65(7):725–33. http://dx.doi.org/10.1016/j.jclinepi.2011.12.011.
Lieb K, Osten-Sacken J, Stoffers-Winterling J, Reiss N, Barth J. Conflicts of interest and spin in reviews of psychological therapies: a systematic review. BMJ Open. 2016;6(4):e010606. doi:10.1136/bmjopen-2015-010606.
Bero L. Industry sponsorship and research outcome: a cochrane review. JAMA Intern Med. 2013;173(7):580–1. http://dx.doi.org/10.1001/jamainternmed.2013.4190.
Lundh A, Sismondo S, Lexchin J, Busuioc OA, Bero L. Industry sponsorship and research outcome. Cochrane Database Syst Rev. 2012;12:MR000033. http://dx.doi.org/10.1002/14651858.MR000033.pub2.
Shea BJ, Grimshaw JM, Wells GA, Boers M, Andersson N, Hamel C, et al. Development of AMSTAR: a measurement tool to assess the methodological quality of systematic reviews. BMC Med Res Methodol. 2007;7(1):1. http://dx.doi.org/10.1186/1471-2288-7-10.
Moher D, Liberati A, Tetzlaff J, Altman DG. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. Ann Intern Med. 2009;151(4):264–69. http://dx.doi.org/10.7326/0003-4819-151-4-200908180-00135.
Burda BU, Norris SL, Holmer HK, Ogden LA, Smith ME. Quality varies across clinical practice guidelines for mammography screening in women aged 40-49 years as assessed by AGREE and AMSTAR instruments. J Clin Epidemiol. 2011;64(9):968–76. http://dx.doi.org/10.1016/j.jclinepi.2010.12.005.
MSO is supported by a grant from the National Health and Medical Research Council, Australia (APP1052871), and the National Institutes of Health (T15LM007092).
Availability of data and materials
All data generated or analysed during this study are included in this published article and its additional files.
SR designed the study, prepared and analysed the data, drafted the manuscript, and revised the manuscript. AGD designed the study, analysed the data, drafted the manuscript, and revised the manuscript. MSO contributed to the design of the study, supported the data analysis, provided domain expertise, and revised the manuscript. FTB contributed to the design of the study and revised the manuscript. EC contributed to the design of the study and revised the manuscript. KDM designed the study, drafted the manuscript, and revised the manuscript. All authors read and approved the final manuscript.
The authors declare that they have no competing interests.
Consent for publication
Ethics approval and consent to participate
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Age group ranges. Age group ranges identified in the systematic reviews and their classification in the analysis. Shaded regions represent the age groups used in the analysis (orange rectangles) and the set of all age groups identified in the systematic reviews are labelled by their reference number and age range (grey rectangles). (PDF 374 kb)
Authors’ professional roles; (C: clinican; N: non-clinician; G: group; triangles indicate the corresponding author). (PDF 602 kb)
Systematic review conclusion and recommendation statements. (PDF 433 kb)
Included systematic reviews. References for all included systematic reviews. (PDF 229 kb)
Associations between systematic review characteristics and conclusions in 5 conclusions of studies that included women aged 70 years and older. (PDF 193 kb)
Included systematic review conclusions in detail. (PDF 774 kb)
Associations between systematic review characteristics and conclusions in 22 conclusions of studies that did not specify age group or specified all ages. (PDF 193 kb)
Associations between systematic review characteristics and conclusions in 10 conclusions of studies that included women aged up to 49 years. (PDF 193 kb)