
Table 3 Summary of the perceived key methodological challenges associated with each of the exemplar overviews, a description of what the challenge was, and examples of how this challenge was dealt with within individual overviews

From: Selecting and implementing overview methods: implications from five exemplar overviews

Columns: Key methodological challenge (dealing with); description of the challenge and why it was experienced; complementarity with published literature on overview methods (Ballard [3]); examples of how the challenge was dealt with within our exemplar overviews.
a. Overlap between reviews (studies appearing in more than one review)

Description of challenge: In overviews which include both Cochrane and non-Cochrane reviews, multiple published reviews were often identified which had similar aims and which included the same or similar trials. In overviews which included only Cochrane reviews, some trials were found to be included in more than one review; this was particularly the case for 3- or 4-arm trials, where different arms of the trial were included in different reviews. If studies appear in more than one review there is a risk of double counting (where results of individual studies are included more than once within a meta-analysis). This could be a meta-analysis completed by the overview authors, or one completed by review authors and synthesised within the overview.

Summary of findings from Ballard [3]:
• "emerging debate related to (i) overlapping systematic reviews"
• Approaches to dealing with this challenge include:
o Calculation of the degree of overlap using the "corrected covered area" (Pieper [2])
o Use of "a priori criteria for choosing a single systematic review for inclusion when multiple potential candidates are available"
o Use of only Cochrane reviews (to avoid overlap)
• The optimal approach "currently remains unresolved."

Judgement of complementarity:
• A range of different approaches were used, and there was no consensus on an optimal approach.
• Silence (not raised by Ballard [3]): two of our overviews (Hunt [22], McClurg [21]) included reviews which had overlapping trials, highlighting and reporting the overlap but not taking further action.
• McClurg 2016 [21], which only included Cochrane reviews, still identified a problem with overlap; this conflicts with suggestions that including only Cochrane reviews avoids problems of overlap.
• Brunton 2016 [23] did not assess the overlap, stating that this was not important to the stated purpose of the overview.

Examples of how this challenge was dealt with within our exemplar overviews:
Pollock [19] extracted details of the trials included within all relevant reviews, and explored which trials were included within which review. Reviews which were effectively superseded by a more comprehensive review of the same topic were excluded; the methods for making this judgement are described within the review [19]. Thirty-seven reviews which met the inclusion criteria but which were judged to be superseded by more up-to-date, comprehensive and methodologically rigorous reviews were excluded. These exclusions are transparently reported, and a table of the characteristics of the excluded reviews is provided, with the same key characteristics as reported for included reviews (see Table 2).
Hunt [22] identified a number of studies which were included in more than one systematic review, but did not exclude any systematic reviews on the basis of overlap. Where updated systematic reviews existed, the most recent review was selected over previous versions. Due to substantial heterogeneity across reviews, no meta-analysis or statistical synthesis was conducted, so double counting of participants was not a risk. Hunt [22] explored the consistency of reporting of individual trial results across multiple reviews, and clearly highlighted any discrepancies.
McClurg [21] only included Cochrane reviews, so occurrences of overlapping reviews were reduced compared with Pollock [19] or Hunt [22]. However, a number of trials identified within the reviews included in McClurg [21] still appeared within two or more reviews. Details of the individual trials included in each review were extracted and occurrences of overlap systematically identified. Where a trial was found to occur in more than one review, details of that individual trial were obtained and the existence of multiple treatment arms explored. In all occurrences of overlap the reviews were found to contain unique data from different active treatment arms, although control or comparison group data were often included within more than one review. Details of these overlaps were documented, and any instances where the same control group data were used more than once were highlighted.
Brunton 2016 [23] did not assess overlap between reviews; however, this overview was designed as a 'map of reviews' rather than a full systematic review of reviews. Its main purpose was to identify the characteristics shown to be associated with larger effect sizes, in order to compare those characteristics with those identified by stakeholder research and policy documents.
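The "corrected covered area" (CCA) approach attributed to Pieper [2] above can be sketched as a short calculation. The formula, CCA = (N − r) / (rc − r), uses the total number of included publications counting duplicates (N), the number of unique publications (r), and the number of reviews (c); the trial identifiers below are purely hypothetical, for illustration only:

```python
def corrected_covered_area(review_trial_ids):
    """Degree of overlap between reviews, per the CCA formula (Pieper [2]).

    review_trial_ids: one set of trial identifiers per included review.
    Returns (N - r) / (r * c - r), where N counts publications including
    duplicates, r counts unique publications, and c counts reviews.
    """
    c = len(review_trial_ids)                   # number of reviews
    N = sum(len(s) for s in review_trial_ids)   # total, counting duplicates
    r = len(set().union(*review_trial_ids))     # unique publications
    return (N - r) / (r * c - r)

# Three hypothetical reviews sharing some trials:
reviews = [{"t1", "t2", "t3"}, {"t2", "t3", "t4"}, {"t3", "t5"}]
print(corrected_covered_area(reviews))  # → 0.3
```

A CCA of 0.3 would indicate very high overlap under Pieper's suggested thresholds (slight <5%, moderate 5–10%, high 10–15%, very high >15%).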
b. Reviews are out of date

Description of challenge: Included reviews were judged to be out of date.

Summary of findings from Ballard [3]:
• "emerging debate related to (iv) updating included systematic reviews"
• Some guidelines ignore this issue, but approaches to dealing with the challenge include:
o Updating included systematic reviews
o Searching for secondary and primary literature simultaneously
• Both approaches add complexity and time to the overview process.
• There is currently "no way to systematically investigate whether an update in the context of overviews is necessary."

Judgement of complementarity:
• Estcourt [20] updated relevant reviews. Other overview authors ignored this issue, only raising it during discussion.

Examples of how this challenge was dealt with within our exemplar overviews:
Estcourt [20] updated a number of the included reviews. A judgement as to the relevance of each included review was applied, and updating was focused only on the reviews known to have included or ongoing trials. Potentially relevant RCTs that might require a review to be updated were identified using a broad search of the Transfusion Evidence Library (a database, updated monthly, of systematic reviews (since 1980) and randomised controlled trials (since 1950) relevant to transfusion medicine, including grey literature; the database is officially endorsed by Cochrane). Estcourt [20] also searched the WHO clinical trials database for relevant ongoing trials using a broad search strategy, with search terms: 'sickle cell disease' or 'sickle cell an*emia' and 'transfusions' or 'blood transfusions' or 'red cell transfusions'. The decision was made that the overview authors would update most of the relevant reviews, including reviews of red cell transfusions to prevent: development or progression of chronic lung disease [44]; sickle cell disease-related complications when having an operation [45]; and primary or secondary stroke [46]. Estcourt [20] also performed several new reviews on topics that did not have a Cochrane review: prevention of chronic kidney disease [47] and prevention of silent cerebral infarcts [48]. The authors of another Cochrane review [49] agreed to update it. A review in acute chest syndrome was already being updated [50], and the authors of a further review agreed to include red cell transfusion as an intervention at its next update, although it was known that there were no relevant RCTs within that review [51].
c. Definition of "systematic review"

Description of challenge: Published review papers were identified which were described as "literature reviews" but did not meet the expected methodological standards to be classed as a "systematic review".

Summary of findings from Ballard [3]:
• Ballard [3] does not specifically identify the issue of the definition of a systematic review.

Judgement of complementarity:
• Silence (not raised by Ballard [3]): this challenge was raised by the overview authors but not identified in the literature synthesised by Ballard [3].

Examples of how this challenge was dealt with within our exemplar overviews:
Overview authors applied clear definitions of what a systematic review was when selecting relevant reviews for inclusion. Brunton [23] required that, to be included, a review had to describe (at a minimum) the search strategy, inclusion criteria and quality assessment methods.
d. Assessment of methodological quality of reviews

Description of challenge: Assessment of methodological quality using AMSTAR was found to be challenging due to the multi-faceted nature of the questions within the AMSTAR tool. Assessment using ROBIS was also challenging, with difficulties in reaching agreement between overview authors. There was no suitable tool available for assessing the methodological quality of reviews of diagnostic test accuracy.

Summary of findings from Ballard [3]:
• "emerging debate related to (iii) evaluating the quality and reporting of included research" (systematic reviews)
• There is no consensus on which instrument should be used to assess the methodological quality of systematic reviews.
• Many overviews do not assess the methodological quality of systematic reviews (Hartling [1]), and a "diversity of instruments" is used.

Judgement of complementarity:
• A range of tools were used.
• Silence (not raised by Ballard [3]): challenges in reaching agreement on ROBIS judgements between review authors were identified by overview authors.

Examples of how this challenge was dealt with within our exemplar overviews:
AMSTAR [42] and ROBIS [14] were the only two quality assessment tools used by our exemplar overviews. Pollock [19] developed and implemented a modified version of AMSTAR. Estcourt [20] used AMSTAR but added explanatory text to explain the reasons for the authors' decisions. McClurg [21] used ROBIS, providing explanatory text to explain the reasons for the authors' decisions. Independent authors within the McClurg [21] overview experienced difficulties interpreting the signalling questions used to prompt judgements relating to the four domains of phase 2 of the ROBIS tool, which led to high levels of disagreement, the need for substantial discussion, and the involvement of an arbitrator. During discussion amongst three overview authors several areas of disagreement persisted, and this contributed to a post-hoc decision not to report a final overall judgement of risk of bias for each review [21]. Hunt [22] suggests modifications to the AMSTAR and ROBIS tools to fit within a diagnostic test accuracy review framework.
e. Quality of reporting within reviews

Description of challenge: Methodological assessments, using either AMSTAR or ROBIS, were limited by the quality of reporting of the reviews. It was therefore challenging to determine whether the scores provided by AMSTAR or ROBIS reflected the quality of the methods or the quality of the reporting.

Summary of findings from Ballard [3]:
• "emerging debate related to (iii) evaluating the quality and reporting of included research" (systematic reviews)
• It is important to differentiate between methodological quality and reporting quality.

Judgement of complementarity:
• Agreement: overview authors attempted to differentiate between methodological quality and reporting quality.
• Silence (not raised by exemplar overviews): to address reporting issues, Ballard [3] "recommended that PRISMA be used in conjunction with a comprehensive, validated critical appraisal tool."

Examples of how this challenge was dealt with within our exemplar overviews:
Pollock [19] changed the last question of AMSTAR so that it was a judgement on quality rather than on the presence of information. McClurg [21] and Estcourt [20] added explanatory text to justify decisions made by authors. Hunt [22] reported that the quality of reporting in DTA studies is generally poor; in this overview the quality of reporting was found to be mixed, with Cochrane DTA reviews of a higher quality than non-Cochrane DTA reviews.
f. Applying GRADE

Description of challenge: The GRADE approach was developed specifically for judgements of the quality of evidence during guideline development, and has also been adopted for judging the quality of evidence within Cochrane reviews [33]. There is an absence of guidance on how to apply GRADE within an overview. Authors using GRADE faced challenges relating to the number of comparisons, and to subtle differences between comparisons, which created issues of workload and of achieving consistency.

Summary of findings from Ballard [3]:
• "emerging debate related to (iii) evaluating the quality and reporting of included research" ("quality of the body of evidence across included systematic reviews")
• GRADE has been described as an approach for assessing the quality of the body of evidence across systematic reviews, but there is currently a lack of guidance to ensure appropriate use and interpretation of GRADE when applied in this way.

Judgement of complementarity:
• Agreement: overview authors used GRADE to explore the quality of evidence, identifying an absence of guidance on how to apply it within the context of an overview.

Examples of how this challenge was dealt with within our exemplar overviews:
Pollock [19] identified challenges in consistently applying the GRADE approach to the large volumes of evidence synthesised within overviews [37], and proposed a more algorithmic approach to judging the quality of evidence within reviews [10]. There remains debate about the validity of this proposed approach [11, 12], but Pollock [19] argues that it facilitates transparency and consistency when judging the quality of evidence of many similar (but not identical) comparisons included within reviews. McClurg [21] developed an algorithm using the method recommended by Pollock [19], but involved a wider group of people in the decision making, including statisticians and clinicians. Estcourt [20] used GRADE levels of evidence from within the included reviews. All new and updated reviews that included relevant comparisons had performed GRADE assessments. The participants within the reviews differed (pregnancy, preoperative, high risk of stroke, etc.); therefore, individual GRADE assessments of an outcome had to be reported for a particular type of participant.
g. Potential for publication bias

Description of challenge: The potential for publication bias comes from two sources: publication bias relating to the identification and inclusion of relevant reviews, and publication bias relating to the trials identified and included within the reviews.

Summary of findings from Ballard [3]:
• While the issue of publication/reporting bias is not explicitly raised as a methodological challenge within guidance on overview methods, Ballard [3] concludes that "overviews are always susceptible to…and reporting biases".
• Where systematic reviews are at high risk of reporting biases, a systematic review (rather than an overview) may produce the most precise result.

Judgement of complementarity:
• The issue of publication bias is raised by both Ballard [3] and our exemplar overviews, although no specific actions to address or alleviate these biases are identified.

Examples of how this challenge was dealt with within our exemplar overviews:
Other than being transparent about the potential risks of publication bias, no additional action was undertaken within our overviews. Hunt [22] highlighted particular challenges relating to the exploration of publication bias in reviews of diagnostic test accuracy; due to a lack of consensus on handling publication bias within DTA reviews, this was not assessed (as pre-stated in the protocol).
h. Summarising key findings in a brief, accessible format suitable for informing decision making

Description of challenge: A brief, often single-page, summary has insufficient space to address the factors that decision makers require to inform clinical or policy decisions. Decision makers require details of what works, with whom, and in what way.

Summary of findings from Ballard [3]:
• "emerging debate related to (v) synthesising and reporting the result of included systematic reviews"
• Functions of an overview can be to explore heterogeneity or to summarise evidence.

Judgement of complementarity:
• Our overviews aimed to summarise evidence. Hunt [22] is conducting additional analysis of outcomes, subsequent to completion of the overview of reviews of diagnostic test accuracy, in order to more fully inform decision making.

Examples of how this challenge was dealt with within our exemplar overviews:
Pollock [19] and McClurg [21] highlight that the aim of the overview is to signpost clinical decision makers to the reviews best placed to inform decisions, and clearly state that the overview does not provide sufficient evidence to make individual treatment decisions. Pollock [19] produced a single-page summary table incorporating a traffic-light system to indicate evidence of effectiveness, summarising interventions for which there is evidence of effectiveness in relation to specific outcomes, and stating the quality of that evidence. Brunton [23] integrated the results of an overview of reviews with a mixed-methods synthesis relating to stakeholder views and key policy documents; a single-page structured summary was produced confirming interventions supported by evidence, and stating that specific characteristics of intervention implementation may be important. Within their protocols, McClurg 2016 [21] proposed the use of a matrix to summarise research findings, and Estcourt [20] proposed the use of the template provided within the Cochrane Handbook for summarising findings within an overview (Additional file 1).