Table 7 Characteristics of SRs of methods studies and assessment of risk of bias

From: Toward a comprehensive evidence map of overview of systematic review methods: paper 2—risk of bias assessment; synthesis, presentation and summary of the findings; and assessment of the certainty of the evidence

  Study ID (first author, year): Pieper 2014a [17]; Whiting 2013 [12]
Characteristics of the studies
 Title
Pieper 2014a: Systematic review found AMSTAR, but not R(evised)-AMSTAR, to have good measurement properties
Whiting 2013: Review of existing quality assessment tools for systematic reviews (Chapter 4)
 Primary objective
Pieper 2014a: To review all empirical studies evaluating the measurement properties of AMSTAR and R-AMSTAR
Whiting 2013: To conduct a review of existing tools designed to critically appraise SRs and meta-analyses. The review was conducted to inform development of ROBIS
 Number of included tools
Pieper 2014a: 2
Whiting 2013: 40 (5/40 tools targeted areas other than SRs of interventions, for example diagnostic test accuracy or genetic association studies)
 Number of studies reporting on the included tools
Pieper 2014a: 13 (10 reporting on AMSTAR, 2 on R-AMSTAR, 1 on both)
• 4/13 studies had a primary objective to assess the properties of AMSTAR/R-AMSTAR
• 9/13 were methods studies that applied AMSTAR/R-AMSTAR (mainly assessing quality of SRs in a clinical area)
 Name of the included tools or measures (unnamed tools are identified by first author name and year of publication)
Pieper 2014a: AMSTAR [22, 23], R-AMSTAR [36]
Whiting 2013:
Named tools: AMSTAR [22, 23], CASP [83], FOCUS [84], MAC [85], NHMRC [86], OQAQ [28], SIGN [87], RAPiD [88]ᵃ
Unnamed tools: Assendelft 1995 [89], Auperin 1997 [90], Crombie 1996 [91], Geller 1996 [92], Glenny 2003 [93], Greenhalgh 1997 [94], Higgins 2013 [38], Ho 2010 [95], Irwig 1994 [96], Knox 2009 [97], Li 2012 [98], Light 1984 [99], Lundh 2012 [100], Mailis 2012 [101], Minelli 2009 [102], Mokkink 2009 [14], Mulrow 1987 [103], Nony 1995 [104], Oxman 1988 [105], Oxman 1994 [106], Oxman 1994 [107] (3 tools), Philibert 2012 [108], Sacks 1997 [109], Santaguida 2012 [110], Shamliyan 2010 [111], Sheikh 2007 [112], Smith 1989 [12], Smith 1997 [113], Smith 2007 [12], Thacker 1996 [114], Wilson 1992 [115], Zambon 2012 [116]
 Content validity: reported method of development (e.g. item generation, expert assessment of content)
Pieper 2014a: Not assessed (noted in background that AMSTAR was based on OQAQ and a checklist by Sacks 1997)
Whiting 2013: Methods of development were reported for 17/40 tools:
• 3 tools were developed using a ‘rigorous’ process (AMSTAR, Higgins 2013, OQAQ)§
• 10 tools were based on multiple existing tools and/or guidelines for the conduct of systematic reviews (or similar)
• 4 tools were adapted from a single tool
§ OQAQ was based on literature review, survey of methodological experts, and pretesting (pilot study).
AMSTAR was based on existing tools (including OQAQ), a consensus process aimed at establishing face and content validity, and exploratory factor analysis.
Higgins 2013 was based on AMSTAR, the Cochrane Handbook for Systematic Reviews of Interventions [67], expert review of items, and pilot testing.
 Reliability: description of reliability testing
Pieper 2014a: Inter-rater reliability (IRR) assessments were reported in 11/13 studies (9 on AMSTAR, 2 on R-AMSTAR). IRR results were reported for individual items (8 studies), the mean across all items (7 studies), and overall score (6 studies)
Whiting 2013: Inter-rater reliability assessments were reported for 5/40 tools (most reporting kappa or intraclass correlation coefficient)
 Tests of validity: description of correlation coefficient testing
Pieper 2014a: Six studies assessed construct validity by examining the correlation between total AMSTAR scale scores (summing ‘yes’ responses) and scores on OQAQ (3 studies), Sacks’ list (1 study), R-AMSTAR (1 study), and expert assessment (2 studies)
Whiting 2013: No tests of validity were reported for any tools (although exploratory factor analysis was used during development of content for AMSTAR)
 Other assessments (feasibility, acceptability, piloting)
Pieper 2014a: Time taken to complete tool
Whiting 2013: The SR includes a summary of tool content (items and domains measured), tool structure (e.g. checklist, domain based), and item rating (i.e. response options)
Risk of bias in the SRs of methods studies
 Domain 1: study eligibility criteriaᵇ
Pieper 2014a: Low. Unclear if predefined criteria/objectives were adhered to, but eligibility criteria are broad (lessening inappropriate exclusions), unambiguous and appropriate.
Whiting 2013: Unclear if predefined criteria/objectives were adhered to, but eligibility criteria are broad (lessening inappropriate exclusions), unambiguous and appropriate.
 Domain 2: identification and selection of studiesᵇ
Pieper 2014a: Low. Comprehensive search of multiple databases and reference lists. While search terms are not reported in full, the authors searched for evaluations of specific tools, terms for which were likely to be reported in the abstract. Independent screening of citations and full text by two authors.
Whiting 2013: Comprehensive search. Independent screening of citations, single screening of full text with checks.
 Domain 3: data collection and study appraisalᵇ
Pieper 2014a: High. Single data extraction, with checks. COSMIN [13] was used to define measurement properties and as a guide to interpreting findings, but not to appraise study methods. The methods used for inter-rater reliability assessment may have biased the estimates of reliability; given this and the extent of reporting of reliability statistics, concern for this domain was rated as high.
Whiting 2013: Single data extraction, with checks. The greatest potential for error lies in extracting and classifying the content of items; however, the impact of misclassification is low. No assessments of risk of bias of included studies, but this is only a concern for studies that reported estimates of measurement properties (5/40 studies reported reliability statistics). Since interpretation of results focused on tool development and content, concern for this domain was rated as low.
 Overall judgementᶜ
Pieper 2014a: Low risk of bias. Although there is potential for bias in the reported estimates of reliability and validity, the authors were cautious in their interpretation, and noted the limitations of both the evaluations reported in included studies and their review methods.
Whiting 2013: Low risk of bias
  1. Abbreviations: AMSTAR, A Measurement Tool to Assess Systematic Reviews; CASP, Critical Appraisal Skills Programme; IRR, inter-rater reliability; MAC, Meta-analysis Appraisal Checklist; NHMRC, National Health and Medical Research Council; OQAQ, Overview Quality Assessment Questionnaire; RAPiD, Rapid Appraisal Protocol internet Database; ROBIS, Risk of Bias In Systematic reviews; SIGN, Scottish Intercollegiate Guidelines Network; SRs, systematic reviews
  2. ᵃ OQAQ [28] is also referred to as OQAC (Overview Quality Assessment Checklist), and RAPiD [88] is also referred to as RAP (Rapid Appraisal Protocol).
  3. ᵇ Level of concern for each domain judged as low, high or unclear
  4. ᶜ Overall judgement is based on whether the interpretation addressed all concerns identified in domains 1–3, whether the relevance of studies was appropriately considered, and whether reviewers avoided emphasising results based on statistical significance