
Table 2 Agreement for all checklist items

From: A checklist designed to aid consistency and reproducibility of GRADE assessments: development and pilot validation

Item Kappa (95% CI) Magnitude of agreement
Risk of bias   
 Was random sequence generation used (i.e. no potential for selection bias)? 0.89 (0.69 to 1) Almost perfect
 Was allocation concealment used (i.e. no potential for selection bias)? 0.69 (0.29 to 1) Substantial
 Was there blinding of participants and personnel (i.e. no potential for performance bias)? 0.71 (0.41 to 1) Substantial
 Was there blinding of outcome assessment (i.e. no potential for detection bias)? 0.98 (0.67 to 1) Almost perfect
 Was an objective outcome used? 1 Almost perfect
 Were more than (80%)ᵃ of participants enrolled in trials included in the analysis (i.e. no potential attrition bias)? 0.44 (0.07 to 0.81) Moderate
 Were data reported consistently for the outcome of interest (i.e. no potential selective reporting bias)? 0.25 (0 to 0.61) Fair
 No other biases reported? (no potential of other bias) 0.20 (0 to 0.62) Slight
 Did the trials end as scheduled (i.e. not stopped early)? 1 Almost perfect
Inconsistency   
 Point estimates did not vary widely? (i.e. no clinically meaningful inconsistency) 0.65 (0.37 to 0.93) Substantial
 To what extent do confidence intervals overlap? 0.50 (0.17 to 0.77) Moderate
 Was the direction of effect consistent? 1 Almost perfect
 What was the magnitude of statistical heterogeneity (as measured by I²)? 1 Almost perfect
 Was the test for heterogeneity statistically significant (p < 0.1)? 1 Almost perfect
Indirectness   
 Were the populations in included studies applicable to the target population? Below chance Poor
 Were the interventions in included studies applicable to target intervention? Below chance Poor
 Was the included outcome not a surrogate outcome? 1 Almost perfect
 Was the outcome timeframe sufficient? 0.47 (0 to 1) Moderate
 Were the conclusions based on direct comparisons? 1 Almost perfect
Imprecision   
 Was the confidence interval for the pooled estimate not consistent with benefit and harm? 1 Almost perfect
 What was the magnitude of the median sample size? 1 Almost perfect
 What was the magnitude of the number of included studies? 1 Almost perfect
 Was the outcome a common event? (e.g. occurs more than 1/100)ᵃ 1 Almost perfect
 Was there no evidence of serious harm associated with treatment? 0.89 (0.67 to 1) Almost perfect
Publication bias   
 Did the authors conduct a comprehensive search? 0.65 (0 to 1) Substantial
 Did the authors search for grey literature? 0.26 (0 to 0.67) Fair
 Authors did not apply restrictions to study selection on the basis of language? 0.74 (0.45 to 1) Substantial
 There was no industry influence on studies included in the review? 0.71 (0.45 to 0.98) Substantial
 There was no evidence of funnel plot asymmetry? 0.62 (0.35 to 0.89) Substantial
 There was no discrepancy in findings between published and unpublished trials? 1 Almost perfect
ᵃ These thresholds can be replaced with different ones based on the context of the particular review.
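
The magnitude labels in the table appear consistent with the widely used Landis and Koch benchmarks for kappa (below 0 poor, 0 to 0.20 slight, 0.21 to 0.40 fair, 0.41 to 0.60 moderate, 0.61 to 0.80 substantial, 0.81 to 1 almost perfect). The table itself does not name the scale, so the cut-offs below are an assumption, and the function name `agreement_magnitude` is hypothetical. A minimal sketch of that mapping:

```python
def agreement_magnitude(kappa: float) -> str:
    """Map a kappa value to a magnitude-of-agreement label.

    Assumption: Landis and Koch style cut-offs; the table does not
    state which benchmark scale was used.
    """
    if kappa < 0:
        return "Poor"            # agreement below chance
    if kappa <= 0.20:
        return "Slight"
    if kappa <= 0.40:
        return "Fair"
    if kappa <= 0.60:
        return "Moderate"
    if kappa <= 0.80:
        return "Substantial"
    return "Almost perfect"


# Spot-check against a few rows of the table.
for item, kappa in [
    ("Random sequence generation", 0.89),
    ("Allocation concealment", 0.69),
    ("Attrition bias", 0.44),
    ("Selective reporting", 0.25),
    ("Other biases", 0.20),
]:
    print(f"{item}: {kappa:.2f} -> {agreement_magnitude(kappa)}")
```

Under these assumed cut-offs, the point estimates checked above map to the same labels the table reports. The labels depend only on the point estimate, so the wide confidence intervals on several items are not reflected in them.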