Table 2 Agreement for all checklist items

From: A checklist designed to aid consistency and reproducibility of GRADE assessments: development and pilot validation

| Item | Kappa (95% CI) | Magnitude of agreement |
| --- | --- | --- |
| Risk of bias | | |
| Was random sequence generation used (i.e. no potential for selection bias)? | 0.89 (0.69 to 1) | Almost perfect |
| Was allocation concealment used (i.e. no potential for selection bias)? | 0.69 (0.29 to 1) | Substantial |
| Was there blinding of participants and personnel (i.e. no potential for performance bias)? | 0.71 (0.41 to 1) | Substantial |
| Was there blinding of outcome assessment (i.e. no potential for detection bias)? | 0.98 (0.67 to 1) | Almost perfect |
| Was an objective outcome used? | 1 | Almost perfect |
| Were more than (80%)ᵃ of participants enrolled in trials included in the analysis? (i.e. no potential attrition bias) | 0.44 (0.07 to 0.81) | Moderate |
| Were data reported consistently for the outcome of interest (i.e. no potential selective reporting)? (no potential reporting bias) | 0.25 (0 to 0.61) | Fair |
| No other biases reported? (no potential of other bias) | 0.20 (0 to 0.62) | Slight |
| Did the trials end as scheduled (i.e. not stopped early)? | 1 | Almost perfect |
| Inconsistency | | |
| Point estimates did not vary widely? (i.e. no clinical meaningful inconsistency) | 0.65 (0.37 to 0.93) | Substantial |
| To what extent do confidence intervals overlap? | 0.50 (0.17 to 0.77) | Moderate |
| Was the direction of effect consistent? | 1 | Almost perfect |
| What was the magnitude of statistical heterogeneity (as measured by I²)? | 1 | Almost perfect |
| Was the test for heterogeneity statistically significant (p < 0.1)? | 1 | Almost perfect |
| Indirectness | | |
| Were the populations in included studies applicable to the target population? | Below chance | Poor |
| Were the interventions in included studies applicable to target intervention? | Below chance | Poor |
| Was the included outcome not a surrogate outcome? | 1 | Almost perfect |
| Was the outcome timeframe sufficient? | 0.47 (0 to 1) | Moderate |
| Were the conclusions based on direct comparisons? | 1 | Almost perfect |
| Imprecision | | |
| Was the confidence interval for the pooled estimate not consistent with benefit and harm? | 1 | Almost perfect |
| What was the magnitude of the median sample size? | 1 | Almost perfect |
| What was the magnitude of the number of included studies? | 1 | Almost perfect |
| Was the outcome a common event? (e.g. occurs more than 1/100)ᵃ | 1 | Almost perfect |
| Was there no evidence of serious harm associated with treatment? | 0.89 (0.67 to 1) | Almost perfect |
| Publication bias | | |
| Did the authors conduct a comprehensive search? | 0.65 (0 to 1) | Substantial |
| Did the authors search for grey literature? | 0.26 (0 to 0.67) | Fair |
| Authors did not apply restrictions to study selection on the basis of language? | 0.74 (0.45 to 1) | Substantial |
| There was no industry influence on studies included in the review? | 0.71 (0.45 to 0.98) | Substantial |
| There was no evidence of funnel plot asymmetry? | 0.62 (0.35 to 0.89) | Substantial |
| There was no discrepancy in findings between published and unpublished trials? | 1 | Almost perfect |

ᵃ These thresholds can be replaced with different ones based on the context of the particular review.
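
To make the kappa and magnitude columns easier to interpret, the sketch below shows how a kappa value for one checklist item could be computed from two raters' answers and mapped to a magnitude label. It rests on several assumptions not stated in the table: that a two-rater Cohen's kappa is meant (the table does not name the kappa variant), that the labels follow the commonly used Landis and Koch bands (which the reported values appear consistent with), and that complete agreement on a single category is reported as 1. The function names `cohen_kappa` and `agreement_label` and the example ratings are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same items (assumed variant)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must score the same non-empty set of items.")
    n = len(rater_a)
    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's category frequencies.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        # Kappa is undefined (0/0) when both raters use one identical category;
        # returning 1.0 here is an assumption, not a rule taken from the table.
        return 1.0
    return (observed - expected) / (1 - expected)


def agreement_label(kappa):
    """Map kappa to a magnitude label using Landis and Koch bands (assumed)."""
    if kappa < 0:
        return "Poor"  # shown as "Below chance" in the kappa column
    if kappa <= 0.20:
        return "Slight"
    if kappa <= 0.40:
        return "Fair"
    if kappa <= 0.60:
        return "Moderate"
    if kappa <= 0.80:
        return "Substantial"
    return "Almost perfect"


# Hypothetical ratings of one checklist item across four reviews.
rater_a = ["yes", "yes", "no", "no"]
rater_b = ["yes", "no", "no", "no"]
kappa = cohen_kappa(rater_a, rater_b)
print(kappa, agreement_label(kappa))
```

With few rated reviews, a single disagreement shifts kappa considerably, which is one reason the confidence intervals in the table are wide.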