Table 2 Different performance measures for the machine-assisted screening approach, single-reviewer screening, and screening with DistillerAI alone

From: Assessing the accuracy of machine-assisted abstract screening with DistillerAI: a user study

 

| Team | Screening approach | Sensitivity (95% CI) | Specificity (95% CI) | Area under the curve (95% CI) | N of missed studies (proportion) | N of included abstracts (proportion) | N of conflicts (proportion) | N of included studies in training set |
|---|---|---|---|---|---|---|---|---|
| Team 1 | Machine-assisted screening | 0.78 (0.59 to 0.90) | 0.96 (0.96 to 0.97) | 0.87 (0.80 to 0.95) | 7/32 (22%) | 97/2172 (4%) | 126/2172 (6%) | 10/300 |
| Team 1 | Single-reviewer screening | 0.78 (0.59 to 0.90) | 0.96 (0.95 to 0.97) | 0.87 (0.80 to 0.94) | 7/32 (22%) | 110/2172 (5%) | | |
| Team 1 | DistillerAI screening | 0.03 (0.00 to 0.21) | 0.99 (0.98 to 0.99) | 0.51 (0.48 to 0.54) | 31/32 (97%) | 27/2172 (1%) | | |
| Team 2 | Machine-assisted screening | 0.89 (0.70 to 0.97) | 0.92 (0.91 to 0.93) | 0.90 (0.84 to 0.96) | 3/27 (11%) | 232/2172 (11%) | 226/2172 (10%) | 15/300 |
| Team 2 | Single-reviewer screening | 0.89 (0.69 to 0.97) | 0.91 (0.89 to 0.92) | 0.90 (0.84 to 0.96) | 3/27 (11%) | 221/2172 (10%) | | |
| Team 2 | DistillerAI screening | 0.00 | 0.99 (0.99 to 0.99) | 0.50 (0.49 to 0.50) | 27/27 (100%) | 18/2172 (1%) | | |
| Team 3 | Machine-assisted screening | 0.65 (0.44 to 0.82) | 0.96 (0.95 to 0.97) | 0.81 (0.71 to 0.90) | 9/26 (35%) | 130/2172 (6%) | 100/2172 (5%) | 16/300 |
| Team 3 | Single-reviewer screening | 0.65 (0.44 to 0.82) | 0.96 (0.95 to 0.97) | 0.81 (0.71 to 0.90) | 9/26 (35%) | 104/2172 (5%) | | |
| Team 3 | DistillerAI screening | 0.23 (0.10 to 0.44) | 0.99 (0.98 to 0.99) | 0.61 (0.53 to 0.69) | 20/26 (77%) | 30/2172 (1%) | | |
| Team 4 | Machine-assisted screening | 0.86 (0.66 to 0.95) | 0.94 (0.93 to 0.95) | 0.90 (0.83 to 0.96) | 4/28 (14%) | 199/2172 (9%) | 194/2172 (9%) | 14/300 |
| Team 4 | Single-reviewer screening | 0.82 (0.62 to 0.93) | 0.93 (0.92 to 0.94) | 0.88 (0.80 to 0.95) | 5/28 (18%) | 165/2172 (8%) | | |
| Team 4 | DistillerAI screening | 0.32 (0.17 to 0.52) | 0.97 (0.96 to 0.98) | 0.65 (0.56 to 0.73) | 19/28 (68%) | 69/2172 (3%) | | |
| Team 5 | Machine-assisted screening | 0.74 (0.55 to 0.87) | 0.95 (0.94 to 0.96) | 0.84 (0.77 to 0.92) | 8/31 (26%) | 187/2172 (9%) | 181/2172 (8%) | 11/300 |
| Team 5 | Single-reviewer screening | 0.74 (0.55 to 0.87) | 0.95 (0.94 to 0.95) | 0.84 (0.77 to 0.92) | 8/31 (26%) | 138/2172 (6%) | | |
| Team 5 | DistillerAI screening | 0.13 (0.05 to 0.31) | 0.97 (0.96 to 0.98) | 0.55 (0.49 to 0.61) | 27/31 (87%) | 65/2172 (3%) | | |
| Combined | Machine-assisted screening | 0.78 (0.66 to 0.90) | 0.95 (0.92 to 0.97) | 0.87 (0.83 to 0.90) | 6/30 (22%) | 8% | 165/2172 (8%) | 13/300 |
| Combined | Single-reviewer screening | 0.78 (0.66 to 0.89) | 0.94 (0.91 to 0.97) | 0.86 (0.82 to 0.89) | 6/30 (22%) | 7% | | |
| Combined | DistillerAI screening | 0.14 (0.00 to 0.31) | 0.98 (0.97 to 1.00) | 0.56 (0.53 to 0.59) | 25/30 (86%) | 2% | | |

CI = confidence interval; N = number
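
The per-team sensitivity values in the table agree with the missed-study counts if sensitivity is read as the share of relevant studies that were not missed. The following is a minimal sketch of that arithmetic, not the authors' analysis code. The function name `sensitivity_from_missed` is hypothetical, and the sketch does not attempt to reproduce the confidence intervals, because the table does not state how they were calculated.

```python
def sensitivity_from_missed(missed: int, relevant: int) -> float:
    """Share of relevant studies that were correctly identified.

    Assumes the denominator of the "missed studies" column is the number
    of relevant studies, so sensitivity = (relevant - missed) / relevant.
    """
    if relevant <= 0:
        raise ValueError("relevant must be positive")
    if not 0 <= missed <= relevant:
        raise ValueError("missed must be between 0 and relevant")
    return (relevant - missed) / relevant


# Team 1 rows from the table: (missed studies, relevant studies)
team_1 = {
    "Machine-assisted screening": (7, 32),
    "Single-reviewer screening": (7, 32),
    "DistillerAI screening": (31, 32),
}

for approach, (missed, relevant) in team_1.items():
    print(f"{approach}: {sensitivity_from_missed(missed, relevant):.2f}")
# Machine-assisted screening: 0.78
# Single-reviewer screening: 0.78
# DistillerAI screening: 0.03
```

The same check applied to Teams 2 to 5 gives the sensitivities reported in their rows. The Combined row is not reproduced by this calculation (6/30 would give 0.80, not 0.78), so it is left out of the sketch.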