Performance of active learning models for screening prioritization in systematic reviews: a simulation study into the Average Time to Discover relevant records

Table 2. ATD values (as a percentage, \(\bar{x}\ (\hat{s})\)) for all model-dataset combinations. For every dataset, the best results are in bold. The final row gives the median (MAD) across all models for each dataset.

Model	Nudging	PTSD	Software	ACE	Virus	Wilson
SVM + TF-IDF	10.1 (0.18)	2.1 (0.13)	1.9 (0.04)	7.1 (1.15)	8.5 (0.17)	4.0 (0.32)
NB + TF-IDF	9.3 (0.29)	1.7 (0.11)	1.4 (0.03)	4.9 (0.51)	8.2 (0.22)	3.9 (0.35)
RF + TF-IDF	11.7 (0.44)	3.3 (0.26)	2.0 (0.09)	6.8 (0.74)	10.5 (0.42)	5.6 (1.15)
LR + TF-IDF	9.5 (0.19)	1.7 (0.10)	1.4 (0.01)	5.9 (1.17)	8.3 (0.24)	4.3 (0.32)
SVM + D2V	8.8 (0.33)	2.1 (0.15)	1.4 (0.05)	6.1 (0.33)	8.4 (0.21)	4.5 (0.30)
RF + D2V	10.3 (0.87)	3.0 (0.33)	1.6 (0.09)	7.2 (1.26)	9.2 (0.43)	7.2 (1.49)
LR + D2V	8.8 (0.47)	1.9 (0.16)	1.4 (0.04)	5.4 (0.18)	8.3 (0.40)	4.7 (0.30)
Median (MAD)	9.5 (1.05)	2.1 (0.48)	1.4 (0.12)	6.1 (1.11)	8.4 (0.18)	4.5 (0.64)
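
The summary row can be approximated from the rounded entries above. The following is a minimal sketch, not the authors' analysis code: it takes the median of each dataset column and the median absolute deviation around that median. The `scale` parameter is a hypothetical addition, because the table does not state whether the MAD is scaled. A constant of 1.4826 is a common convention for consistency with the normal standard deviation, and using it here is an assumption. Since the inputs are rounded to one decimal, the computed MAD values may differ slightly from the printed ones.

```python
from statistics import median

# ATD values (%) copied from the table, one list per dataset column.
# Row order: SVM, NB, RF, LR (all TF-IDF), then SVM, RF, LR (all D2V).
atd = {
    "Nudging":  [10.1, 9.3, 11.7, 9.5, 8.8, 10.3, 8.8],
    "PTSD":     [2.1, 1.7, 3.3, 1.7, 2.1, 3.0, 1.9],
    "Software": [1.9, 1.4, 2.0, 1.4, 1.4, 1.6, 1.4],
    "ACE":      [7.1, 4.9, 6.8, 5.9, 6.1, 7.2, 5.4],
    "Virus":    [8.5, 8.2, 10.5, 8.3, 8.4, 9.2, 8.3],
    "Wilson":   [4.0, 3.9, 5.6, 4.3, 4.5, 7.2, 4.7],
}


def mad(values, scale=1.0):
    """Median absolute deviation around the median.

    `scale` is a hypothetical knob: 1.0 gives the raw MAD, while
    1.4826 is a common normal-consistency constant (an assumption,
    not something the table specifies).
    """
    centre = median(values)
    return scale * median(abs(v - centre) for v in values)


# (median, raw MAD) per dataset, computed across the seven models.
summary = {name: (median(vals), mad(vals)) for name, vals in atd.items()}

for name, (med, dev) in summary.items():
    print(f"{name}: median={med:.1f}, raw MAD={dev:.2f}, "
          f"scaled MAD={mad(atd[name], scale=1.4826):.2f}")
```

The medians computed this way match the summary row for all six datasets. The MAD depends on the scaling convention and on the unrounded per-model values, which are not available here.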

ISSN: 2046-4053