
Table 2 Model metrics for the internal and external validation datasets

From: Machine learning algorithms to identify cluster randomized trials from MEDLINE and EMBASE

| Dataset | AUC, % (95% CI) | True positive rate (sensitivity), % (95% CI) | False positive rate (1 − specificity), % (95% CI) | Number needed to screen (95% CI) |
| --- | --- | --- | --- | --- |
| Internal validation: this dataset had 600 articles, with ~15% being CRTs. Number needed to read: 6.8^a | | | | |
| Convolutional neural network (Word2Vec) | 98.2 (96.9, 99.5) | 96.6 (92.0, 100) | 13.9 (10.7, 17.0) | 1.8 (1.6, 2.1) |
| Convolutional neural network (FastText) | 98.4 (97.3, 99.5) | 89.8 (83.0, 96.6) | 3.5 (2.0, 5.1) | 1.2 (1.1, 1.3) |
| Support vector machines | 97.2 (95.7, 98.8) | 97.7 (94.3, 100) | 19.9 (16.4, 23.2) | 2.2 (1.9, 2.6) |
| Ensemble | 98.6 (97.8, 99.4) | 97.7 (94.3, 100) | 15.0 (11.9, 18.2) | 1.9 (1.7, 2.2) |
| External validation: this dataset had 1916 articles, with ~35% being CRTs. Number needed to read: 2.9^a | | | | |
| Convolutional neural network (Word2Vec) | 97.9 (97.2, 98.6) | 97.0 (95.6, 98.2) | 20.8 (18.5, 23.0) | 1.4 (1.3, 1.5) |
| Convolutional neural network (FastText) | 97.7 (97.0, 98.4) | 91.7 (89.8, 93.8) | 4.8 (3.7, 6.0) | 1.1 (1.1, 1.1) |
| Support vector machines | 96.8 (96.0, 97.6) | 97.3 (96.1, 98.5) | 32.2 (29.7, 34.9) | 1.6 (1.6, 1.7) |
| Ensemble | 97.8 (97.0, 98.5) | 97.6 (96.4, 98.6) | 21.8 (19.6, 24.1) | 1.4 (1.4, 1.5) |

  1. a: The number needed to read was calculated as one divided by the proportion of articles that are CRTs
  2. AUC: area under the receiver operating characteristic curve; CI: confidence interval
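
The arithmetic behind the last column can be sketched in a few lines of Python. The number needed to read follows footnote a directly. The table does not define the number needed to screen, so the sketch assumes it is the reciprocal of precision (articles flagged by the model per true CRT found); that definition, the function names, and the back-calculation of prevalence from the reported number needed to read are all assumptions made for illustration, not the authors' stated method. Under that assumption, the internal-validation point estimates for the two convolutional neural network rows come out at about 1.8 and 1.2, in line with the table.

```python
def number_needed_to_read(prevalence: float) -> float:
    """Footnote a: one divided by the proportion of articles that are CRTs."""
    if not 0.0 < prevalence <= 1.0:
        raise ValueError("prevalence must be a proportion in (0, 1]")
    return 1.0 / prevalence


def number_needed_to_screen(sensitivity: float,
                            false_positive_rate: float,
                            prevalence: float) -> float:
    """ASSUMPTION: number needed to screen = 1 / precision.

    All inputs are proportions (0 to 1), not percentages.
    """
    true_pos = sensitivity * prevalence                   # CRTs correctly flagged
    false_pos = false_positive_rate * (1.0 - prevalence)  # non-CRTs wrongly flagged
    if true_pos == 0.0:
        raise ValueError("no true positives: precision is undefined")
    return (true_pos + false_pos) / true_pos


# Prevalence back-calculated from the reported number needed to read (6.8),
# because the table only gives the rounded figure "~15%".
internal_prevalence = 1.0 / 6.8

# Internal validation, CNN with Word2Vec: sensitivity 96.6%, FPR 13.9%
print(f"{number_needed_to_screen(0.966, 0.139, internal_prevalence):.1f}")  # 1.8

# Internal validation, CNN with FastText: sensitivity 89.8%, FPR 3.5%
print(f"{number_needed_to_screen(0.898, 0.035, internal_prevalence):.1f}")  # 1.2
```

Comparing the two figures for a dataset shows the workload reduction: without a classifier a reviewer reads about 6.8 articles per CRT in the internal dataset, whereas with a classifier the figure falls to roughly 1 to 2 articles per CRT, at the cost of the CRTs the classifier misses (1 − sensitivity).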