
Table 2. Estimated number of unknown duplicates, based on a random sample from each title similarity score range. Confidence intervals for the percentage of hidden duplicates are based on the exact binomial confidence interval for the proportion of duplicates in the sample.

From: Previously unidentified duplicate registrations of clinical trials: an exploratory analysis of registry data worldwide

| Score range | Duplicates in sample | Known duplicates | Unknown duplicates (est.) | % hidden (CI) |
|---|---|---|---|---|
| 0.7<x≤0.8 | 7 / 125 (5.6 %) | 2194 | 1957 | 47 (26–64) |
| 0.8<x≤0.9 | 13 / 100 (13 %) | 3489 | 2265 | 39 (26–51) |
| 0.9<x≤1.0 | 89 / 209 (43 %) | 5805 | 5393 | 48 (44–52) |
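
The arithmetic behind the table can be sketched in Python. This is an illustration, not the authors' code. The "% hidden" point estimates are consistent with unknown / (known + unknown), and an exact binomial (Clopper-Pearson) interval can be computed for the proportion of duplicates in each sample. The function names are hypothetical, and the 95 % confidence level is an assumption, since the caption does not state one. How the sample interval is carried over to the "% hidden" interval is not described here, so the sketch does not attempt to reproduce that step.

```python
from math import comb

# Rows from Table 2: (score range, duplicates in sample, sample size,
# known duplicates, estimated unknown duplicates).
ROWS = [
    ("0.7<x≤0.8", 7, 125, 2194, 1957),
    ("0.8<x≤0.9", 13, 100, 3489, 2265),
    ("0.9<x≤1.0", 89, 209, 5805, 5393),
]


def percent_hidden(known, unknown):
    """Share of all duplicates (known + estimated unknown) that are unknown."""
    return 100.0 * unknown / (known + unknown)


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))


def _bisect(f, lo=0.0, hi=1.0, iters=100):
    """Locate the root of a decreasing function f on [lo, hi] by bisection."""
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def exact_binomial_ci(k, n, alpha=0.05):
    """Clopper-Pearson interval for k successes in n trials.

    alpha=0.05 (a 95 % interval) is an assumption, not taken from the table.
    """
    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else _bisect(
        lambda p: alpha / 2 - (1.0 - binom_cdf(k - 1, n, p)))
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else _bisect(
        lambda p: binom_cdf(k, n, p) - alpha / 2)
    return lower, upper


if __name__ == "__main__":
    for score_range, k, n, known, unknown in ROWS:
        lo, hi = exact_binomial_ci(k, n)
        print(f"{score_range}: hidden {percent_hidden(known, unknown):.1f} %, "
              f"sample proportion {k / n:.3f} "
              f"(exact CI {lo:.3f} to {hi:.3f})")
```

Rounded to whole percentages, `percent_hidden` gives 47, 39 and 48 for the three rows, matching the table's point estimates.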