Fig. 1

From: Machine learning algorithms to identify cluster randomized trials from MEDLINE and EMBASE

Scatter text visualization of words and phrases used in our dataset. Points are colored blue or red according to whether the term is associated with cluster randomized trial (CRT) or non-CRT citations. The dataset consisted of 589 CRT citations (111,492 words) and 4411 non-CRT citations (816,167 words). The terms associated with each category are listed under the “top CRT” and “top non-CRT” headings. An interactive version of the figure is available at https://mlscreener.s3.ca-central-1.amazonaws.com/Scatter_plot1.html (note: the file for the interactive figure is large and can take several minutes to load in a browser).