Fig. 2 | Systematic Reviews

Fig. 2

From: Toward systematic review automation: a practical guide to using machine learning tools in research synthesis

Bag of words modelling for classifying randomized controlled trials (RCTs). Top left: example of a bag of words for three articles. Each column represents a unique word in the corpus (a real example would likely contain columns for tens of thousands of words). Top right: document labels, where 1 = relevant and 0 = irrelevant. Bottom: coefficients (or weights) are estimated for each word (in this example using logistic regression). Here, the high positive weights on the words ‘random’ and ‘randomized’ increase the predicted probability that an unseen article containing them is an RCT. The presence of the word ‘systematic’ (with a large negative weight) would reduce the predicted probability that an unseen document is an RCT.
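
The idea in this caption can be illustrated with a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the six toy sentences, the whitespace tokenizer, the binary (present/absent) word encoding, and the gradient-descent settings (`lr`, `epochs`, `l2`) are hypothetical and are not taken from the figure or the article. It builds one column per unique word, fits logistic regression weights, and then compares two unseen titles.

```python
import math

# Hypothetical toy corpus (not the articles in the figure).
# Label 1 = RCT (relevant), 0 = not an RCT (irrelevant).
docs = [
    ("patients were randomized to drug or placebo", 1),
    ("a randomized controlled trial of aspirin", 1),
    ("participants were assigned at random to two groups", 1),
    ("a systematic review of aspirin trials", 0),
    ("systematic review and meta-analysis of cohort studies", 0),
    ("a cohort study of patients taking aspirin", 0),
]

def tokenize(text):
    # Deliberately naive: lowercase and split on whitespace.
    return text.lower().split()

# One column per unique word in the corpus.
vocab = sorted({word for text, _ in docs for word in tokenize(text)})
index = {word: i for i, word in enumerate(vocab)}

def bag_of_words(text):
    # Binary vector: 1 if the word occurs in the text, else 0.
    # Words outside the vocabulary are ignored.
    vec = [0] * len(vocab)
    for word in tokenize(text):
        if word in index:
            vec[index[word]] = 1
    return vec

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(vec, weights, bias):
    # Predicted probability that the document is an RCT.
    return sigmoid(bias + sum(w * x for w, x in zip(weights, vec)))

def fit(X, y, lr=0.5, epochs=500, l2=0.01):
    # Batch gradient descent on the logistic loss with a small L2 penalty.
    weights = [0.0] * len(X[0])
    bias = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for vec, label in zip(X, y):
            err = predict_proba(vec, weights, bias) - label
            grad_b += err
            for j, x in enumerate(vec):
                if x:
                    grad_w[j] += err
        weights = [w - lr * (g / n + l2 * w) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

X = [bag_of_words(text) for text, _ in docs]
y = [label for _, label in docs]
weights, bias = fit(X, y)

# Inspect the learned weight for a few words.
for word in ("random", "randomized", "systematic"):
    print(word, round(weights[index[word]], 2))

# Compare two unseen (hypothetical) titles.
p_rct = predict_proba(bag_of_words("a randomized trial of statins"), weights, bias)
p_review = predict_proba(bag_of_words("a systematic review of statins"), weights, bias)
print(p_rct > p_review)
```

In this toy corpus ‘random’ and ‘randomized’ occur only in documents labelled 1 and ‘systematic’ only in documents labelled 0, so the first two receive positive weights and the last a negative one, mirroring the pattern described in the caption. A real screening model would be trained on far more documents and words, typically with an established library rather than hand-written gradient descent.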
