Fig. 4
From: Ensemble of deep learning language models to support the creation of living systematic reviews for the COVID-19 literature

Confusion matrix for class (A), subclass (B), and sub-subclass (C). The ensemble is more likely to confuse sub-subclasses that share the same parent subclass or class, which is why performance tends to be higher at those higher levels.
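The caption's reasoning can be illustrated with a small sketch. It assumes a strictly nested label hierarchy written as `class/subclass/sub-subclass`; the labels, predictions, and helper names (`truncate`, `confusion`, `accuracy`) are hypothetical and are not taken from the article or its data. When a sub-subclass error stays inside the correct parent, rolling the labels up to the parent level turns that error into a correct prediction, so accuracy cannot decrease at coarser levels.

```python
from collections import Counter

# Hypothetical hierarchical labels, "class/subclass/sub-subclass".
# Illustrative only; these are not the article's categories or results.
y_true = ["A/a1/x", "A/a1/y", "A/a2/z", "B/b1/u", "B/b1/v", "B/b2/w"]
y_pred = [
    "A/a1/y",  # wrong sub-subclass, correct subclass and class
    "A/a1/y",  # correct at every level
    "A/a1/x",  # wrong subclass, correct class
    "B/b1/u",  # correct at every level
    "B/b1/u",  # wrong sub-subclass, correct subclass and class
    "A/a2/z",  # wrong at every level
]


def truncate(label: str, depth: int) -> str:
    """Keep the first `depth` components of a slash-separated label."""
    return "/".join(label.split("/")[:depth])


def confusion(true, pred, depth):
    """Count (true, predicted) pairs after rolling labels up to `depth`."""
    return Counter(
        (truncate(t, depth), truncate(p, depth)) for t, p in zip(true, pred)
    )


def accuracy(true, pred, depth):
    """Fraction of items on the diagonal of the rolled-up confusion matrix."""
    matrix = confusion(true, pred, depth)
    correct = sum(n for (t, p), n in matrix.items() if t == p)
    return correct / sum(matrix.values())


# depth 1 = class, depth 2 = subclass, depth 3 = sub-subclass
acc_by_level = {depth: accuracy(y_true, y_pred, depth) for depth in (1, 2, 3)}
for depth, acc in acc_by_level.items():
    print(f"depth {depth}: accuracy {acc:.2f}")
```

In this toy data, two of the four sub-subclass errors fall inside the correct subclass and a third falls inside the correct class, so accuracy rises from sub-subclass to subclass to class level. The same roll-up applies to any metric computed from the confusion matrix, provided the hierarchy is strictly nested.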