Do Cochrane summaries help student midwives understand the findings of Cochrane systematic reviews: the BRIEF randomised trial

Background Abstracts and plain language summaries (PLS) are often the first, and sometimes the only, point of contact between readers and systematic reviews. It is important to identify how these summaries are used and to know the impact of different elements, including the authors’ conclusions. The trial aims to assess whether (a) the abstract or the PLS of a Cochrane Review is a better aid for midwifery students in assessing the evidence, (b) inclusion of authors’ conclusions helps them and (c) there is an interaction between the type of summary and the presence or absence of the conclusions. Methods Eight hundred thirteen midwifery students from nine universities in the UK and Ireland were recruited to this 2 × 2 factorial trial (abstract versus PLS, conclusions versus no conclusions). They were randomly allocated to one of four groups and asked to recall knowledge after reading one of four summary formats of two Cochrane Reviews, one with clear findings and one with uncertain findings. The primary outcome was the proportion of students who identified the appropriate statement to describe the main findings of the two reviews as assessed by an expert panel. Results There was no statistically significant difference in correct response between the abstract and PLS groups in the clear finding example (abstract, 59.6 %; PLS, 64.2 %; risk difference 4.6 %; CI −0.2 to 11.3) or the uncertain finding example (42.7 %, 39.3 %, −3.4 %, −10.1 to 3.4). There was no significant difference between the conclusion and no conclusion groups in the example with clear findings (conclusions, 63.3 %; no conclusions, 60.5 %; 2.8 %; −3.9 to 9.5), but there was a significant difference in the example with uncertain findings (44.7 %; 37.3 %; 7.3 %; 0.6 to 14.1, p = 0.03). PLS without conclusions in the uncertain finding review had the lowest proportion of correct responses (32.5 %). Prior knowledge and belief predicted student response to the clear finding review, while years of midwifery education predicted response to the uncertain finding review. Conclusions Abstracts with and without conclusions generated similar student responses. PLS with conclusions gave similar results to abstracts with and without conclusions. Removing the conclusions from a PLS with uncertain findings led to more problems with interpretation. Electronic supplementary material The online version of this article (doi:10.1186/s13643-016-0214-8) contains supplementary material, which is available to authorized users.


Background
Although systematic reviews are widely accepted as a reliable source of evidence for health care, there are many challenges in overcoming barriers to their use in healthcare decision-making [1,2]. Related to this, systematic reviews often have to summarise findings from large and complex fields of research. The sheer volume of research alone requires good evidence summary versions to ensure that the key messages are communicated effectively and efficiently to the reader [3]. The Cochrane Library provides a collection of full-text systematic reviews developed using rigorous reporting standards and methods [4]. Each review has a plain language summary and a structured abstract, which includes a section for the authors' conclusions. Significant efforts have been made to improve accessibility and readability of Cochrane Reviews to a broad group of readers including practitioners, policymakers and healthcare consumers [5][6][7].
A key aspect of improving access to knowledge is to ensure not only that the content of the resource is appropriate but also that the format in which it is presented is fit for purpose. A range of evidence summary formats now exist that build on the recognised need for quick and easy access to evidence and have been categorised by Haynes into a pyramid of six 'S's [8]. Structured abstracts are acknowledged widely as the preferred summary format in health professional journals [9] and were developed to assist readers in retrieving, selecting and critically appraising relevant literature at a glance [10,11]. Plain language summaries (PLS) have evolved to summarise Cochrane Reviews in non-technical language that will be read and understood by a broad readership, including consumers of health care. As with abstracts, PLS will often be read as stand-alone documents [12]. They are increasingly used as a key means of disseminating evidence to different audiences including healthcare practitioners, consumers, journalists, policymakers and researchers [5]. Because evidence summaries such as abstracts and PLS are often the first, and sometimes the only, point of contact for research studies and systematic reviews, it important to identify how helpful they are to potential users, such as health professionals in training and practice.
Consideration has been given to how to communicate the benefits and harms of treatment [13] and on how we best present evidence [7,14]. For example, which evidence summary format, if any, is most open to erroneous interpretation needs to be considered. Recent studies have shown variability in the interpretation of results [7,15] and have highlighted the importance of authors' conclusions in interpreting summary statements. Lai et al. [16] conducted a cross-sectional study with hospital practitioners and medical students in which participants were shown four Cochrane Reviews without the authors' conclusions. The findings demonstrated that the majority of participants could not generate appropriate conclusions from these review abstracts, in the absence of the authors' conclusions.
It is also important to explore the impact of other influencing factors, such as prior knowledge, prior belief and complexity of the data being presented. Only 47 % of participants in Lai et al.'s study reported changing inappropriate prior belief about the intervention to something appropriate after reading the systematic review abstract [16]. Research in cognitive psychology suggests that we have many cognitive biases, i.e. thinking patterns based on observations and generalisations that can impact on our judgement [17,18].
Different professional groups have expressed varying degrees of confidence with using evidence resources to inform their practice. For example , Lai et al. [19] found that doctors were more confident and had a more positive perception of evidence-based practice (EBP) than nurses and allied health professionals. However, confidence and self-assessed knowledge is not a proxy for skills [20]. Professional education provides an early and important opportunity to embed the core skills and concepts of EBP into all aspects of healthcare training. While EBP is widely taught in higher education, concerns have been expressed as to how much the student is learning and how they are applying the knowledge to their broader understanding of policy and practice [21].
The aim of this randomised trial was to see how far the interpretation of systematic review summary formats by midwifery students agreed with expert consensus. The objectives were to determine whether there was a difference in interpretation between the use of structured scientific abstracts and non-technical PLS and, secondly, to determine whether the presentation of authors' conclusions was needed in order for the primary outcome of the reviews to be correctly interpreted.

Study design
A randomised trial was conducted using a 2 × 2 factorial design (abstract versus PLS, conclusions versus no conclusions). Evidence summaries were randomly allocated to participants using block randomization. The trial was administered via a paper survey to midwifery students at a set time before or after a lecture. A complete set of questionnaires were collated, placed in random order (in blocks of 12) and sequentially numbered before being distributed to the nine participating universities. This ensured that each student was randomly allocated to complete a questionnaire which included one of the four summary formats (abstracts versus plain language summary, with or without conclusions) for each of two selected Cochrane Reviews (one with clear findings and the other with uncertain findings).
The data collection process and questionnaire content were piloted with 118 students at Queen's University Belfast as part of a survey exploring use and attitudes towards evidence resources, including the Cochrane Library. As part of that survey, we asked them to read abstracts or PLS of Cochrane Reviews and we assessed their answers about these reviews. We excluded these pilot survey participants from the main Belfast Reviews of Interventions: Evidence Format trial (BRIEF trial) to avoid unblinding any aspect of the conduct or results of the main trial.
The students were asked to complete the first three sections (student details, use of evidence resources and prior knowledge of the topics of the Cochrane Review that were being studied) of the questionnaire before commencing Section 4, which contained the abstracts or PLS of the two Cochrane Reviews. The students were asked to indicate which of the six statements (shown below) best represented the findings of each review.

Participants and setting
Midwifery students attending one of nine universities in the UK and Ireland with recognised midwifery training programmes were recruited to the trial, between January 2013 and February 2014. Midwifery students registered for 18-month pre-registration course, 3-or 4-year preregistration course, post-registration and postgraduate modules were eligible to participate in the study. It was possible that some of the lectures would have other health professionals in attendance in addition to midwives, but the questionnaire was constructed to identify any participants who were not midwives and they were excluded from the analysis.

Ethics, consent and permissions
Ethical approval was obtained from the School of Nursing and Midwifery Ethics Committee at Queen's University Belfast (and local approval to conduct the study was also obtained from the ethics committee of each participating nursing department (Bangor University, Bournemouth University, Edinburgh University, National University of Ireland Galway, Swansea University, Trinity College Dublin, University of South Wales). The protocol as approved by the ethics committee can be found in the Additional file 1. The collaborator at each site coordinated the consent and questionnaire distribution in a classroom setting.

Randomisation
Based on the approximate number of eligible students, the random number function in Microsoft Excel was used to place the numbers 1 to 1000 in random order using random blocks of 12 for the four evidence summary formats: format A (abstracts with conclusion), format B (abstracts without conclusions), format C (plain language summaries with conclusions) and format D (plain language summaries without conclusions). Random allocation was in blocks in order to keep the sizes of the groups similar, and the blocks of 12 were used to preserve unpredictability as smaller groups sizes were not anticipated. The random number tables were generated by MC. The questionnaires were ordered in accordance with this random sequence (by FA) and then given a sequential number to ensure distribution in the appropriate sequence. A record which linked the random order to the sequence number was kept centrally so the questionnaires could be tracked. Each centre provided details on the maximum number of students available to participate, and a batch of questionnaires was then sent starting with the next available sequential number.
The students were emailed or given an information leaflet about the study during the week before the questionnaire was administered. On the day of data collection, all participants were advised again that participation was voluntary and that all answers were anonymous. Written consent was obtained from each student, and then the summary formats were presented to the students in the paper questionnaire. Questionnaires were distributed in order from the top of the collated pile of randomised questionnaires, working consecutively along each row of the students in the class. The questionnaire number was for randomisation only and was not linked to student consent or any other data. The students were asked to complete the questionnaires independently, and questionnaire completion was supervised by the researcher or class teacher at each site.

Study interventions
Students were randomised to receive one of the four evidence summary formats for two Cochrane Reviews in their questionnaire. These Cochrane Reviews had been purposively selected by the study team, one where the effects of the intervention were conclusive (from here on referred to as clear finding reviews) and one where the effects of the intervention were uncertain (as assessed by an expert panel). We considered the strength of the conclusions and the direction and size of effects as reported in the results to determine which reviews to use and identified a shortlist of 10 Cochrane Reviews covering topics of interest to midwives, with abstracts and PLS with similar levels of content. Two reviews related to maternity care were identified through student response to the pilot survey, discussion and consensus of the core study team [22,23]. Each of these reviews provided similar concluding statements in the abstracts and PLS. The reviews also provided numbers of trials, numbers of women and summary statements of the main findings in both summary formats. The main differences in these examples were the lack of structure in the PLS in comparison to the abstract, no methodological information in the PLS and the reporting of relative risks and confidence intervals for a number of outcomes in the abstract. GRADE statements or summary of findings tables were not presented with either format. The core study team also agreed on the key finding for each review, based on the information in the abstract and PLS, which we regarded as the appropriate response (primary outcome) for the selected reviews. The team also created five other responses for each review so that students could be presented with multiple choices as outlined below. The questionnaires used for each intervention group can be found at http://www.qub.ac.uk/ schools/SchoolofNursingandMidwifery/Research/ ResearchInPractice/Projects/BriefTrial/.

Primary outcome measure
The primary outcome measure for BRIEF was the proportion of students who identified the appropriate response to the review summary. The response categories were piloted and refined with a group of midwifery students to ensure clarity and meaningfulness of each category.
Responses to the summary were given using the following categories, which are based on the responses used by Lai et al. [16]: A. In general, [the intervention] is clearly beneficial. B. In general, [the intervention] is clearly not beneficial. C. In general, [the intervention] appears to be beneficial from limited evidence, more studies are needed to confirm the findings. D. In general, [the intervention] appears to be non-beneficial from limited evidence, more studies are needed to confirm the findings. E. There is insufficient evidence to comment on whether [the intervention] is, or is not, beneficial. F. I do not understand the results presented.
Identification of the appropriate response to act as the reference standard was based on a validation exercise conducted with a panel comprising four members of the core study team who are experienced reviewers and two independent experienced reviewers. The panel aimed to achieve consensus about the benefits and harms of the interventions as relayed in the abstracts and PLS and to arrive at an agreed appropriate response from the options in the questionnaire. The panel members were allocated one of the two sets of summary versions to assess. Each member highlighted the key text used to arrive at their response, and their decisions were based solely on the information provided. Additional comments were also noted and, where there was a lack of agreement, the panel discussed the different responses and arrived at a consensus response.

Sample size calculation
As good data were not available on which to base the sample size calculation, we chose the value that would require the maximum sample size, i.e. that the lowest proportion of students that would identify the appropriate response would be 50 % for any of the types of summary. Then, with 80 % power, we could detect an absolute improvement of 10 % or more, with a sample size of 400 in each group. That is, we would be able to detect a statistically significant odds ratio of 1.50 or above for providing the appropriate answer in either of the main comparisons (abstract versus PLS, conclusions versus no conclusions) with a total of 800 participants in the study.

Analysis
The primary outcome was the participant's answer to the question about the main result of the review. This was dichotomised as 'appropriate' and 'other', with a sensitivity analysis that further divided 'other' into 'near miss' and 'incorrect' (see below).
Logistic regression was used for the primary analysis of the trial, based on the primary outcome of whether or not each participant selected the appropriate response to the review summary. First, the main effects of each participant's randomised group were fitted (i.e. abstract versus PLS provided, and conclusions versus no conclusions provided). This method was then repeated, adjusting for potential confounders for choosing the appropriate answer, including (i) prior knowledge of the review, (ii) prior belief in the worth of the treatment, (iii) age group (< 20, 20-29, 30-39, 40+ years), (iv) course year, (v) type of midwifery course and (vi) university.
Sensitivity analyses were performed in two ways, firstly, categorising the primary outcome to 'appropriate', 'near miss' and 'other' and secondly, categorising it as 'appropriate', 'near miss', 'incorrect' and 'other'. Ordinal logistic regression was used for these analyses.

Results
Eight hundred thirteen midwifery students participated in the trial, representing 89 % of students who were enrolled on a midwifery course and therefore eligible to participate in the nine universities. Specific reasons for non-participation were not documented, and we do not have additional information on the students who did not take part. Figure 1 shows the flow diagram of the midwifery students who participated in the trial. The vast majority of the students were women, between 20 and 29 years old, and 74 % were attending a 3-year preregistration midwifery degree course. There was a good spread of participants across the 3 years of these courses. Two universities had a 4-year course, reflected in a small number of fourth-year students. Participant characteristics were similar across the four intervention groups (Table 1) Results of the adjusted analyses are shown in Tables 3  and 4. University was not a significant predictor of   (11) Pre-registration 18 months 15 (7) 18 (9) 18 (9) 18 (9) Post-registration course 13 (6) 16 (8) 17 (9) 17 (8) choosing the appropriate answer and has been omitted from the tables.   . There was a weak evidence that students aged less than 20 years were less likely to choose the appropriate answer (Table 4). All groups struggled with interpreting the evidence. The percentage of students identifying the appropriate response ranged from 32.8 % in the uncertain finding review presented as a PLS without conclusions through to 64.2 % in the clear finding review presented as a PLS with conclusions. The test for interaction between the two main effects was not statistically significant for either review (p = 0.39 for clear finding review, p = 0.47 for uncertain finding review), with the interaction ratio being 1.00 in both cases, demonstrating neither a synergistic nor antagonistic effect of the two interventions and, thereby, supporting assumptions underpinning use of the 2 × 2 factorial design [24].
After adjustment for confounders, the odds ratios for the main effects changed very little. Sensitivity analyses, using three and four categories of answer, rather than two, did not change the results meaningfully and are not shown.

Discussion
Overall, the proportion of appropriate answers was low. There was a small effect of having conclusions on identifying the appropriate answer for the uncertain finding review. Prior knowledge and belief predicted the response to the clear finding review but not to the uncertain finding review, and course year predicted the percentage of students giving an appropriate answer to the uncertain finding review. However, these subgroup analyses should be interpreted with caution, due to multiplicity [25].
Other studies report poor understanding of evidence presented in review summaries. Lai et al. [16] found that only 30 % of medical students and hospital practitioners who had to derive their own conclusion for Cochrane abstracts without conclusions chose the most appropriate conclusions, using similar response categories to those used in this study. The participants also performed most  [14] showed that a new Cochrane PLS format was preferred by members of the public compared to the current version, as used in this study. The new format increased overall understanding of participants from 18 % with the current Cochrane PLS format to 53 % with the new format. While this is encouraging, even this proportion is lower than desired and more work is needed to overcome barriers that still exist to understanding the information presented. The importance of conclusions, particularly when the results are unclear, highlights the careful consideration required to produce judicious statements that will not unduly influence the reader. There is a need for the use of clear language to communicate size and direction of effect and quality of evidence in reporting the results throughout the text of the review. Zhelev et al. [7] conducted a qualitative study with 21 clinicians, researchers and policymakers using Cochrane diagnostic test accuracy reviews and found that readers reported them as largely inaccessible with further problems arising from poor layout and presentation. Based on their findings, Zhelev et al. recommended clearer presentation of definitions, the use of more accessible formats for numerical results, careful wording of conclusions and an emphasis on the limitations of the results. Santesso et al. [14] found that understanding of the effects of the intervention and the quality of evidence was greater when communicated in qualitative statements as well as in numbers and symbols, with more people obtaining the correct answers when using a standardised language and presentation of key data in a table.
In our BRIEF trial, prior knowledge and prior belief predicted student response on the clear finding review but not the uncertain finding review. Current psychological research demonstrates that beliefs and heuristics impact on our decision-making [25]. Heuristics are simple, unconscious procedures that people use in everyday life to help them find adequate, though often imperfect, answers to difficult questions [17]. A number of heuristics and biases may have influenced decision-making in our study. For example, interpretation of evidence may be influenced by anchoring, which is the tendency to rely too heavily on one piece of information when making decisions. This is usually the first piece of information acquired on that subject but could be the concluding statement. Also, there is a tendency to search for, interpret, focus on and remember information in a way that confirms one's preconceptions (confirmation bias). Cognitive bias may be easier to correct in some contexts rather than others. For example, Lai et al. [16] found participants were more likely to change their beliefs after reading an abstract if the abstract reported findings of benefit compared to when the underlying review showed no benefit. Future research should consider heuristics and cognitive biases, to see what part they play in the process of interpreting and using evidence summaries and how they may be used to improve the reader's understanding of the evidence being presented.
Our study was conducted with midwifery students, the majority of whom were studying or had studied evidencebased practice as part of their pre-registration training. Reassuringly, educational experience appears to predict a better response when the findings of the review were uncertain, which suggests that current education is of benefit, when faced with challenging information. However, based on this and other studies we have a long way to go to provide summary evidence that enhances the understanding of a large proportion of evidence users. This is a challenge for evidence providers and educationalists alike. Decades of educational research has led to the suggestion that we need a paradigm shift to outcomebased education [26] which is an active, learner-centred and results-oriented approach to learning [27]. This approach would provide an opportunity to reflect on how we are teaching and assessing systematic reviewing, to explore what evidence is being used in teaching practice and how it is being used by teachers and students [21,28]. An important additional consideration when training health professionals is linking application of evidence to practical assessments which often drives student-based learning. Consideration should be given to how to introduce a higher level of application of appropriate evidence-based practice into practical assessments such as objective structured clinical examinations (OSCEs), virtual clinical simulation and clinical practice with patients [21]. More in-depth qualitative research could provide important insights into our results and provide valuable information on how we can support evidence users to effectively use the wealth of available evidence both in education and practice [7].
BRIEF was a large, randomised trial conducted in nine universities throughout the UK and Ireland. The method of data collection, randomisation process and survey responses were piloted in one university in advance of the main study. We conducted a validation exercise to identify the appropriate response for the two reviews. We did not assume that our expert panel had the correct answer rather we assessed the information presented and identified the appropriate response from the material provided to the student. Eleven percent of the eligible students did not participate, and we do not know if these students were different from those who participated. Despite the sample size, we did not have enough power to adequately explore an interaction between summary format and providing conclusions. Therefore, while no interaction was found, we cannot assess if an interaction was the cause of the poorest performing group (PLS without conclusions). Also, the clear finding review was updated in the later stages of data collection, which may have impacted on the students' response through recent exposure. However, although this was a high-profile midwifery review, the appropriate response for its overall findings did not change between the version in BRIEF and the updated version.

Conclusions
In summary, the BRIEF randomised trial evaluated how midwifery students interpreted the findings from Cochrane Reviews when the evidence was presented in two different summary evidence formats, abstract and PLS, with and without the provision of the authors' conclusion. Abstracts with and without conclusions generated similar student responses. PLS with conclusions were similar to abstracts. Removing the conclusion from a PLS with uncertain findings created more problems with interpretation. Prior knowledge and belief appeared to predict the response to the clear finding review but not the uncertain finding review, and course year appeared to predict the percentage of students giving an appropriate answer to the uncertain finding review. Future research should take into consideration what factors, in addition to format, might influence the interpretation of evidence summaries.

Competing interests
Fiona Alderdice, Mike Clarke, Toby Lasserson, Declan Devane, Elaine Beller, Vanora Hundley, Jane Noyes are all authors of Cochrane Reviews. Toby Lasserson is a Cochrane Review author, editor with the Cochrane Airways Group and is an employee of Cochrane as a senior editor with the Cochrane Editorial Unit, part of the Cochrane Executive Team.
Authors' contributions FA, JM, MC, TL, EB were involved in the design, conducting, analysis and write up of the BRIEF trial. MC, VH, JS, DD, JN, SK, SN, JW-D were all involved in data collection, editing and approving a final version of the paper. All authors read and approved the final manuscript.