Table 1 Summary of the steps and key messages of the proposed and the previous guideline of Li et al. [3]

From: Data extraction and comparison for complex systematic reviews: a step-by-step guideline and an implementation example using open-source software

Proposed guideline

Previous guideline of Li et al. [3]

Step 1: Determine data items

• Identify the objective of the systematic review.

• Identify the data items that are relevant to the research questions.

• Use previous relevant reviews and eligible articles as a guide.

• Determine how bias assessment data will be captured.

Step 1: Develop outlines of tables and figures

• Develop outlines of the tables and figures that will appear in the SR beforehand.

Step 2: Group data items into distinct entities

• Identify the hierarchical data structure.

• Group the data items according to their level in the hierarchy.

• Ensure that the entities are organised hierarchically, with the top-most entity capturing the data that only occur once in the article.

Step 2: Assemble and group data elements

• Important characteristics that would modify the treatment effect or the association of interest should be collected.

• Group data elements in the order in which they are usually found in study reports (e.g. starting with reference information, followed by eligibility criteria, intervention description, statistical methods, baseline characteristics and results).
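The grouping described in Step 2 can be sketched as a small Python structure; the entity and field names below are illustrative, not taken from either guideline:

```python
# Hypothetical grouping of data items into hierarchical entities.
# The top-most entity ("study") holds data that occur only once per
# article; lower entities ("arm", "outcome") may repeat many times.
entities = {
    "study": ["study_id", "first_author", "year", "country"],
    "arm": ["arm_id", "study_id", "intervention", "sample_size"],
    "outcome": ["outcome_id", "arm_id", "outcome_name", "timepoint", "mean", "sd"],
}

# Each lower-level entity carries the key of its parent, so the
# hierarchy study -> arm -> outcome can be reassembled later.
for name, fields in entities.items():
    print(f"{name}: {len(fields)} data items")
```

Listing the parent key inside each child entity is what later makes the 1:m relationships of Step 3 possible.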

Step 3: Specify the relationship among entities

• Specify appropriate relationships among the entities.

• When a data item at a higher-level entity is expected to correspond to many data items down the hierarchy, a 1:m relationship would best fit.

• Determine the data items that will be used as primary and foreign keys.

• Construct an ER diagram.

NE
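The 1:m relationships and primary/foreign keys of Step 3 can be sketched as a minimal relational schema; this uses Python's built-in `sqlite3`, and the table and column names are illustrative assumptions:

```python
import sqlite3

# Minimal sketch of a 1:m relationship: one study-level row corresponds
# to many outcome-level rows, linked by a primary/foreign key pair.
con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # enforce the FK constraint
con.execute("""
    CREATE TABLE study (
        study_id INTEGER PRIMARY KEY,
        first_author TEXT NOT NULL,
        year INTEGER
    )
""")
con.execute("""
    CREATE TABLE outcome (
        outcome_id INTEGER PRIMARY KEY,
        study_id INTEGER NOT NULL REFERENCES study(study_id),
        outcome_name TEXT,
        mean REAL,
        sd REAL
    )
""")
con.execute("INSERT INTO study VALUES (1, 'Smith', 2020)")
con.execute("INSERT INTO outcome VALUES (1, 1, 'pain score', 3.2, 1.1)")
con.execute("INSERT INTO outcome VALUES (2, 1, 'pain score', 2.9, 1.0)")

# Joining on the key reattaches study-level data to every outcome row.
rows = con.execute(
    "SELECT s.first_author, o.outcome_name "
    "FROM study s JOIN outcome o ON s.study_id = o.study_id"
).fetchall()
print(rows)
```

The `REFERENCES` clause is the relational counterpart of a line in an ER diagram: each outcome row must point at exactly one existing study row.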

Step 4: Develop a data dictionary

• In addition to the key messages reported in Steps 3 and 4 of the Li et al. [3] guideline, define variable names, labels, types, formats, lengths and other special requirements as needed.

• Name the variables in a consistent way, so they can be easily recognised and used in statistical software.

• The variables should be listed in the same order as they would appear in the data entry forms.

Step 3: Identify the optimal way of framing the data abstraction item

• Ask closed-ended questions as much as possible.

• Avoid asking a question in a way that the response may be left blank. Include ‘not applicable’, ‘not reported’ and ‘cannot tell’ options as needed.

• Open-ended questions are useful when it is not possible to anticipate the different responses that may be given or when it is necessary to avoid leading the data abstractors by indicating permissible replies.

• Remember that the form will focus on what is reported in the article rather than what has been done in the study.

• Ask 1 question at a time to avoid confusion.

• When a judgement is required, record the raw data (i.e. quote directly from the source document) used to make the judgement.

• Record the data as provided in the source document to minimise the mathematical manipulations required during DE.
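A data dictionary in the sense of Step 4 can be sketched as one entry per variable; the variable names, labels and attributes below are hypothetical examples:

```python
# Minimal data dictionary sketch: one entry per variable, listed in the
# same order as on the data entry form (all names are illustrative).
data_dictionary = [
    {"name": "study_id",     "label": "Study identifier",        "type": "integer", "length": 8},
    {"name": "first_author", "label": "First author surname",    "type": "text",    "length": 50},
    {"name": "sample_size",  "label": "Randomised participants", "type": "integer", "length": 6},
]

# Consistent naming (lowercase, no spaces) keeps variables directly
# usable in statistical software without renaming.
for entry in data_dictionary:
    assert entry["name"] == entry["name"].lower()
    assert " " not in entry["name"]
print(f"{len(data_dictionary)} variables defined")
```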

Step 5: Create data entry forms

• The order of the forms and data entry fields needs to closely follow the reporting flow of the information in the articles.

• Use quality control checks, such as value range, field type, and logic checks, whenever applicable.

Step 4: Develop data abstraction forms

• Develop data abstraction forms using word processing software to serve as a guide for creating an electronic data abstraction form and a codebook.

• Definitions and instructions helpful for answering a question should appear next to the question to improve quality and consistency across data abstractors.

• Quality control checks are covered later, in Step 7.
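The three quality control checks named in Step 5 (value range, field type, and logic checks) can be sketched for a single hypothetical record; the field names and limits are assumptions for illustration:

```python
# Sketch of value range, field type, and logic checks applied to one
# extracted record (field names and plausible limits are illustrative).
def check_record(record):
    errors = []
    # Field type check: sample size must be an integer.
    if not isinstance(record.get("sample_size"), int):
        errors.append("sample_size must be an integer")
    # Value range check: publication year within a plausible window.
    if not (1900 <= record.get("year", 0) <= 2030):
        errors.append("year out of range")
    # Logic check: number analysed cannot exceed number randomised.
    if record.get("n_analysed", 0) > record.get("sample_size", 0):
        errors.append("n_analysed exceeds sample_size")
    return errors

print(check_record({"sample_size": 100, "year": 2015, "n_analysed": 120}))
```

In practice such checks would run at data entry time, so problems are flagged while the source article is still at hand.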

Step 6: Set up the database

• Review the software’s manual or user guide to build and connect the database tables.

NE

Step 7: Pilot the DE tool

• Take a purposive sample of studies with results reported in different ways.

• Check for difficulties such as (1) forms are not working properly; (2) improper storage of the data; (3) omission of the logic or range checks; (4) incorrect labelling of variables or dropdown menu categories; and (5) missing relevant data items.

• Reviewers, statisticians and content experts should be engaged in the piloting of the tool.

Step 5: Set up and pilot-test data abstraction forms in the SRDR

• Develop a user manual with instructions, coding conventions, and definitions specific to the project.

• Testing the DE tool should involve several persons abstracting data from at least 3 articles.

Step 8: Documentation and reviewer training

• Develop a comprehensive manual with detailed instructions on filling in the data fields and navigating among forms.

• Supplement the manual with practical examples to help reviewers understand the data items and extract reliable data.

• The articles used in training should be selected to show a variety of data reporting.

• The entire review team, including data extractors, clinicians, and methodologists, needs to be involved in this step.

• Each data item should be thoroughly described.

Step 6: Train data abstractors

• Training should include modules to familiarise the review team with the data system and data abstraction form.

• Complete the general SRDR training modules.

• Data abstractors should have a basic understanding of the clinical issues surrounding the topic, study design, analysis, and statistics.

• Pay attention to details while following the instructions on the forms and the user manual.

• Training sessions should take place at the project onset and intermittently over the course of the project.

The key messages reported in Step 7 of Li et al. [3] are distributed across different steps of the proposed guideline; for instance, double data extraction and comparison are covered in Step 10, and logic checks in Step 5.

Step 7: Implement a quality assurance and control plan and monitor the progress

• We recommend having 2 data abstractors who work independently to collect data on the SRDR.

• Use the Data Comparison Tool in the SRDR.

• Create logic checks.

• Monitor the timeliness of data abstraction and progress.

Step 9: Data export and compilation

• Depending on the data structure, data processing (e.g. merging and concatenation) is needed to assemble the separate datasets exported from the database into a single dataset.

Step 8: Export and clean the data for analysis

• Only a specific subset of the data can be exported from the SRDR.

• Each worksheet contains data collected from 1 tab in the SRDR.

• Data can be imported into statistical software for processing and analysis.
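The merging described in Steps 8 and 9 can be sketched without any external library: study-level rows are attached to outcome-level rows via the shared key, a 1:m merge (in practice a tool such as `pandas.merge` does the same job). The data below are made up for illustration:

```python
# Sketch of compiling separately exported datasets into one analysis
# dataset: study-level data are merged onto each outcome-level row
# through the shared key study_id (illustrative data).
studies = [{"study_id": 1, "first_author": "Smith", "year": 2020}]
outcomes = [
    {"study_id": 1, "outcome_name": "pain score", "mean": 3.2},
    {"study_id": 1, "outcome_name": "pain score", "mean": 2.9},
]

# Index the "one" side by its primary key, then join each "many" row.
study_by_id = {s["study_id"]: s for s in studies}
analysis = [{**study_by_id[o["study_id"]], **o} for o in outcomes]
print(len(analysis))  # each outcome row now carries its study-level data
```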

Step 10: Data comparison and adjudication

• Split the dataset into subsets of variables, depending on the hierarchical level at which they were recorded.

• More frequent comparisons and adjudications are better than waiting until data have been extracted from all studies.

• A tool for data comparison implemented in the SRDR software was described in Step 7.
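The comparison step can be sketched as flagging every field on which two independent abstractors disagree, so that only discrepant fields need adjudication; the records below are hypothetical:

```python
# Minimal sketch of comparing double data extraction: fields where the
# two abstractors disagree are flagged for adjudication (illustrative data).
extractor_a = {"study_id": 1, "sample_size": 120, "mean_age": 54.2}
extractor_b = {"study_id": 1, "sample_size": 102, "mean_age": 54.2}

discrepancies = {
    field: (extractor_a[field], extractor_b[field])
    for field in extractor_a
    if extractor_a[field] != extractor_b[field]
}
print(discrepancies)  # {'sample_size': (120, 102)}
```

Running such a comparison after every batch of studies, rather than once at the end, matches the guideline's advice that frequent comparisons and adjudications are preferable.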

  1. NE: a corresponding step does not exist