Predictive models of diabetes complications: protocol for a scoping review

Background: Diabetes is a highly prevalent chronic disease that places a large burden on individuals and health care systems. Models predicting the risk (also called predictive models) of other conditions often compare people with and without diabetes, which is of little to no relevance for people already living with diabetes (called patients). This review aims to identify and synthesize findings from existing predictive models of physical and mental health diabetes-related conditions.
Methods: We will use the scoping review frameworks developed by the Joanna Briggs Institute and Levac and colleagues. We will perform a comprehensive search for studies from Ovid MEDLINE and Embase databases. Studies involving patients with prediabetes and all types of diabetes will be considered, regardless of age and gender. We will limit the search to studies published between 2000 and 2018. There will be no restriction of studies based on country or publication language. Abstracts, full-text screening, and data extraction will be done independently by two individuals. Data abstraction will be conducted using a standard methodology. We will undertake a narrative synthesis of findings while considering the quality of the selected models according to validated and well-recognized tools and reporting standards.
Discussion: Predictive models are increasingly being recommended for risk assessment in treatment decision-making and clinical guidelines. This scoping review will provide an overview of existing predictive models of diabetes complications and how to apply them. By presenting people at higher risk of specific complications, this overview may help to enhance shared decision-making and preventive strategies concerning diabetes complications. Our anticipated limitation is potentially missing models because we will not search grey literature.


Background
The World Health Organization identifies diabetes as one of the four priority non-communicable conditions [1]. In 2017, more than 693 million people were affected by diabetes worldwide and projections point to a sustained rise in its prevalence in the next decades [1]. The burden of diabetes on individuals and health care systems is primarily attributed to complications from diabetes including macrovascular complications (e.g., heart attack, stroke) or microvascular complications (e.g., blindness, amputation, renal failure) [1,2]. Early identification of people with diabetes at increased risk of complications is an important challenge for clinicians [3]. Models predicting the risk (also called predictive models) of diabetes complications can facilitate the identification of people at higher risk and inform health decision-making regarding preventive actions or treatments to avoid or delay complications [4].
Models that assess the risk of developing diabetes or that use it as a predictor variable for other outcomes are not informative for someone who is already living with diabetes (i.e., patient) [5,6]. Similarly, predictive models of other conditions in people with diabetes often compare people with and without diabetes, which is of little to no relevance for patients [7,8,9]. A preliminary search for reviews on the topic was conducted in two databases (MEDLINE, Embase), and results suggest that existing reviews of predictive models of diabetes-related complications focus mostly on macrovascular complications [10,11] and rarely on the range of other diabetes complications [4,12]. This scoping review will contribute to filling these gaps.
We aim to identify and synthesize existing predictive models of physical and mental health conditions associated with diabetes, in people with prediabetes and any type of diabetes mellitus (hereafter called "patients"). Our objective is to describe the features of selected validated predictive models for risk of diabetes complications.

Methods/design
In this scoping review, we will use well-established scoping review methods, namely the framework developed by the Joanna Briggs Institute [13,14] and Levac and colleagues [15], while paying attention to the methodological limitations of original studies as often recommended in systematic reviews [16]. In some epidemiological contexts, such as the one we are focusing on, it is important to assess the quality of studies even if doing so does not add to the methodological strength of the scoping review itself. For example, in an ongoing scoping review, authors aimed to assess the number of validated prediction rules that exist for spinal cord injury management and to provide evidence of the psychometric properties of these prediction rules, especially with regard to their clinical impact [17]. Although their scoping approach does not aim to assess the overall effectiveness of these prediction rules in their respective settings, their systematic appraisal of data quality will help readers make informed use of their findings. In another ongoing study, authors aimed to "produce a scoping review which in its data analysis will draw on methods typically associated with qualitative systematic reviews" and acknowledged that the diversity of data "presents a potential challenge from the perspective of interrater reliability and consistency in analysis" [18].
To include a diversity of perspectives and ensure that our review focuses on diabetes complications that are relevant to patients [19], our research team includes researchers (RN, IF, GN, HW) and stakeholders such as clinicians (CF, BS, CY, NI, SS) and patients with type 1 and type 2 diabetes (DG, DA, HW). Stakeholders were involved in this study as collaborators and co-authors, not participants. Patients in our research team (hereafter called Expert Patients) were recruited through Diabetes Action Canada (DAC), a national Patient-Oriented Research Network that includes patients to bring expertise in diabetes care [20]. Expert Patients were recruited to DAC through professional and personal networks and community-based organizations and from respondents to a national survey [21]. Using a patient-centered approach, the team co-developed the protocol. We integrated patients' priorities by developing our research questions, search strategy terms, and outcome measures based on what Expert Patients shared concerning what matters to them, and also by building on findings of a recent patient-centered study [21]. Expert Patients (DG, SD) will be involved in each step of the research process, including the definition of the objective, the main analysis, the preliminary and final results, and the discussion. We will discuss preliminary and final results with a broader committee of six to ten Expert Patients. We will use the services of two information specialists to validate our search strategy and selection criteria at least twice before the end of this review.

Eligibility criteria

Population
The population targeted by this scoping review consists of people of all ages, genders, and ethnicities affected by diabetes. We will consider prediabetes and any type of diabetes, including type 1 and type 2 diabetes [22], and data that have been collected at the individual level, not the group level [23]. We will consider both treated and nontreated individuals. Studies mixing people with and without diabetes will not be considered, unless they performed separate stratified analyses for individuals with diabetes and without diabetes. Studies of pregnant women and/or gestational diabetes will be excluded because gestational diabetes is a different clinical condition. Studies that are restricted to people who do not have diabetes will not be considered. Models based on the Framingham Risk Score of cardiovascular conditions will not be considered as this score was originally derived from a general population free of diabetes [24]. Studies involving people not meeting our eligibility criteria will be excluded.

Concept
We will consider both clinically diagnosed and self-reported physical and mental conditions experienced by patients as a consequence of living with diabetes. Studies focusing on social or economic consequences of diabetes will not be included in this review, because findings are likely to be highly dependent on country of residence and health insurance status and thus are unlikely to be modifiable at the individual level. We plan to sort models by diabetes type and by group (or sub-group) of diabetes complications: physical (e.g., macrovascular and microvascular conditions) and mental (e.g., depression and anxiety) health problems. Death from all causes and death from non-diabetes complications will be analyzed separately. With the collaboration of Expert Patients and researchers, we drafted a preliminary and non-exhaustive list of diabetes complications that were relevant for patients (Table 1).

Context
(1) We will consider evidence coming from all countries and settings and published between 2000 and 2018. We will not consider articles prior to 2000 because both diabetes treatment and modeling approaches have greatly improved in the last two decades. The date of publication will not be included in the search strategy. Rather, we will simply order the results by date of publication and will not consider those outside the period 2000-2018.
(2) We will include only full-text peer-reviewed published studies with original results as they are expected to exhibit high-quality models and detailed methodology. For this reason, we will not consider abstracts only or duplicates and do not intend to search the grey literature.
(3) No language restrictions will be applied. During the full-text screening, potentially relevant articles written in a language other than English or French will be translated by a member of our team when possible. If we do not have anyone with expertise in that language, we will first use free translation tools (e.g., Google Translate, DeepL) to determine if the publication is likely to meet our inclusion criteria, and if so, we will engage professional translation services.
(4) We will only consider studies with a longitudinal design and quantitative data. Specifically, we will consider prospective cohort studies and nested case-control studies [25]. We will not apply restrictions as to the length of follow-up as the time may vary for diverse reasons. Screening tools/studies, retrospective case-control studies, and cross-sectional studies will not be considered. Focusing on predictive models implies that we will not consider explicative ones, that is, those evaluating factors associated with diabetes complications as potential determinants or confounders rather than predictors. We will consider diverse candidate/potential predictors of diabetes complications, including personal characteristics, socioeconomic factors, clinical factors, and environmental factors.
(5) We will focus only on prognostic models and not include diagnostic models in this review. We will consider both development and validation studies, as some studies presenting predictive models are focused on derivation and internal validation and others on external validation. The sample size for model validation can come from the same study population, from another study population, or from both. We will exclude partial and full predictive models that were not validated, either internally or externally.

Search strategy and information source
Our diverse team co-built the search strategy of this scoping review. A predefined list of potential predictors and complications [4] was established in collaboration with six Expert Patients who were not members of our research team in order to better capture what matters to diverse patients. This list will be used as a starting point for study selection and will be revised during the full-text screening process (Table 2). The search strategy will combine groups of keywords customized to each database (i.e., MeSH terms where appropriate) pertaining to (1) population (treated and untreated patients affected by prediabetes and diabetes), (2) concept (diabetes complications, potential predictors), and (3) context (prediction modeling features). Prediction models seldom report the individual predictors included in the final model as the central message is about accuracy (discrimination and calibration). However, knowing which, how, and what candidate predictors have been assessed can help explore potential bias (e.g., selection bias) in data that may, in turn, influence the features of predictive models [26,27]. For this reason, we will add potential predictors in our search strategy. Search terms are selected to capture international terminology. We intend to run a search at the start and again just before final data extraction to identify studies published after our baseline search date and before we write the article for possible inclusion in our review. As mentioned in eligibility criteria, there will be no restrictions in terms of date, language, age, or design. We will search for eligible studies in two electronic scientific databases: Ovid MEDLINE and Embase. In addition, we will perform snowballing of reference lists of selected papers at the full-text screening stage [28]. To complement these sources, we will contact experts in the field to ask if they know about any published work we may have missed. 
We tested our search strategy for MEDLINE (Ovid) in June 2018 and for Embase in October 2018 (see Appendixes 1, 2, and 3). We had the search appraised by a second librarian using PRESS in October 2018 [29].

Data management
The detailed references and abstracts identified will be pooled in EndNote, a reference management software [30]. We will use EndNote to remove duplicates and store references before moving to another tool to screen references and extract data. Duplicates will be removed using the automatic function in EndNote and manually during screening. Screening by title, abstract, and full text will be conducted using Microsoft Excel [31] to provide a comprehensive step-by-step record of the selection process based on our selection criteria. A detailed screening form with the inclusion and exclusion criteria will be developed and tested (see Appendixes 1 and 2, Tables 4 and 5). All members of the screening team will be trained on how to use Microsoft Excel and the screening form before we start.
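As an illustration of what the manual duplicate check during screening looks for, the following is a minimal Python sketch that flags records from the two databases whose titles differ only in case, accents, or punctuation. It does not reproduce EndNote's automatic function; the matching rule (normalized title plus publication year), the field names, and the example records are all assumptions made up for this sketch.

```python
import re
import unicodedata


def normalize_title(title):
    """Lowercase a title, strip accents and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKD", title)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


def deduplicate(records):
    """Keep the first record seen for each (normalized title, year) key."""
    seen = set()
    unique = []
    for rec in records:
        key = (normalize_title(rec["title"]), rec.get("year"))
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


# Invented records for illustration only; they are not real citations.
records = [
    {"title": "Risk of stroke in type 2 diabetes: a cohort study",
     "year": 2012, "source": "MEDLINE"},
    {"title": "Risk of Stroke in Type 2 Diabetes - A Cohort Study.",
     "year": 2012, "source": "Embase"},
    {"title": "Predicting retinopathy in type 1 diabetes",
     "year": 2015, "source": "Embase"},
]
unique_records = deduplicate(records)
print(len(records), "records,", len(unique_records), "after de-duplication")
```

A rule this strict will miss duplicates with truncated or translated titles, which is one reason a manual check during screening remains useful.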

Selection process
Articles will be excluded if at least one of the criteria is clearly not met. We will retain any article that cannot be excluded solely based on abstract review. We will set aside all systematic or narrative literature reviews whose subject clearly relates to our objective, to consult them at a later stage, as mentioned previously.
Given that reviewers have diverse research backgrounds and levels of experience, we plan to screen titles and abstracts in two different steps to make sure that they have a similar understanding of the eligibility criteria. A preliminary convenience sample of 50 titles will be screened by all reviewers, and we will assess the degree of agreement among raters, discuss any disagreement in groups, and only proceed above a predetermined threshold of interrater agreement (such as 70%). Then, pairs of reviewers from among the seven team members (CF, IF, JC, SC, SRB, JM, YY) will independently screen a subset of titles based on the Population-Concept-Context (PCC) criteria. After titles are screened independently by two reviewers, the results will be pooled and agreement will be calculated for each pair. If agreement is optimal, all titles retained by at least one reviewer will be considered for abstract screening. If agreement is not optimal, title screening will be repeated by independent reviewers until we meet the target of 0.7 or higher. Reviewers will meet at the beginning, midpoint, and final stages of the abstract review process to discuss discrepancies related to study selection and refine the search strategy if needed [15]. Once abstract screening has been completed by two independent reviewers, the results will be pooled and agreement will be calculated for each pair of reviewers. When agreement is optimal, all remaining disagreements will be discussed between the two reviewers. If agreement is not optimal, two independent reviewers will screen abstracts until we meet the target agreement of 0.7 or higher. A third reviewer will screen abstracts where there are discrepancies and discuss all remaining disagreements in meetings with the two initial reviewers. Full-text copies of articles selected based on abstracts will be retrieved and translated if needed. Two independent reviewers from our team (RN, CRB, TP) will screen the full text of all selected references. 
Each pair of reviewers will compare their results and discuss any disagreement. If there are too many disagreements, a third reviewer will repeat the full-text screening. Differences and disagreements between reviewers will be discussed in group meetings to reach a consensus. All remaining discrepancies will be resolved by one researcher (GN, HW).
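The agreement check for a pair of reviewers can be sketched as follows. The protocol states a target of 0.7 (or 70%) but does not name the agreement statistic, so this Python sketch computes both raw percent agreement and Cohen's kappa as two plausible candidates; the function names and the screening decisions are hypothetical.

```python
def percent_agreement(rater_a, rater_b):
    """Share of items on which the two raters made the same decision."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty item list")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    p_observed = percent_agreement(rater_a, rater_b)
    labels = set(rater_a) | set(rater_b)
    # Chance agreement from each rater's marginal label frequencies.
    p_expected = sum(
        (rater_a.count(label) / n) * (rater_b.count(label) / n)
        for label in labels
    )
    if p_expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (p_observed - p_expected) / (1.0 - p_expected)


TARGET = 0.7  # threshold stated in the protocol

# Hypothetical title-screening decisions for ten titles.
reviewer_1 = ["in", "in", "in", "out", "out", "out", "out", "out", "in", "out"]
reviewer_2 = ["in", "in", "out", "out", "out", "out", "out", "out", "in", "in"]

agreement = percent_agreement(reviewer_1, reviewer_2)
kappa = cohen_kappa(reviewer_1, reviewer_2)
print(f"percent agreement = {agreement:.2f}, kappa = {kappa:.2f}")
print("target met on percent agreement:", agreement >= TARGET)
print("target met on kappa:", kappa >= TARGET)
```

In this invented example the pair meets the target on raw agreement but not on kappa, which shows why the choice of statistic matters when a single threshold is applied.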

Data collection process
The team will collectively build a standardized extraction grid with all relevant data items to guide data extraction. Three independent reviewers (RN, TP, CRB) will pilot test the grid using a subset of five to twenty full-text articles selected for extraction. They will then meet to determine whether data are missing from the form or not needed. Data extraction will be performed in duplicate by two independent reviewers from our team (RN, TP, CRB). The corresponding authors of retained articles may be contacted to request any information missing in the extraction grid. The three reviewers will resolve discrepancies through discussion and with input from two members of our team (RN, HW) when necessary.

Data extraction
Since there are no checklists of items to consider in data extraction for scoping reviews on risk prediction models, we considered aspects of a well-known checklist for systematic reviews [32] that aligns with the scoping review methodology to design (and, in future, report) our data extraction process [15]. Full-text data extraction will be done by two independent reviewers (TP, CRB) using an Excel spreadsheet. A third reviewer (RN, GN) will review any studies where there is a discrepancy between the two independent reviewers that they are not able to resolve. Although scoping reviews do not usually include quality assessment, when dealing with epidemiological models, it is important to pay attention to the methodology and the design of original studies [17]. Two independent reviewers trained in epidemiology (RN, IF, GN) will be involved in assessing potential selection and information bias in selected studies and will discuss the potential impact of bias on the features and accuracy of selected models. Final selection of articles will be undertaken in duplicate following data extraction to confirm relevance of the chosen articles. Any study selected by only one reviewer will be discussed to reach mutual agreement. We will record the reasons for which each article is excluded. Here again, a third reviewer will review each study when there are discrepancies that cannot be resolved by the two independent reviewers. We will use the pre-publication version of the PROBAST [33], which includes a template and a detailed user guide to identify five domains in which methodological limitations might exist in studies using risk prediction models.
These domains are as follows: (1) participant selection (e.g., selection bias caused by exclusion of eligible participants or loss at follow-up); (2) predictors (e.g., differential or non-differential misclassification of predictors, change in predictor for some participants over time); (3) outcomes (e.g., outcome definition and standardized classification of all participants); (4) sample size and participation flow (e.g., inappropriate time interval between predictor and outcome measurements, handling of missing data); and (5) analyses (e.g., evaluation of performance measures such as calibration, discrimination, (re)classification, and net benefit [34,35,36]; handling of non-binary predictors) (Table 3). Other methodological issues will also be considered (e.g., duration and timing of exposure, selective reporting of results in a way that depends on the findings) [37]. Also, if both predictors and outcomes were measured using self-report methods, we will evaluate potential common method bias [25]. We will use the same spreadsheet for data abstraction and for quality assessment. We will make sure that we adequately capture all relevant content and methods from selected papers and summarize information on the internal and external validity of each selected model from each selected study. Consistent with the PROBAST tool, we will sort studies in three groups: high quality, moderate/acceptable quality, and low quality. These data will help assess data quality during data analysis and interpretation.

Results: Name of each outcome, frequency estimates of outcomes, estimates with confidence intervals or p values for each prediction model by predictors and by diabetes-related complications, alternative presentation of the models
7. Potential limitations: Selection bias (percentage participation at baseline and at follow-up, missing data), information bias (measurement of exposure and/or outcome), lack of power, statistics of the performance of the model (validation, calibration, discrimination)
8. Interpretation: Utility of presented models, generalization of the findings
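To make two of the performance measures named in the analysis domain concrete, the sketch below computes a discrimination measure (the c-statistic, equivalent to the area under the ROC curve for a binary outcome) and a simple calibration summary (the ratio of observed to expected events). This is a minimal Python illustration under stated assumptions: the predicted risks and outcomes are invented, and the studies under review may report other measures or use survival-time versions of these statistics.

```python
def c_statistic(risks, outcomes):
    """Probability that a randomly chosen person with the event has a higher
    predicted risk than a randomly chosen person without it (ties count 0.5)."""
    events = [r for r, y in zip(risks, outcomes) if y == 1]
    non_events = [r for r, y in zip(risks, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


def observed_expected_ratio(risks, outcomes):
    """Observed number of events divided by the sum of predicted risks.
    A value near 1 suggests good calibration-in-the-large."""
    return sum(outcomes) / sum(risks)


# Invented predicted risks and observed outcomes (1 = complication occurred).
predicted_risks = [0.1, 0.2, 0.3, 0.4, 0.6, 0.8]
observed = [0, 0, 1, 0, 1, 1]

c_stat = c_statistic(predicted_risks, observed)
oe_ratio = observed_expected_ratio(predicted_risks, observed)
print(f"c-statistic = {c_stat:.3f}, O/E ratio = {oe_ratio:.2f}")
```

Here the toy model ranks people well (8 of 9 event/non-event pairs are ordered correctly) but underpredicts overall, since more events were observed than expected. The two measures answer different questions, which is why extraction records both.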

Analysis and synthesis
This protocol adheres to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for protocols (PRISMA-P) [38] and scoping reviews (PRISMA-ScR) [39] (see Additional file 1). After data from included studies are summarized in an extraction table, we will follow three distinct steps: analysis (model features, discrimination, calibration, and validation), reporting (synthesizing characteristics of included studies), and discussion (comparison with previous reviews) [15,40]. The analysis and synthesis will focus on diabetes complications and the methodological features of selected models [11]. We will use qualitative approaches to evaluate and synthesize quantitative estimates accurately. When relevant, we will provide in-depth analyses of potential explanations for data inconsistencies (e.g., study design, selection/participation, data measurements). Finally, we will propose how to consistently report the risk of diabetes complications in predictive models in ways that will be helpful for patients and clinicians.

Discussion and conclusion
The current review may not provide meta-analytical estimates because we expect to retrieve a highly diverse set of risk prediction models. This may preclude a quantitative synthesis if the available data do not meet the criteria for homogeneity in methods used to measure predictors and outcomes and assess biases potentially affecting internal validity. Heterogeneity is one of the main reasons for skepticism about meta-analyses of non-experimental studies [25,41], which represent the great majority of studies on our topic [4,6]. To partly circumvent the pitfalls of heterogeneity, we will attempt to calculate a meta-analytical estimate of experimental studies if there are enough high-quality data with comparable methodological characteristics in our final set of models (N > 5). However, preliminary search results and consultation with experts revealed that predictive models of diabetes complications often consider some complications as predictors of other complications [4]. Merging such models during analysis may lead to highly correlated data and inflation in the estimates of variance [42,43]. In such cases, qualitative approaches are often used as alternatives to evaluate and synthesize estimates accurately.
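The heterogeneity concern can be illustrated with a short calculation. The protocol does not specify a heterogeneity statistic, so the Python sketch below assumes two commonly used ones, Cochran's Q and the I² statistic, computed with inverse-variance weights around a fixed-effect pooled estimate. The effect estimates and standard errors are invented, and the function name is hypothetical.

```python
def heterogeneity(estimates, standard_errors):
    """Return (pooled estimate, Cochran's Q, I-squared in percent)
    using inverse-variance weights."""
    if len(estimates) != len(standard_errors) or len(estimates) < 2:
        raise ValueError("need matching lists with at least two studies")
    weights = [1.0 / se ** 2 for se in standard_errors]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    # Q: weighted squared deviations of study estimates from the pooled one.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    # I-squared: share of total variation attributed to between-study
    # differences rather than chance; truncated at zero.
    i_squared = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, q, i_squared


# Invented log hazard ratios and standard errors from three studies.
log_hazard_ratios = [0.2, 0.5, 0.8]
std_errors = [0.1, 0.1, 0.1]

pooled, q, i_squared = heterogeneity(log_hazard_ratios, std_errors)
print(f"pooled = {pooled:.2f}, Q = {q:.1f}, I^2 = {i_squared:.1f}%")
```

With these invented inputs, roughly 89% of the variation across studies would be attributed to between-study differences, the kind of result that would argue for a narrative rather than a pooled synthesis.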

Strengths and limitations of this study
The major strengths of this review will be the inclusion of predictive models of diverse diabetes complications and the combination of multiple and diverse perspectives of patients, clinicians, and researchers. Because diabetes complications often vary by diabetes type, we invited one patient partner with type 2 diabetes (DG) and one patient partner with type 1 diabetes (SD) as co-authors to complement the perspective of our senior researcher (HW) who lives with type 1 diabetes. All six Expert Patients that we consulted agreed that all complications considered in this review were equally important. We plan to actively collaborate with a committee of Expert Patients, caregivers, and clinicians in diabetes care. By including a consultation exercise in this scoping review, we intend to "enhance the results, making them more useful to policy makers, practitioners and service users" [44]. Limitations include using two databases, restricting publication date to 2000-2018, and not searching the grey literature. Also, we will not consider the social and economic outcomes of diabetes.

Dissemination
Ethical approval is not required for this scoping review study since we will only be using secondary data sources. Our findings will be disseminated through peer-reviewed publication and presentation at conferences. Because predictive models are increasingly being appraised and recommended for formal risk assessment in treatment decision-making and clinical guidelines, the proposed scoping review may contribute to support research and risk communication in diabetes care. For example, it may help clinicians better identify people who are at higher risk of diabetes complications and researchers design customizable risk prediction tools for use in diabetes care [45]. To ensure that our findings about diabetes complications reach patients, we will also circulate them through clinical and patient networks.

15 "emigrants and immigrants"/ or undocumented immigrants/ or population groups/ or continental population groups/ or african continental ancestry group/ or african americans/ or american native continental ancestry group/ or alaska natives/ or indians, central american/ or indians, north american/ or indians, south american/ or inuits/ or asian continental ancestry group/ or asian americans/ or european continental ancestry group/ or oceanic ancestry group/ or ethnic groups/ or amish/ or arabs/ or roma/ or hispanic americans/ or mexican americans/ or jews/ or "geographicals (non mesh)"/ or geographic locations/