Overview of evidence-based clinical practice guidelines for preoperative care: a systematic review.

Background Our aim was to summarize and compare relevant recommendations from evidence-based CPGs (EB-CPGs). Methods Systematic review of clinical practice guidelines. Data sources: PubMed, EMBase, Cochrane Library, LILACS, Tripdatabase and additional sources. In July 2017, we searched electronic databases for CPGs published in the last 10 years, without language restrictions, and also searched specific CPG sources and reference lists and consulted experts. Pairs of independent reviewers selected EB-CPGs and rated their methodological quality using the AGREE-II instrument. We summarized recommendations, their supporting evidence and the strength of recommendations according to the GRADE methodology. Results We included 16 EB-CPGs out of 2262 references identified. Only nine of them had searches within the last five years and seven used GRADE. The median (percentile 25-75) AGREE-II score for rigor of development was 49% (35-76%) and the domain ‘applicability’ obtained the worst score: 16% (9-31%). We summarized 31 risk stratification recommendations, 21.6% of which were supported by high/moderate quality of evidence (41% of them were strong recommendations), and 16 therapeutic/preventive recommendations, 59% of which were supported by high/moderate quality of evidence (75.7% strong). We found inconsistency in ratings of evidence level. ‘Guidelines’ applicability’ and ‘monitoring’ were the most deficient domains. Only half of the EB-CPGs were updated in the past five years. Conclusions We present many strong recommendations that are ready to be considered for implementation, as well as others indicating practices to be discontinued, and we reveal opportunities to improve guidelines’ quality.

population growth, 6.2 billion people (73% of the world's population) will be living in countries below the minimum recommended rate of surgical care in 2035. [2] However, the crude number of patients who receive surgery is increasing, as well as their mean age and the occurrence of comorbidities. [3] Because of the inherent risks of death and complications, surgical safety is a significant public-health concern. As an example, more than 2% of patients undergoing surgery will suffer major cardiac complications. [4] In this context, providing adequate preoperative care is essential. The first routine preoperative tests started 50 years ago with only a handful of actions, and have since expanded to a large set of risk stratification and preventive interventions. Lately, efforts have been made to standardize care, especially through the implementation of clinical practice guidelines (CPGs) with recommendations useful for both health providers and patients. [5] These recommendations usually consider all risks and benefits of undertaking a risk stratification or therapeutic procedure, sometimes even including algorithm pathways. The potential benefits, like the safety of care and standardization of procedures, are only as good as the quality of the practice guidelines implemented. Unfortunately, CPGs not supported by the best evidence might promote inappropriate preoperative testing behaviors, which are detrimental to both patients and health systems. For example, false positive results from inappropriate testing may delay or prevent surgery, thus creating unnecessary stress or harm to patients.
Multiple medical societies and organizations around the world have published preoperative evaluation CPGs; however, many of them are not based on solid scientific evidence. Additionally, not all of them use methods like the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) approach, which is one of the soundest systems for rating the quality of a body of evidence in systematic reviews and CPGs. [6] GRADE offers a transparent and structured process for developing and presenting evidence summaries and making recommendations. [6] A systematic review found no evidence from high quality studies to support routine preoperative tests in healthy adults undergoing non-cardiac surgery. [7] Risk stratification testing based on the problems identified during the preoperative assessment seems justified, but there is still little evidence supporting it. [7] In this way, the implementation of EB-CPGs may lead to a reduction in the number of unnecessary preoperative tests, without affecting patient safety. [8][9][10][11][12] The first health technology assessment (HTA) on the topic, published in 1989 by the Swedish Council on Technology Assessment in Health Care (SBU) [13], showed healthcare quality improvements and cost savings using an evidence-based approach. The findings of this report have been confirmed by nine subsequent studies from five countries, collected in another HTA document. [14] For this reason, through an overview of clinical practice guidelines we aimed to identify and synthesize EB-CPGs on preoperative care that were published worldwide in the last 10 years, in order to help prioritization processes. We also rated CPGs' quality and summarized recommendations describing their level of evidence and the strength of recommendations according to the GRADE approach. [6]

Methods
Study design: Systematic review (overview) of EB-CPGs following Cochrane methods [15] and the Argentinean Academy of Medicine's Guide for the adaptation of CPGs when searching for and selecting CPGs. [16] For reporting, we followed the PRISMA statement [17] and a specific guideline for overviews of systematic reviews (Online supplemental material. Appendix 1. PRISMA checklist). [18] The protocol is available in Spanish, with a summary in English. [1] Inclusion criteria of EB-CPGs [19] on preoperative care were as follows: a) description of the guideline's development expert panel; b) use of standard methods for identification, data collection and study risk of bias assessment; c) report of the level of evidence that supports each recommendation. Guidelines were excluded if they were limited to single specific conditions, since we focused on more general recommendations. We also excluded guidelines that referred only to preoperative care for neurosurgery or colorectal surgery. Discrepancies were resolved by a consensus of the whole team.
Guideline quality appraisal and classification: Independent pairs of reviewers rated each EB-CPG using the AGREE-II tool, which consists of 23 key items organized in six domains (scope and purpose, stakeholder involvement, rigor of development, clarity of presentation, applicability, and editorial independence) plus two overall evaluation items. [21] Each item was graded on a 7-point scale, from 1, meaning "Strongly disagree", to 7, meaning "Strongly agree". We also categorized each EB-CPG according to the extent to which it successfully addressed the AGREE-II criteria: [16]
Strongly recommended (++): CPG whose standardized score exceeds 60% in ≥4 AGREE-II domains. The scores of the remaining domains must be ≥30% and >60% for the domain rigor of development.
The rigor of development score must be between 30% and 60%.
Not recommended (-): CPG whose standardized score is <30% in ≥4 AGREE-II domains or if rigor of development score is less than 30%.
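The appraisal rules above can be illustrated with a minimal Python sketch. It assumes the usual AGREE-II scaled domain score, (obtained − minimum possible) / (maximum possible − minimum possible), which is not spelled out in the text, and it uses a placeholder label "intermediate" for guidelines that meet neither the (++) nor the (-) definition, because the middle category is only partially described here. The function names and the example scores are hypothetical.

```python
from typing import Dict, List


def scaled_domain_score(ratings: List[List[int]]) -> float:
    """Scaled AGREE-II score (%) for one domain.

    `ratings` holds one list of 1-7 item scores per appraiser.
    Assumed formula: (obtained - min possible) / (max possible - min possible).
    """
    n_appraisers = len(ratings)
    n_items = len(ratings[0])
    obtained = sum(sum(appraiser) for appraiser in ratings)
    min_possible = 1 * n_items * n_appraisers
    max_possible = 7 * n_items * n_appraisers
    return 100.0 * (obtained - min_possible) / (max_possible - min_possible)


def classify_guideline(domain_scores: Dict[str, float],
                       rigor_key: str = "rigor of development") -> str:
    """Apply the (++) and (-) rules stated in the text to six domain scores."""
    scores = list(domain_scores.values())
    rigor = domain_scores[rigor_key]
    n_above_60 = sum(1 for s in scores if s > 60)
    n_below_30 = sum(1 for s in scores if s < 30)

    # Not recommended: <30% in at least four domains, or rigor below 30%.
    if n_below_30 >= 4 or rigor < 30:
        return "not recommended (-)"
    # Strongly recommended: >60% in at least four domains, every domain
    # at least 30%, and rigor of development above 60%.
    if n_above_60 >= 4 and all(s >= 30 for s in scores) and rigor > 60:
        return "strongly recommended (++)"
    # Placeholder for everything else (assumption, see lead-in).
    return "intermediate"


# Hypothetical example: two appraisers rating a three-item domain.
example_score = scaled_domain_score([[5, 6, 6], [6, 6, 7]])

# Hypothetical domain scores for one guideline.
example_guideline = {
    "scope and purpose": 83,
    "stakeholder involvement": 65,
    "rigor of development": 70,
    "clarity of presentation": 90,
    "applicability": 31,
    "editorial independence": 50,
}
example_label = classify_guideline(example_guideline)
```

With two appraisers, the minimum and maximum possible totals for a three-item domain are 6 and 42, so the scaled score only depends on how far the summed ratings sit between those bounds.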
To deal with discrepancies between the direction and strength of the CPG recommendations, we applied a rule to decide "doing or not doing the recommendation": there are three criteria that can upgrade one or two levels: magnitude of effect, dose-response effect, and confounders underestimating the effect. To map the level of evidence to a common grading system (GRADE), we re-assessed all evidence when the translation was not obvious. The strength of a recommendation is defined as the extent to which one can be confident that the desirable consequences of an intervention outweigh its undesirable consequences, and GRADE uses four simple categories to classify it. The categories combine 'strong' or 'weak' with 'for' or 'against' a certain risk stratification or therapeutic approach. We presented descriptive statistics as percentages or means with standard deviations.
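As a rough illustration of the GRADE vocabulary used here, the sketch below enumerates the four recommendation categories and shows how an evidence level could be moved up one or two steps. The ordered list of evidence levels (very low, low, moderate, high) is the standard GRADE scale rather than something stated in this section, and the names `EVIDENCE_LEVELS`, `upgrade` and `RECOMMENDATION_CATEGORIES` are hypothetical.

```python
from itertools import product

# Standard GRADE evidence levels, ordered from weakest to strongest (assumption).
EVIDENCE_LEVELS = ["very low", "low", "moderate", "high"]


def upgrade(level: str, steps: int) -> str:
    """Move an evidence level up by 0, 1 or 2 steps, capped at 'high'."""
    if steps not in (0, 1, 2):
        raise ValueError("evidence can be upgraded by at most two levels")
    index = EVIDENCE_LEVELS.index(level)
    return EVIDENCE_LEVELS[min(index + steps, len(EVIDENCE_LEVELS) - 1)]


# The four categories: strength ('strong'/'weak') crossed with direction.
RECOMMENDATION_CATEGORIES = [
    f"{strength} {direction}"
    for strength, direction in product(("strong", "weak"), ("for", "against"))
]
```

For instance, `upgrade("low", 1)` yields "moderate", and any upgrade from "high" stays at "high" because the scale is capped.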

Results
The search strategy identified 2262 references after the elimination of duplicates. After the selection process we identified 23 references corresponding to 16 EB-CPGs published in the last 10 years (Figure 1: Flowchart). Two references were examined in depth and eventually excluded since they only transcribed pre-existing CPGs already included in our selection. [22,23] An overall AGREE-II score is also presented in Table 1.

Discussion
To the best of our knowledge, the present study is the first overview of guidelines encompassing a broad spectrum of preoperative care recommendations.
We observed a higher level of evidence supporting therapeutic than risk stratification recommendations (high/moderate quality of evidence 59 vs 22%, respectively). This is not surprising, because cross-sectional or cohort studies can provide high-quality evidence for test accuracy but only indirect evidence for patient-important outcomes. Furthermore, high levels of heterogeneity are almost the rule in risk stratification tests, further downgrading the level of evidence because of inconsistency. [25][26][27] The strength of a recommendation is defined as the extent to which one can be confident that the desirable effects of an intervention outweigh its undesirable ones. We found only 12/53 (23%) 'strong' risk stratification recommendation statements (for and against) based on high/moderate level of evidence and 43/78 (55%) for therapeutic/preventive care recommendations. Although higher proportions of high-quality supporting evidence would be desirable, a guideline panel must consider additional factors. In order to assess competing management alternatives, GRADE proposes to consider four domains: estimates of effect for desirable and undesirable outcomes, confidence in the estimates of effect, values and preferences, and resource use. Guideline panels must integrate these factors to make a strong or weak recommendation for or against an intervention. [28] After our search date, the updated guideline from the European Society of Anesthesiology (ESA) was published, using GRADE and searching until May 2016. [29] This CPG addressed two main clinical questions in order to help anesthesiologists in their daily practice: 1. How should a pre-operative consultation clinic be organized? and 2. How should pre-operative assessment of a patient be performed?
As in our present work, this guideline covered specific conditions that might adversely interfere with anesthesia and surgery, including cardiovascular disease, respiratory disease, smoking, obstructive sleep apnea syndrome, renal disease, diabetes, obesity, coagulation disorders, anemia and pre-operative blood conservation strategies, the geriatric patient, alcohol and drug misuse and addiction, and now also neuromuscular disease. We present preoperative clinical risk criteria and categories, complemented with established risk factors for postoperative pulmonary complications (see Online supplemental material, Appendix 3). [29] The 2018 ESA guidelines also provided independent predictors for difficult mask ventilation, a topic not specifically addressed in previous CPGs. [29] As described, RCTs are still few and therefore many preoperative interventions rely to a large extent on expert opinion, which in turn needs to be adapted to the reality of each nation's healthcare system.
Studies on prognostic or diagnostic accuracy tests, including scoring of severity of illness, usually provide low quality of evidence, even when scores such as ASA-PS, RCRI, NSQIP-MICA, POSSUM and others have been extensively validated. [29] Our updated overview of EB-CPGs, conducted under the rigorous Cochrane methods, may be a useful resource for the professionals involved in preoperative care to consult during decision-making. We present many strong recommendations with sufficient evidence to be routinely implemented in clinical practice. However, any decision should be taken considering local contextual factors.
In addition, cost reductions were identified at the clinical level as well as at the health system level in another study. [9][10][11][30] Two guidelines also suggested strong cost benefits for both patients and society. [34,35] Another study showed that the application of EB-CPGs significantly improved the efficiency of the preoperative evaluation without negatively affecting the quality of care. [31] These findings were consistent across different settings, such as a hospital in Barbados where the introduction of guidelines reduced the burden of presurgical tests and costs without compromising patient safety. [32] In the same way, a recent study in a hospital in New Jersey, USA, found that approximately 25% of tests were not justifiable and could thus be eliminated by complying with NICE/ASA guidelines. The evaluation of applying these changes in practice showed significant savings without altering clinical outcomes. [33] Recommendations can be adopted, modified or even not implemented, depending on institutional or national requirements and legislation and local availability of devices, drugs and resources. [34] Decision-makers at the national and subnational levels should be provided with the information they need to apply the evidence and recommendations in their setting. [35]
As a limitation, including only EB-CPGs could have resulted in omitting some information, but we prioritized summarizing the highest quality evidence. Our exclusion of CPGs whose scope was limited to specific conditions may represent an additional caveat, since some particular risk stratification or therapeutic interventions could also have been excluded. Nevertheless, the high number of recommendations summarized in our study suggests that this could have been only a minor limitation.
Our study will be useful for future preoperative care guideline developers or adapters. Consistent with other overviews of clinical guidelines, the domain that received the lowest mean score was the 'applicability' domain of the AGREE-II tool. Similarly, the heterogeneity of the grading systems for evidence and strength of recommendations in this overview echoes that of other clinical guideline overviews. [36][37][38] Low scores in the applicability domain result in inadequate adoption rates of guidelines, particularly for preoperative care where 'defensive medicine' (i.e. prescribing more tests than necessary just to prevent litigation) is very common. We also found some discrepancies among recommendations, mainly in the evidence level, and recommendations did not always discriminate between universal interventions and those suitable only for special target groups or specific surgeries.
Guideline developers should ensure rigorous methodological processes and should also make recommendations that are formulated and disseminated in ways that facilitate understanding and application by end-users. For example, the DECIDE Collaboration conducted research and developed tools to improve implementation of evidence-based recommendations by different target audiences, including providers, policy makers, and the public. [39] In that sense, GRADE provides guideline developers with a comprehensive and transparent framework for grading the quality of evidence and the strength of recommendations.
Our overview identified several controversies, evidence gaps and issues regarding preoperative care guidelines that warrant future research and reveal opportunities to improve guideline quality.
For example, we found many discrepancies in risk stratification recommendations such as electrocardiography, chest X-ray, polysomnography, assessment of left ventricular function, stress testing and coronary angiography in certain populations. We found fewer discrepancies for therapeutic/preventive care, mainly concerning antimicrobial prophylaxis and the use of beta-blockers (see Online supplemental material, Appendix 6 and Appendix 8).
From the perspective of anesthesiology practice, many questions remain unanswered.
For example, in the patient with significant medical, surgical or obstetrical history, it would be useful to understand how early the pre-anesthetic evaluation should be performed, considering the time required to optimize the patient's status. There are also uncertainties about the recommendation of fasting for solids in adults and children, since many factors can delay gastric emptying and no fixed rules apply. Fasting should be individualized in some patients, depending also on the fat content of the intake. Regarding prokinetics and antacids, patients' comorbidities, like esophageal pathology, bariatric surgery history or obesity, should be considered in the decision, but there is no formal recommendation. In the same way, whether or not to suspend aspirin should be evaluated according to the patient's history and the bleeding risk of the surgery, since bleeding could be catastrophic in neurosurgery, spinal surgery or ophthalmologic surgery. It is also striking that informed consent has only a 'weak for' recommendation, from a single CPG, given the substantial history of litigation due to lack of consent.
We encourage guideline developers to adopt the GRADE and AGREE-II tools to develop sound future preoperative care guidelines. [6,21] The large amount of resources involved in preoperative care warrants high-quality nationwide EB-CPGs supported by all relevant stakeholders to improve the chances of successful implementation. This probably includes the involvement of the Ministry of Health, scientific societies, and consumers working together through a formal process of implementation and monitoring. [16,40] Although standardization of preoperative care may be desirable, differences in recommendations could reflect differences in contextual factors such as organizational or financial arrangements, legal framework, varied values and preferences, and the acceptability and feasibility of using different interventions. Research exploring reasons for conflicting recommendations in different countries or settings could also drive overall improvements in guideline quality. The key findings are described in Box 1.

Box 1 Key points
The included evidence-based clinical practice guidelines (EB-CPGs) showed significant heterogeneity in both evidence and recommendation grading systems; GRADE was the most commonly used.
About half of the included EB-CPGs were updated in the last five years and one third of them were rated as strongly recommended based on their high AGREE-II performance.
They were generally deficient in applicability and in providing monitoring tools.
We summarized 31 risk stratification and 16 therapeutic/preventive recommendations. We found 93 strong for and 46 strong against recommendations, all of which are ready to be considered for implementation or discontinuation, respectively.
The level of evidence and strength of recommendation were higher for therapeutic/preventive recommendations than for risk stratification. We only found 12/53 (23%) strong risk stratification recommendations based on high/moderate level of evidence and 43/78 (55%) for therapeutic/preventive care recommendations.
In conclusion, we found significant heterogeneity in guidelines' quality and rating systems, as well as deficiencies in several guideline quality domains, revealing opportunities for quality improvement that deserve careful consideration by future guideline developers. Nevertheless, we present many strong recommendations ready to be considered for implementation or discontinuation.

the acquisition of data. All authors contributed to the analysis and interpretation of data and to drafting the article or revising it critically for important intellectual content, and provided final approval of the version submitted.