Surgical or non-surgical treatment of traumatic skeletal fractures in adults: systematic review and meta-analysis of benefits and harms

Background A comprehensive overview of treatments of common fractures is missing, although it would be important for shared decision-making in clinical practice. The aim was to determine benefits and harms of surgical compared to non-surgical treatments for traumatic skeletal fractures. Methods We searched Medline, Embase, CINAHL, Web of Science, and CENTRAL until November 2018, for randomized trials of surgical treatment in comparison with or in addition to non-surgical treatment of fractures in adults. For harms, only trials with patient enrollment in 2000 or later were included, while no time restriction was applied to benefits. Two reviewers independently assessed studies for inclusion, extracted data from full-text trials, and performed risk of bias assessment. Outcomes were self-reported pain, function, and quality of life, and serious adverse events (SAEs). A random-effects model (Hedges’ g) was used. Results Out of 28375 records screened, we included 61 trials and performed meta-analysis on 12 fracture types in 11 sites: calcaneus, clavicula, femur, humerus, malleolus, metacarpus, metatarsus, radius, rib, scaphoideum, and thoraco-lumbar spine. Seven other fracture types only had one trial available. For distal radius fractures, the standardized mean difference (SMD) was 0.31 (95% CI 0.10 to 0.53, n = 378 participants) for function, favoring surgery, but with a greater risk of SAEs (RR = 3.10 (1.42 to 6.77), n = 436). For displaced intra-articular calcaneus fractures, SMD was 0.64 (0.13 to 1.16) for function (n = 244) and 0.19 (0.01 to 0.36) for quality of life (n = 506) favoring surgery. Surgery was associated with a smaller risk of SAE than non-surgical treatment for displaced midshaft clavicular fractures (RR = 0.62 (0.42 to 0.92), n = 1394). None of the other comparisons showed statistically significant differences, and insufficient data existed for most of the common fracture types.
Conclusions Of 12 fracture types with more than one trial, only two demonstrated a difference in favor of surgery (distal radius fractures and displaced intra-articular calcaneus fractures), one of which demonstrated a greater risk of harms in the surgical group (distal radius fractures). Our results highlight the current paucity of high-quality randomized trials for common fracture types and a considerable heterogeneity and risk of bias in several of the available trials. Systematic review registration PROSPERO CRD42015020805


Methods
This report conforms to the PRISMA statement [10]. The study followed the published guidelines on systematic reviews from the Cochrane Collaboration [11] and it was pre-registered with PROSPERO (CRD42015020805). In the PROSPERO-registration, two systematic reviews are described, the other being a systematic review of surgical vs. non-surgical treatment of non-fracture musculoskeletal conditions, which will be reported in a subsequent publication.

Search strategy
Two authors (STS + CBT) searched MEDLINE via PubMed, EMBASE via Ovid, CINAHL (including preCINAHL) via EBSCO, Web of Science via Web of Knowledge and CENTRAL, all up to 5 November 2018. We included trials reported in English, German, Danish, Swedish, and Norwegian (i.e., languages that the authors understand). For SAEs, only trials enrolling patients from 2000 onward were included, due to the increasing quality of surgery and anesthesia and with the expectation of improved reporting of SAEs following the CONSORT statement published in 1996 and updated in 2001. No time restriction was applied for benefits. The search strategies were adjusted according to the specifications of the individual database (see Additional file S1). Reference lists of included articles and the most recent systematic reviews were reviewed to identify additional trials.

Trial selection
Two authors (STS + CBJ) independently assessed titles/abstracts for trial eligibility using a priori selection criteria. The full text was retrieved if found eligible by at least one reviewer. The same authors independently evaluated eligibility of the retrieved full-text trials. Consensus was reached by discussion.
We included randomized trials conducted in any setting evaluating the effect of surgical treatment in comparison with or in addition to non-surgical treatment of traumatic fractures in adults (mean age of trial participants 18+) with data on patient-reported pain, physical function, quality of life or SAEs. If any of these outcomes were reported, with data available that could be used in a meta-analysis, the trial was included. Surgery was pre-defined as any procedure that both changes the anatomy and requires a skin incision or use of an endoscopic technique [12], while non-surgical treatment was defined as all non-surgical treatments and placebo treatments.
Trials investigating the effects of drug substances used perioperatively, trials of vertebroplasty or kyphoplasty, and trials of cancer-related fractures or jaw fractures were excluded. Conference abstracts were also excluded.

Outcomes
Our pre-defined outcomes of interest for benefit were patient-reported pain, physical function, and quality of life, and SAEs for harm. If more than one outcome was available for patient-reported pain, physical function, and quality of life, multidimensional outcomes were preferred over unidimensional outcomes. For unidimensional pain, pain intensity during activity was preferred over pain intensity at rest. We pre-defined SAEs using the U.S. Food and Drug Administration definition, as all adverse events having the potential to significantly compromise the clinical outcome, result in significant disability or incapacity, requiring inpatient or outpatient hospital care, and those considered to prolong hospital care, to be life-threatening, or to result in death [13]. Non-unions were considered SAEs, while mal-unions were only considered SAEs if they resulted in additional treatment or significant disability or pain. Minor additional surgery such as removal of Kirschner wires was not considered an SAE if it was part of normal clinical practice following the specific surgical procedure. Crossovers from non-surgical to surgical treatment were not considered an SAE unless caused by an SAE.

Data extraction
A customized data extraction form was developed for the outcomes, and two authors (STS + CBJ) independently extracted data. We preferred data from the 12-month follow-up of the trials, as this is a very common primary endpoint in trials of orthopedic surgery and as benefits from surgical and non-surgical treatment are expected to be stable at that time point. If data were not available from a 12-month follow-up, data from the follow-up closest to 12 months were used. We extracted the number of patients randomized to each treatment, age, sex, study location (country), pain, and BMI at baseline, fracture type, surgical and non-surgical intervention, follow-up time, number of patients not undergoing surgery in the surgical group, number of crossovers to surgical treatment, number of patients analyzed, mean effect and SD, deaths and SAEs during follow-up and types of SAEs. If SAEs, deaths, or crossovers were not mentioned, they were considered not to have occurred.
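The follow-up selection rule described above can be sketched in a few lines of Python. This is an illustration only, not the authors' extraction procedure: the function name `pick_follow_up` is hypothetical, and the tie-breaking rule (preferring the later time point when two follow-ups are equally far from 12 months) is an assumption, since the text does not say how ties were handled.

```python
def pick_follow_up(available_months, target=12):
    """Return the follow-up time (in months) closest to the target.

    Assumption: when two follow-ups are equally distant from the target,
    the later one is chosen. The source text does not specify tie-breaking.
    """
    if not available_months:
        raise ValueError("no follow-up data available")
    # Sort key: distance to target first, then prefer the later time point.
    return min(available_months, key=lambda m: (abs(m - target), -m))


if __name__ == "__main__":
    print(pick_follow_up([3, 6, 24]))   # 6 is closer to 12 than 24
    print(pick_follow_up([6, 12, 24]))  # exact 12-month follow-up exists
```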

Risk of bias assessment
Risk of bias was assessed using the Risk of Bias 2.0 tool from the Cochrane Collaboration on trials with results on benefits [14]. Two authors (STS + CBJ) independently assessed if each of the following five domains was associated with low risk of bias, some concerns or high risk of bias: (1) bias arising from the randomization process, (2) bias due to deviations from intended interventions, (3) bias due to missing outcome data, (4) bias in measurement of the outcome, (5) bias in selection of the reported result. If four or five of the individual domains were found to be associated with some concerns of risk of bias, or if one of them was associated with a high risk of bias, the overall risk of bias was rated as high risk.
For SAEs (including death) trial quality was assessed independently on trials with results on SAEs by two authors (STS + CBJ) using the 15-point McMaster tool for assessing quality of harms assessment and reporting in study reports (McHarm) [15]. A score greater than 9 was considered a high score and indicative of low risk of bias.
Any discrepancies in the assessment of trial quality were resolved by discussion.
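The overall rating rule for benefits stated above (four or five domains with some concerns, or any domain at high risk, gives an overall high risk) can be written as a small function. This is a hedged sketch: the function name and the short rating labels are hypothetical, and the handling of the remaining cases (all domains low gives "low", otherwise "some") is an assumption based on the usual Risk of Bias 2.0 convention rather than something the text states.

```python
def overall_risk_of_bias(domains):
    """Combine five domain ratings into an overall rating.

    domains: list of five strings, each "low", "some", or "high".
    The "high" rule follows the text; the "low"/"some" fallbacks are an
    assumption based on the usual RoB 2 convention.
    """
    if len(domains) != 5:
        raise ValueError("expected exactly five domain ratings")
    if any(d not in ("low", "some", "high") for d in domains):
        raise ValueError("ratings must be 'low', 'some', or 'high'")
    if "high" in domains:
        return "high"
    n_some = domains.count("some")
    if n_some >= 4:
        return "high"
    if n_some >= 1:
        return "some"
    return "low"


if __name__ == "__main__":
    # Four domains with some concerns and none at high risk -> overall high.
    print(overall_risk_of_bias(["some", "some", "low", "some", "some"]))
```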

Data synthesis and statistical methods
The benefits of surgery were estimated using meta-analyses as the standardized mean difference (SMD), allowing for pooling the various outcomes assessed in the individual trials. The SMD was estimated as the difference in mean at follow-up in the intervention and control groups divided by the pooled SD. If the SD was not available, it was estimated from the standard error, confidence interval, or the P value, as recommended in the Cochrane Handbook [11]. If necessary, means and measures of dispersion were estimated from figures in the included trials. If only SD of the baseline score and SD of the change score were available, these were used for estimating SD of the final score [11]. SMD was adjusted to Hedges' g, as Cohen's d overestimates the effect in small studies. The SMD was interpreted clinically as originally proposed by Cohen [16], i.e., an SMD of 0.2 was small, an SMD of 0.5 was moderate, and an SMD of 0.8 was large. Heterogeneity was estimated as between-study variance (tau-squared) and I-squared, the latter measuring the proportion of variation (i.e., inconsistency) in the combined estimates due to between-study variance. When I-squared is 0%, no inconsistency is seen between results of individual trials, and inconsistency is maximal when I-squared is 100%.
SAEs were calculated as relative risk (RR). In order to handle null findings in either intervention or control group, Battaglias code was used for imputation. Battaglias code imputes one event distributed according to the numbers in the intervention and control group. The analyses of deaths followed the same approach. Results of individual studies were combined using a random-effects model meta-analysis for studies with relevant data on any of the outcomes, separated based on fracture type, body site, and outcome. While at least two studies were required to conduct meta-analyses on the different fracture types, all studies adhering to the eligibility criteria were included in the systematic review.
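As an illustration of the calculations described above, the following Python sketch recovers an SD from a standard error or a 95% confidence interval of a mean, computes Hedges' g with its small-sample correction, and pools effects in a random-effects model while reporting tau-squared and I-squared. It is a minimal sketch under stated assumptions, not the analysis code used in the review (which was run in Stata): the function names are hypothetical, the between-study variance uses the DerSimonian-Laird estimator (the text does not name the estimator), and the confidence interval uses a normal approximation.

```python
import math


def sd_from_se(se, n):
    """SD of a group mean from its standard error."""
    return se * math.sqrt(n)


def sd_from_ci(lower, upper, n, z=1.96):
    """SD from a 95% CI of a group mean (normal approximation)."""
    return math.sqrt(n) * (upper - lower) / (2 * z)


def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Return (g, variance of g) for two independent groups."""
    df = n1 + n2 - 2
    sd_pooled = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df)
    d = (m1 - m2) / sd_pooled                 # Cohen's d
    j = 1 - 3 / (4 * df - 1)                  # small-sample correction factor
    var_d = (n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2))
    return j * d, j**2 * var_d


def random_effects(effects, variances, z=1.96):
    """Pool effects; assumes the DerSimonian-Laird tau-squared estimator.

    Returns (pooled effect, (ci_low, ci_high), tau2, I2 in percent).
    """
    k = len(effects)
    w = [1 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi**2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - (k - 1)) / q) * 100 if q > 0 else 0.0
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return pooled, (pooled - z * se, pooled + z * se), tau2, i2


if __name__ == "__main__":
    # Made-up numbers for illustration only; not data from the review.
    g1, v1 = hedges_g(10.0, 2.0, 20, 8.0, 2.0, 20)
    g2, v2 = hedges_g(10.0, 2.5, 30, 9.5, 2.5, 30)
    print(random_effects([g1, g2], [v1, v2]))
```

With only two or three trials per comparison, as in several of the meta-analyses here, tau-squared is estimated imprecisely whichever estimator is used, so the I-squared values should be read with that in mind.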
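A relative risk with a zero-event adjustment of the kind described above could be computed as in the sketch below. Several points are assumptions and should not be read as the authors' implementation: the function name is hypothetical, the adjustment is interpreted as adding a single event split between the two arms in proportion to their sizes (leaving the denominators unchanged), it is applied only when at least one arm has zero events, and the confidence interval uses the standard large-sample formula on the log scale.

```python
import math


def relative_risk(events_surg, n_surg, events_ctrl, n_ctrl, z=1.96):
    """Return (RR, (ci_low, ci_high)) for surgical vs. non-surgical arms.

    Assumption: if either arm has zero events, one event in total is added,
    split in proportion to the arm sizes. This is one reading of the
    imputation described in the text, not a verified reimplementation.
    """
    e1, e2 = float(events_surg), float(events_ctrl)
    if e1 == 0 or e2 == 0:
        total = n_surg + n_ctrl
        e1 += n_surg / total
        e2 += n_ctrl / total
    rr = (e1 / n_surg) / (e2 / n_ctrl)
    # Large-sample standard error of log(RR).
    se_log = math.sqrt(1 / e1 - 1 / n_surg + 1 / e2 - 1 / n_ctrl)
    low = math.exp(math.log(rr) - z * se_log)
    high = math.exp(math.log(rr) + z * se_log)
    return rr, (low, high)


if __name__ == "__main__":
    # Made-up counts for illustration only; not data from the review.
    print(relative_risk(10, 100, 5, 100))
    print(relative_risk(0, 50, 5, 50))
```

Under this reading, a trial with no events in either arm yields an RR of exactly 1 with a wide interval, so it contributes little weight to the pooled estimate.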
A p value less than 0.05 (two-sided) was considered significant. Analyses were carried out in Stata 15 (Stata-Corp, College Station, TX, USA).
Out       , displaced intra-articular calcaneus (n = 6), scaphoid waist (n = 6), and proximal humerus (n = 6) fractures were the fractures most commonly investigated. Trials were carried out across 24 different countries, with the UK (n = 11), Sweden (n = 9), and the USA (n = 6) being the most common. Age and gender distribution varied depending on the fracture type. Table 1 presents the characteristics of the included trials.
As only one trial with relevant data was available for humeral shaft, malleolar (trimalleolar, unstable (uni-, bi-, or trimalleolar), stable lateral malleolar), tibia (shaft), and ulnar (olecranon and shaft) fractures, respectively, only 12 fracture types in 11 body sites were evaluated in meta-analyses. See Figs. 2, 3, 4, and 5 for the number of trials and patients included in the meta-analyses within each of the fracture types for each of the outcomes.

Synthesis of results
The results of the meta-analytic syntheses for each of the fracture types separately are presented in Fig. 2 (pain), Fig. 3 (function), and in Fig. 4 (quality of life). Additional file S2 presents the full forest plots for all comparisons.
One trial on trimalleolar ankle fractures (n = 65) [52] and one trial on tibial shaft fractures (n = 53) [75] also demonstrated a significant effect for function in favor of surgery.   risk of bias, mainly due to the lack of possibility to blind patients and treatment providers, and lack of preregistration of the trial in a public trial registry before enrolment of the first patient.

Synthesis of results
The syntheses of the results are presented in Fig. 5 (SAEs), and in Additional file S2 (deaths and the full forest plot for SAEs).
One trial on unstable malleolar fractures (n = 592) [50] and one trial on humeral shaft fractures (n = 96) [42] demonstrated fewer SAEs in the surgical compared to the non-surgical group.
There were no differences between surgical and non-surgical treatment in the risk of death for any of the fracture types.

Risk of bias
Additional file S3 presents the risk of bias assessment for the individual trials.
Overall, the risk of bias associated with the assessment and reporting of SAEs and death was moderate to high. Only two trials [20,53] had a score greater than 9 indicating a low risk of bias.

Discussion
We found a difference in function in favor of surgery for displaced intra-articular calcaneal fractures (moderate effect, although with large heterogeneity due to a small (n = 30), old study) and for distal radial fractures (small effect), the latter with an increased risk of SAEs after surgery. No difference in effect was demonstrated for displaced midshaft clavicular fractures, proximal humeral fractures, scaphoid waist fractures, and thoracolumbar traumatic compression fractures, while surgery for clavicular fractures was associated with a reduced risk of SAE. Insufficient data existed for all other fracture types.
The large inconsistency and often missing reporting of SAEs and death in the included trials represent a limitation of our study. The lack of consensus in terms and definitions of complications after treatment of fractures calls for the development and validation of a core set of complications [81]. Another potential limitation of this study relates to our selection of outcomes, as 39 trials were excluded due to insufficient data. Some of the trials had selected composite scores of, e.g., pain and function or other outcomes like time to healing of the fracture, while others did not report data that could be included in meta-analyses, e.g., by reporting pain evaluated on a 5-point Likert scale. For feasibility reasons, we excluded trials that were not in languages understood by any of the authors, which could be a potential bias. However, as only two trials were excluded based on this criterion, the expected impact on the results is considered minimal. Finally, from a clinical point of view, it is common to decide on whether to recommend surgery or not based not only on the fracture type, but also on patient characteristics such as age, work status, and symptom severity. In pragmatic trials, patients are more commonly included without accounting for patient characteristics, which thereby can potentially affect the generalizability of the results from the individual meta-analyses of this study [63]. Although our results could indicate that non-surgical treatment is as effective as surgical treatment for several traumatic fractures in adults, including displaced midshaft clavicular, proximal humeral, scaphoid waist, and thoracolumbar traumatic compression fractures, serious caveats relating to the number of patients studied, heterogeneity and study methodology question the confidence in such a suggestion. First, only 7/19 fracture types had been scrutinized in at least 2 trials with at least 100 patients in total.
Second, few and underpowered studies for some fracture types might be part of the explanation for our findings [82], as a previous study found a mean overall study power (1-beta) among 117 trials of traumatic skeletal fractures of 25% [83]. Third, none of the included trials were associated with a low risk of bias for benefits, and only 2/44 (5%) trials were associated with a low risk of bias for SAEs, confirming a previous study summarizing orthopedic trials [82]. In fact, 17/52 (33%) of the trials with data on benefits were associated with a high risk of bias. Finally, the studied fracture types only represent selected types of fractures in selected types of patients. For some fractures (e.g., clavicular and stable lateral malleolar fractures), the natural history of healing without surgical treatment has a good prognosis [84][85][86]. However, in older persons with lower expectations of function with, e.g., a distal radius or malleolar fracture and more osteoporotic bone, the expected beneficial effect from surgical treatment is typically less than in younger more physically active patients. Thus, some of the studies included represent fracture types suspected to have limited benefits in terms of pain, function, and quality of life from surgical treatment. Other fracture types more obviously in need of surgery (displaced lower arm or hip fractures) are less likely to be subjected to randomization to non-surgical treatment; often termed parachute trials [87].

| Trial | Randomization process | Deviations from intended interventions | Missing outcome data | Measurement of the outcome | Selection of the reported result | Overall |
|---|---|---|---|---|---|---|
| Abbaszadegan, 1990 | Some concern | Some concern | Low risk | Some concern | Some concern | High risk |
| Agren, 2013 | Low risk | Low risk | Low risk | Some concern | Some concern | Some concern |
| Ahrens, 2017 | Low risk | Some concern | Low risk | Some concern | Some concern | Some concern |
| Arora, 2007 | Some concern | Low risk | Some concern | Some concern | Some concern | High risk |
| Arora, 2011 | Low risk | Low risk | Low risk | Some concern | Some concern | Some concern |
| Azzopardi, 2005 | Some concern | Low risk | Some concern | Some concern | Some concern | High risk |
| Boons, 2012 | Low risk | Low risk | Low risk | Some concern | Some concern | Some concern |
| Buckley, 2002 | Low risk | Low risk | High risk | Some concern | Some concern | High risk |
| Chen, 2011c | Some concern | Low risk | Low risk | Some concern | Some concern | Some concern |
| Clementson, 2015 | Low risk | Some concern | High risk | Some concern | Some concern | High risk |
| Dias, 2005 | Low risk | Low risk | Some concern | Some concern | Some concern | Some concern |
| Duckworth, 2017 | Low risk | Low risk | Some concern | Some concern | Some concern | Some concern |
| Fjalestad, 2014 | Low risk | Low risk | Low risk | Some concern | Some concern | Some concern |
| Földhazy, 2010 | Low risk | Some concern | Some concern | Some concern | Some concern | High risk |
| Griffin, 2014 | Low risk | Low risk | Low risk | Some concern | Low risk | Some concern |
| Hussain, 2017 | Some concern | Some concern | Some concern | Some concern | Some concern | High risk |
| Ibrahim, 2007 | High risk | Some concern | High risk | Some concern | Some concern | High risk |
| Judd, 2009 | Low risk | Low risk | Some concern | Some concern | Some concern | Some concern |
| Karladani, 2000 | Some concern | High risk | Some concern | Some concern | Some concern | High risk |
| Koch, 2008 | Some concern | Some concern | Some concern | Some concern | Some concern | High risk |
| Kreder, 2006 | Low risk | Low risk | Some concern | Some concern | Some concern | Some concern |
| Kumar, 2018 | High risk | Some concern | Some concern | Some concern | Some concern | High risk |
| Lee, 2016 | Some concern | Some concern | Low risk | Some concern | Some concern | High risk |
| Makwana, 2001 | Low risk | Some concern | Some concern | Some concern | Some concern | High risk |
| Marasco, 2013 | Low risk | Low risk | Low risk | Some concern | Some concern | Some concern |
| Matsunaga, 2017 | Low risk | Some concern | Some concern | Some concern | Low risk | Some concern |
| McKee, 2007 | Low risk | Low risk | Some concern | Some concern | Some concern | Some concern |
| Mirzatolooei, 2011 | Low risk | Some concern | Some concern | Some concern | Some concern | High risk |
| Mittal, 2017 | Low risk | Low risk | Some concern | Some concern | Low risk | Some concern |
| Nouraei, 2011 | Some concern | Low risk | Some concern | Some concern | Some concern | High risk |
| Olerud, 2011a | Low risk | Low risk | Low risk | Some concern | Some concern | Some concern |
| Olerud, 2011b | Low risk | Low risk | Low risk | Some concern | Some concern | Some concern |
| Piazzolla, 2011 | Some concern | Low risk | Low risk | Some concern | Some concern | Some concern |
| Qvist, 2018 | Low risk | Low risk | Some concern | Some concern | Low risk | Some concern |
| Rangan, 2015 | Low risk | Some concern | Low risk | Some concern | Low risk | Some concern |
| Robinson, 2013 | Low risk | Some concern | Low risk | Some concern | Some concern | Some concern |
| Salai, 2000 | High risk | High risk | Some concern | Some concern | Some concern | High risk |
| Sanders, 2012 | Low risk | Low risk | Low risk | Some concern | Some concern | Some concern |
| Shen, 2001 | Some concern | High risk | Some concern | Some concern | Some concern | High risk |
| Siebenga, 2006 | Some concern | Low risk | Low risk | Some concern | Some concern | Some concern |
| Sletten, 2015 | Low risk | Low risk | Low risk | Some concern | Low risk | Some concern |
| Smekal, 2009 | Low risk | Low risk | Low risk | Some concern | Some concern | Some concern |
| Tamaoki, 2017 | Low risk | Low risk | Some concern | Some concern | Some concern | Some concern |
| Thordarson, 1996 | Low risk | Low risk | Some concern | Some concern | Some concern | Some concern |
| Vinnars, 2008 | Low risk | Low risk | Low risk | Some concern | Some concern | Some concern |
| Virtanen, 2012 | Low risk | Low risk | Some concern | Some concern | Some concern | Some concern |
| Willet, 2016 | Low risk | Low risk | Low risk | Some concern | Low risk | Some concern |
| Woltz, 2017 | Low risk | Some concern | Some concern | Some concern | Low risk | Some concern |
| Wong, 2010 | Low risk | Low risk | Some concern | Some concern | Some concern | Some concern |
| Wood, 2003 | Low risk | Low risk | Some concern | Some concern | Some concern | Some concern |
| Wu, 2018 | Low risk | Low risk | Low risk | Some concern | Some concern | Some concern |
| Zyto, 1997 | Low risk | Low risk | Some concern | Some concern | Some concern | Some concern |

Study quality was assessed for risk of bias using the Risk of Bias 2.0 tool from the Cochrane Collaboration on trials with results on patient-reported pain, physical function, and/or quality of life [14]. If four or five of the individual domains were found to be associated with some concerns of risk of bias, or if one of them was associated with high risk of bias, the overall risk of bias was rated as high risk.

Despite the mentioned limitations of the SAE reporting, some interesting findings are worth mentioning as our study presents the first overview of SAEs across RCTs of different fractures. While the risk of SAEs was lower from surgical treatment in displaced midshaft clavicular fracture, it was higher in distal radius fractures, and no difference was present for the other six comparisons with the estimated relative risk of SAEs distributed relatively evenly on both sides of the "no difference in risk" line, dependent on the fracture type. Importantly, most of the findings were based on 2-3 studies, including few patients, precluding any firm conclusions. However, our results do suggest that for some of the more often studied fracture types, like displaced midshaft clavicular fractures, distal radius fractures in older patients, proximal humerus fractures, and traumatic thoraco-lumbar compression fractures, non-surgical treatment might serve as an equally effective and safe treatment as surgical treatment.
Only 20% of the most commonly performed orthopedic procedures, including surgery for fractures, are supported by at least one low risk of bias trial [88]. A search of trials of surgical and non-surgical treatment of fractures in the WHO International Clinical Trials Registry Platform [89] indicates that several ongoing trials will provide data to help build the evidence base for optimal treatment of fractures. Our study is a call to action for more low-risk-of-bias trials powered to detect any difference in benefits and harms between surgical and non-surgical treatment of the most common traumatic skeletal fractures in adults. Although such studies are known to be challenging [90], they are crucial to improve the clinical care of the patients.

Conclusion
Of 12 fracture types with data from more than one trial, only two demonstrated a difference in function in favor of surgery (moderate effect for displaced intra-articular calcaneal fractures, although affected by large heterogeneity, and small effect for distal radial fractures), but with greater risk of harms after surgery for radial fractures. We found no difference in effect for displaced midshaft clavicular fractures, proximal humeral fractures, scaphoid waist fractures, and thoracolumbar traumatic compression fractures, while surgery for clavicular fractures was associated with a reduced risk of SAE. Our results also highlight the current paucity of high-quality randomized trials for other common fracture types, considerable heterogeneity for some of the estimates, and risk of bias in a large proportion of available trials.
Additional file 1: S1. Search strategy for Medline. S2. Assessment of quality of harms assessment and reporting of included trials of surgical and non-surgical treatment of fractures. S3. Full forest plots for all comparisons, including deaths.
Abbreviations SAE: Serious adverse event; BMI: Body mass index; SD: Standard deviation; SMD: Standardized mean difference; RR: Relative risk