The risk associated with spinal manipulation: an overview of reviews

Background: Spinal manipulative therapy (SMT) is a widely used manual treatment, but many reviews exist with conflicting conclusions about the safety of SMT. We performed an overview of reviews to elucidate and quantify the risk of serious adverse events (SAEs) associated with SMT.
Methods: We searched five electronic databases from inception to December 8, 2015. We included reviews of any type of study, patient, and SMT technique. Our primary outcome was SAEs. The quality of the included reviews was assessed using A Measurement Tool to Assess Systematic Reviews (AMSTAR). Since there were insufficient data for calculating incidence rates of SAEs, we used an alternative approach: the conclusions regarding the safety of SMT were extracted for each review, and the communicated opinions were judged by two reviewers independently as safe, harmful, or neutral/unclear. Risk ratios (RRs) of a review communicating that SMT is safe when meeting the requirements for each AMSTAR item were calculated.
Results: We identified 283 eligible reviews, but only 118 provided data for synthesis. The most frequently described adverse events (AEs) were stroke, headache, and vertebral artery dissection. Fifty-four reviews (46%) expressed that SMT is safe, 15 (13%) expressed that SMT is harmful, and 49 reviews (42%) were neutral or unclear. Thirteen reviews reported incidence estimates for SAEs, roughly ranging from 1 in 20,000 to 1 in 250,000,000 manipulations. Methodological quality was low, with a median of 4 of 11 AMSTAR items met (interquartile range, 3 to 6). Reviews meeting the requirements for each of the AMSTAR items (i.e. good internal validity) had a higher chance of expressing that SMT is safe.
Conclusions: It is currently not possible to provide an overall conclusion about the safety of SMT; however, the types of SAEs reported can indeed be significant, indicating that some risk is present. High-quality research and consistent reporting of AEs and SAEs are needed.
Systematic review registration: PROSPERO CRD42015030068.
Electronic supplementary material: The online version of this article (doi:10.1186/s13643-017-0458-y) contains supplementary material, which is available to authorized users.


Background
Spinal manipulative therapy (SMT) is a manual treatment where a vertebral joint is passively moved between the normal range of motion and the limits of its anatomic range, though a universally accepted definition does not seem to exist [1]. SMT often involves a high-velocity, low-amplitude thrust, a technique in which the joints are adjusted rapidly, often accompanied by popping sounds [2,3].
The use of SMT dates back to 400 BCE, but over the centuries SMT has alternated between being accepted and abandoned by the medical profession [4]. Today, SMT is included in many guidelines for primary care, such as those for the management of non-specific low back pain [5], and several evidence-based guidelines exist on the practice of SMT [6-10]. SMT is widely used; it has been estimated that 12% of adults in the USA and Canada attend a chiropractor each year, with 80% of the visits involving SMT [11,12], and the use of SMT has been increasing over the past several decades [13]. Various professional groups perform SMT, including chiropractors, osteopaths and manual therapists [14]. SMT is used for a wide range of diseases and conditions, with neck and back pain being frequent indications [13].
As with all interventions, there are risks associated with SMT. Possible harmful outcomes of SMT include, but are not limited to, headache, radiating discomfort and fatigue [18], which are often transient, but also more serious events such as death, stroke, paralysis and fractures [19-22]. What patients define as mild, moderate and major adverse events (AEs) depends on the severity of the pain or symptom, the impact on their function, the duration, and whether other causes for the AEs can be ruled out [23]. Currently, the knowledge about the risk of harms associated with SMT is fragmented, since an enormous amount of literature exists on the topic but with differing conclusions. For instance, two retrospective population-based studies have suggested an association between vertebrobasilar strokes and chiropractic care (which usually involves spinal manipulation), but also a similar association with primary care physician visits [24,25]. Another study concluded that SMT is independently associated with vertebral artery dissection [26]. Thus, uncertainty arises when single studies are reviewed, and there is a need for an overview of the field. To our knowledge, no one has provided a complete overview of what is known about the safety of SMT. Therefore, we performed an overview of reviews to elucidate and quantify the risk of serious adverse events (SAEs) associated with SMT regardless of the indications for the treatment.

Methods
A brief protocol was registered in the International Prospective Register of Systematic Reviews (PROSPERO: CRD42015030068) prior to the initiation of this overview [see protocol in Additional file 1]. This review was reported according to PRISMA harms [27] [see the completed checklist in Additional file 2].

Literature search
We searched the Cochrane Database of Systematic Reviews, the Cochrane Database of Abstracts of Reviews of Effects (DARE), the Cochrane Health Technology Assessment Database (HTA), MEDLINE via PubMed (from 1966) and EMBASE via Ovid (from 1974). The original search was conducted on December 8, 2015 and updated on January 10, 2017, and no date restrictions were used. Our main search terms were spinal adjustment, chiropractic, and manipulation combined with spine, spinal, lumbar, back, neck, cervical, thrust or osteopath, in addition to the MeSH term 'Manipulation, Chiropractic'. Our systematic review filter included the terms Cochrane, CENTRAL, MEDLINE, EMBASE, pubmed, search, systematic review, meta-analysis, comparative effectiveness, indirect and mixed treatment comparison, and systematic literature [see Additional file 3, showing the search strategy used]. References from relevant reviews, overviews of reviews and relevant national clinical guidelines were checked to identify additional relevant reviews.

Study selection
We included official health technology assessment reports and peer-reviewed reviews of studies of any type (including cohorts, case reports, etc.) that examined individuals receiving SMT. We did not require the SMT to fall within a certain definition but relied on the definitions used by the review authors. No restrictions were put on the age, nationality, gender or health status of the population, or on the length of follow-up of the study. The control could be sham, placebo, any or none. At least an abstract in English, Danish, Swedish or Norwegian had to be available. For inclusion in the synthesis, data on AEs were required.
To ensure that the included reviews were conducted in a systematic manner, reviews had to fulfil the following two items from A Measurement Tool to Assess Systematic Reviews (AMSTAR): 'were two or more electronic sources searched?' and 'was the scientific quality of the included studies assessed and documented?' [28,29], as done by other overview authors [30,31]. Since no commonly accepted quality assessment tool exists for case reports, case series, cross-sectional studies or surveys, quality assessments of these study types were not required.
One reviewer (SMN) screened titles and abstracts, and subsequently reviewed full texts to identify relevant reviews for the overview. A second reviewer (MH) was consulted when the basis for decision making was not clear. We contacted authors of studies that could not be retrieved in full text.

Data extraction
The same reviewer (SMN) performed the data extraction, and the same second reviewer (MH) was consulted when the basis for decision making was not clear. When a review included other interventions, we extracted, where possible, only the data for patients receiving SMT.
The primary outcome was SAEs defined as conditions requiring hospital admission (or mortality) [32], and the secondary outcome was any AEs reported. AEs were defined as 'any untoward occurrence that may present during treatment' [32]. If the severity of an AE was not defined in the review, one reviewer (MH) rated the severity of the reported AEs, and when the basis for rating was unclear, another reviewer (HB) was consulted. No attempt was made to contact authors of reviews or primary studies to obtain missing data.
It was pre-specified in our protocol that the AEs and SAEs should be summarized for each review with a subsequent synthesis and meta-analysis. However, the available data on AEs and SAEs were reported too heterogeneously and insufficiently to allow this. Instead, we appraised the communicated opinions of each review concerning the safety of SMT based on their conclusions regarding the AEs and SAEs. This was done by two reviewers independently (SMN, LK), who judged the communicated opinions as either 'safe', 'neutral/unclear' or 'harmful', based on the qualitative impression the reviewers had when reading the conclusions. The reviewers had no opinion about the safety/harmfulness of SMT before commencing the judgements. Cohen's weighted kappa was calculated for the agreement between the reviewers, with a value of 0.40-0.59 indicating 'fair agreement', 0.60-0.74 indicating 'good agreement' and ≥0.75 indicating 'excellent agreement' [33]. Disagreements were resolved by a third reviewer (MH).
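As an illustration of the agreement statistic described above, the following is a minimal sketch of Cohen's weighted kappa for two raters. It is not the R code used in this overview. The ordering of the three categories and the use of linear disagreement weights are assumptions of the sketch, since the weighting scheme is not stated in the text; the function names are hypothetical. The interpretation bands are those given in the text.

```python
# Ordinal order of the three opinion categories (an assumption of this sketch).
CATEGORIES = ["safe", "neutral/unclear", "harmful"]


def weighted_kappa(ratings_a, ratings_b, categories=CATEGORIES):
    """Cohen's weighted kappa for two raters, using linear disagreement weights (assumed)."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must rate the same non-empty set of reviews")
    k = len(categories)
    index = {name: i for i, name in enumerate(categories)}
    n = len(ratings_a)

    # Observed joint proportions: rows = rater A, columns = rater B.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[index[a]][index[b]] += 1.0 / n

    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def weight(i, j):
        # 0 for identical categories, 1 for the two most distant categories.
        return abs(i - j) / (k - 1)

    cells = [(i, j) for i in range(k) for j in range(k)]
    disagreement_observed = sum(weight(i, j) * observed[i][j] for i, j in cells)
    disagreement_expected = sum(weight(i, j) * row[i] * col[j] for i, j in cells)
    if disagreement_expected == 0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return 1.0 - disagreement_observed / disagreement_expected


def agreement_label(kappa):
    """Map a kappa value to the bands quoted in the text [33]."""
    if kappa >= 0.75:
        return "excellent agreement"
    if kappa >= 0.60:
        return "good agreement"
    if kappa >= 0.40:
        return "fair agreement"
    return "below the bands quoted in the text"


if __name__ == "__main__":
    # Hypothetical ratings for five reviews, for illustration only.
    rater_1 = ["safe", "safe", "harmful", "neutral/unclear", "safe"]
    rater_2 = ["safe", "neutral/unclear", "harmful", "neutral/unclear", "harmful"]
    value = weighted_kappa(rater_1, rater_2)
    print(round(value, 2), agreement_label(value))
```

Linear weights penalise a safe/harmful disagreement twice as much as a safe/neutral one; quadratic weights would be an equally defensible choice and would give a different value.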

Quality assessment
One reviewer (SMN) assessed the methodological quality of each review using the AMSTAR tool [28,29]. AMSTAR consists of 11 criteria, each of which was given one of the following ratings: 'yes' (clearly done), 'can't answer' (unclear if completed), 'no' (clearly not done) or 'not applicable'. A second reviewer (MH) was consulted when the basis for decision making was not clear. For each review, we calculated a summary score by awarding one point for each 'yes' [28]. A score of 0-4 is often classified as low quality, 5-8 as moderate quality and 9-11 as high quality [34].
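The scoring rule above can be sketched as follows. This is an illustrative sketch rather than the code used in the overview; the function names and the example ratings are hypothetical, while the rating labels and score bands are those stated in the text.

```python
ALLOWED_RATINGS = {"yes", "can't answer", "no", "not applicable"}


def amstar_score(item_ratings):
    """Summary score: one point for each of the 11 AMSTAR items rated 'yes'."""
    if len(item_ratings) != 11:
        raise ValueError("AMSTAR has 11 items")
    unknown = set(item_ratings) - ALLOWED_RATINGS
    if unknown:
        raise ValueError(f"unexpected ratings: {sorted(unknown)}")
    return sum(1 for rating in item_ratings if rating == "yes")


def quality_band(score):
    """Bands often used for the summary score: 0-4 low, 5-8 moderate, 9-11 high."""
    if not 0 <= score <= 11:
        raise ValueError("score must be between 0 and 11")
    if score <= 4:
        return "low"
    if score <= 8:
        return "moderate"
    return "high"


if __name__ == "__main__":
    # Hypothetical ratings for one review, for illustration only.
    ratings = ["yes"] * 4 + ["no"] * 3 + ["can't answer"] * 2 + ["not applicable"] * 2
    score = amstar_score(ratings)
    print(score, quality_band(score))
```

Note that 'not applicable' and 'can't answer' both score zero under this rule, so a review can receive a low score without having clearly omitted the corresponding steps.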
We did not assess the quality of the evidence presented by each of the reviews. However, if a quality of evidence assessment (such as a GRADE assessment) was reported in the reviews, the approach and result were extracted.

Data analysis
To get an 'objective' measure of our confidence in the subjectively judged communicated opinions, we assessed whether a pattern of communicated opinions could be identified according to the methodological quality of the reviews (i.e. AMSTAR). This was done by calculating, for each AMSTAR item, a risk ratio (RR) of a review communicating the opinion 'safe' when meeting the requirements for that item, and a RR of a review communicating the opinion 'harmful' when meeting the requirements for that item. The decision to conduct this assessment and the subsequent analyses was, however, made post hoc.
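A minimal sketch of such a risk ratio for a single AMSTAR item is shown below. It compares the proportion of reviews communicating 'safe' among reviews meeting the item with the proportion among reviews not meeting it. The confidence interval uses the standard log-scale (Katz) approximation, which is an assumption of this sketch because the interval method is not stated in the text; the function name and the counts in the example are hypothetical.

```python
import math


def risk_ratio(met_safe, met_other, unmet_safe, unmet_other, z=1.96):
    """RR of communicating 'safe' for reviews meeting an item vs. not meeting it.

    Returns (rr, lower, upper), with an approximate 95% interval on the log scale.
    """
    if met_safe == 0 or unmet_safe == 0:
        raise ValueError("the log-scale interval needs non-zero 'safe' counts in both groups")
    n_met = met_safe + met_other
    n_unmet = unmet_safe + unmet_other
    rr = (met_safe / n_met) / (unmet_safe / n_unmet)
    # Standard error of log(RR).
    se = math.sqrt(1 / met_safe - 1 / n_met + 1 / unmet_safe - 1 / n_unmet)
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    return rr, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, for illustration only: 20 of 30 reviews meeting the
    # item communicate 'safe', against 10 of 30 reviews not meeting it.
    rr, lower, upper = risk_ratio(20, 10, 10, 20)
    print(f"RR = {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

An interval that excludes 1 corresponds to a statistically significant association at the 5% level for that item.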
Risk estimates for SAEs reported in the reviews are presented in a separate table, and a matrix was constructed showing which studies the estimates from each review were based on. All statistical analyses were performed using the statistical software R, version 3.2.3 (R Foundation for Statistical Computing).

Results

Study selection
The reviewer screened 2305 records and identified 841 potentially eligible records (Fig. 1). Thirteen authors were contacted regarding studies that could not be retrieved in full text. Twelve authors responded, of whom nine were able to provide full-text versions. Reviewing the full texts resulted in 257 records describing 252 reviews eligible for the overview [see Additional file 4 for a list of the excluded reviews]. From reference lists, we further identified 8 records on 6 eligible reviews. In total, 265 records describing 258 reviews were included in the overview [see Additional file 5 for a list of the 258 included studies]; of these, 110 records describing 104 reviews were included in the synthesis. The updated search resulted in screening of 267 additional records, identifying 68 potentially eligible records. Of these, 26 records describing 25 reviews were eligible for the overview, and 15 records describing 14 reviews were included in the synthesis. In total, 283 reviews were included in the overview, of which 118 reviews were included in the synthesis.
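The review counts reported above can be tallied as a quick consistency check. The sketch below only re-adds numbers stated in the text; the dictionary labels are illustrative.

```python
# Reviews (not records) eligible for the overview, by source, as stated in the text.
overview_reviews = {
    "full-text screening, original search": 252,
    "reference lists": 6,
    "updated search": 25,
}

# Reviews providing data for the synthesis, as stated in the text.
synthesis_reviews = {
    "original search": 104,
    "updated search": 14,
}

total_overview = sum(overview_reviews.values())
total_synthesis = sum(synthesis_reviews.values())

if __name__ == "__main__":
    print(total_overview, total_synthesis)
```

Both totals agree with the 283 reviews in the overview and the 118 reviews in the synthesis.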
The populations most frequently studied were patients with cervical pain, low back pain or headache (based on a word count after categorization by the authors; Table 2). For 81 of the reviews, the main aim was to investigate efficacy (benefit), for 29 of the reviews, the main aim was to investigate AEs, and for the remaining 8, the aim was to investigate both.
A word count of the reported AEs and SAEs showed that the most frequently used term describing AEs/SAEs in the reviews was stroke (counted after categorization by the authors; Table 3). However, it should be noted that a very common subject in the discussion sections was the poor reporting of AEs in the primary studies and the possible risk of underreporting. Thirteen of the reviews reported estimates for the incidence of SAEs, and here too, many of the reviews noted that these were rough estimates [see Table, Additional file 6, which includes the conclusions extracted from each review].

The methodological quality of included reviews
None of the reviews met the requirements for all 11 AMSTAR items (Table 4). The median number of 'yes' ratings was 4 (interquartile range, 3 to 6), with a minimum of 0 and a maximum of 9. Very few reviews had combined the findings on AEs and SAEs (e.g. in a meta-analysis or by other means of synthesis), or had done so in an appropriate way; hence, item 9 was not applicable in most cases. One of the reviews attempted to assess publication bias specifically for AEs. Furthermore, very few reviews rated the quality of the evidence for AEs and/or SAEs; among those that did, GRADE was the most frequently used tool.

Serious adverse events
The estimates for the incidence of SAEs (Table 5) were heterogeneous, as they had different units (e.g. per number of manipulations, per visits or no unit), were based on different patient types, and were obtained from different types of studies [see Table, Additional file 7, showing which studies the estimates for the incidence of SAEs are based on].
When not distinguishing between the different types of SMT, assuming that one treatment or visit equals one manipulation, and leaving out the minority of estimates that did not specify a unit or used per patient as the unit, the estimates for the incidence of SAEs range from 1 in 20,000 manipulations to 1 in 250,000,000 manipulations (Table 6).
Based on the conclusions of the reviews regarding AEs and SAEs, 54 reviews (46%) expressed that SMT is safe, 15 (13%) expressed that SMT is harmful and 49 reviews (42%) were neutral or unclear regarding the safety of SMT, with fair agreement between the two reviewers (Cohen's weighted kappa, 0.50).
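To make the width of the reported incidence range concrete, the two extremes can be expressed on a common scale. The sketch below is simple arithmetic on the two figures stated above; the choice of 'per million manipulations' as the common unit and the function name are illustrative assumptions.

```python
def per_million(one_in_n):
    """Convert an estimate of '1 in N manipulations' to events per million manipulations."""
    if one_in_n <= 0:
        raise ValueError("N must be positive")
    return 1_000_000 / one_in_n


HIGHEST_ESTIMATE_ONE_IN = 20_000        # 1 SAE in 20,000 manipulations
LOWEST_ESTIMATE_ONE_IN = 250_000_000    # 1 SAE in 250,000,000 manipulations

highest_rate = per_million(HIGHEST_ESTIMATE_ONE_IN)
lowest_rate = per_million(LOWEST_ESTIMATE_ONE_IN)
fold_difference = LOWEST_ESTIMATE_ONE_IN / HIGHEST_ESTIMATE_ONE_IN

if __name__ == "__main__":
    print(highest_rate, lowest_rate, fold_difference)
```

The highest estimate corresponds to 50 SAEs per million manipulations and the lowest to 0.004 per million, a 12,500-fold difference, which illustrates why no single estimate can be derived from these reviews.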
The RRs show a higher chance of a review communicating that SMT is safe when it has higher methodological quality, compared with reviews of lower methodological quality (statistically significant for AMSTAR items 5, 7 and 8; Table 7). Conversely, there is a lower chance of a review communicating that SMT is harmful when it has higher methodological quality.

Reviews specifically investigating adverse events
When considering only the subset of reviews whose objective was to investigate AEs (37 reviews), 8 reviews (22%) expressed that SMT is safe, 13 reviews (35%) expressed that SMT is harmful and 16 reviews (43%) were neutral or unclear regarding the safety of SMT. Hence, a larger proportion of these reviews tended to express that SMT is harmful compared with the full sample of reviews. The RR calculations for this subset lacked the power to show any statistically significant associations [see Table, Additional file 8, which shows the calculations of RRs].
The possibility of a causal relationship between SMT and SAEs was specifically investigated in six of the included reviews [89,90,118,124,127,133] (Table 8). Five of these had assessed the likelihood of causality for each case report or case series [89,90,118,124,133]. In none of them was 'certain' the single most used rating. Miley et al. [127] used another approach and concluded that there was weak to moderate strength of evidence for a causal relationship between cervical SMT and vertebral artery dissection, and expressed that comprehensive prospective studies are needed to further examine this relationship.
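The proportions in the full sample and in the AE-focused subset can be reproduced from the counts given in the text. The sketch below is illustrative arithmetic only; the variable and function names are hypothetical.

```python
# Counts of communicated opinions, as stated in the text.
full_sample = {"safe": 54, "harmful": 15, "neutral/unclear": 49}   # 118 reviews
ae_subset = {"safe": 8, "harmful": 13, "neutral/unclear": 16}      # 37 reviews


def percentages(counts):
    """Share of each opinion, rounded to whole percent."""
    total = sum(counts.values())
    return {opinion: round(100 * n / total) for opinion, n in counts.items()}


full_pct = percentages(full_sample)
subset_pct = percentages(ae_subset)

if __name__ == "__main__":
    for opinion in full_sample:
        print(f"{opinion}: {full_pct[opinion]}% of all reviews, "
              f"{subset_pct[opinion]}% of AE-focused reviews")
```

Because each share is rounded separately, the percentages for the full sample sum to 101%. The share of reviews communicating 'harmful' is 13% in the full sample and 35% in the AE-focused subset.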

Discussion
In this overview, the included reviews did not provide sufficient data for synthesis, and therefore it is currently not possible to provide an overall estimate for the risk of SAEs associated with SMT. Of the few reviews providing estimates for the incidence of SAEs, no reliable single estimate was provided, and it was not possible to identify any agreement regarding the safety of SMT across the included reviews. Interestingly, we found indications that reviews with higher methodological quality generally used language suggesting SMT to be safer (or less harmful). However, when analysing this across the reviews whose objective was to investigate safety, this could not be replicated. In the few reviews assessing the likelihood of a causal relationship between SMT and SAEs, this relationship was not in all cases certain. However, it should be noted that these assessments were based on case reports and case series, which cannot determine causality.
This overview is, to our knowledge, the most comprehensive overview conducted on SMT, including more than 100 reviews, and the only one focused solely on the safety of SMT. Our intention was to provide an overview of all SAEs from SMT regardless of the indications for the treatment, but our overview especially covers patients with cervical pain, low back pain and headache, which were the most frequently studied populations. The most frequently mentioned AEs/SAEs across the 118 reviews ranged from minor events, such as soreness, to significant events, such as spinal cord injury and death. While some of these events may to a large extent be unpredictable [155] and have a major impact on not only the individual but also the SMT provider and society, it is not possible to ascertain the risk-benefit balance based on the current evidence [156]. We strongly encourage efforts to illuminate the risk-benefit ratio reliably, since this would be of value when comparing SMT with other treatment options. Some of our included reviews indicate that NSAIDs involve a substantially higher risk of SAEs (including death) than SMT [114,150], but they did not take into account the possible benefits.
Table 2 The patient populations most frequently studied in the included reviews (listed by frequency, shown in brackets)
Table 3 The terms describing the adverse events and serious adverse events most frequently used in the reviews (listed by frequency, shown in brackets)
Table 4 Methodological quality of included reviews assessed with AMSTAR (Continued)
2006 Snelling N. J. [135] No
From a SR: 1 additional disc herniation or CES in 3.7 million manipulations (in patients with lumbar disc herniation).
They compare the incidence rates with NSAID consumption (0.39-3.2 serious gastrointestinal events in 1000 subjects) and cervical spine surgery (15.6 neurologic complications (spinal cord or nerve root injury, recurrent laryngeal nerve palsy, dural leak, and injury to cervical sympathetic nerve trunk (Horner's syndrome)) in 1000 surgeries and 6.9 deaths in 1000 surgeries).
General limitations of overviews are that recently published primary studies or studies not included in reviews cannot be included, that the included reviews may overlap, and that overviews rely on the methodological quality of the included reviews, which in turn rely on the methodological quality of the primary studies [157]. Considering the low methodological quality of the included reviews, the communicated opinions could possibly be influenced by the background of the authors [158], and by a lack of independence between the reviews, i.e. several reviews were written by the same author. A major limitation of this overview was the limited data on AEs and SAEs, which hindered a synthesis. At the level of reviews, AEs are poorly reported [159]; however, even high-quality reviews may fail to provide reliable estimates due to poor reporting in the primary studies, and this was frequently highlighted in the discussions of the included reviews. In primary studies, underreporting may be expected for retrospective studies or poorly controlled prospective studies. Including only RCTs would provide an insufficient population size for detecting SAEs reliably, and it has been shown that even in RCTs, AEs and SAEs are poorly reported [126,160] and underreported [96,161]. Gorrell et al. [162] found that out of 368 RCTs on SMT, only 140 (38%) reported on AEs. This underreporting will directly affect the reviews including these studies, resulting in an underestimation of the risk. On the other hand, over-reporting may be present, since the different study types (ranging from case reports to RCTs) provide various levels of evidence, and therefore confounding and chance cannot be ruled out as possible explanations for some of the observed SAEs associated with SMT.
Our methodological approach has limitations too. Our inclusion criteria were slightly heterogeneous across reviews. We relied on the definitions of SMT used by the review authors, which varied between the reviews. Some of the reviews mixed SMT with other interventions under a common category such as 'manual treatment' or 'manipulation' without reporting on only the SMT subgroup. Even when authors describe interventions as SMT, these may not always include high-velocity, low-amplitude thrusts. In that case, the intervention is less likely to result in SAEs, which may influence their and our conclusions about safety by making (high-velocity, low-amplitude thrust-type) SMT appear safer. Further, we did not require a quality assessment to have been conducted for case reports, case series, cross-sectional studies and surveys, which may have facilitated the inclusion of reviews including only these types of studies. Our judgements regarding the opinions expressed in the reviews were not based on any criteria but on subjective interpretation, and are therefore not reproducible, even though there was fair agreement between the reviewers. Other limitations include the absence of duplicate study selection, data extraction and quality assessment, and a very brief protocol. These methodological compromises were made because of limited time resources. However, our search strategy was broad, and we applied a thorough study selection, making us confident that we have identified the vast majority of the relevant scientific literature on SMT, and we find it unlikely that more thorough study selection and extraction procedures would result in different conclusions.
[114] Their own summarisation: 0.5-2 strokes in one million cervical manipulations performed, 1 serious vascular complication in 100,000 patients who undergo a course of treatment (10-15 sessions of cervical manipulation over the course of a year) with cervical manipulation, or 0.001%, 1 death in 400,000 patients treated, or an 'overall death rate of 0.0025% per course of treatment for patients with neck pain who are treated with cervical manipulation.' They compare this with a risk of 0.4% for getting serious gastrointestinal ulcers requiring hospitalization because of NSAID use, and a risk of 0.04% for death from gastrointestinal bleeding caused by NSAID use.
Their own calculation based on insurance company data: <1 stroke in 2 million cervical manipulations. From surveys: 1 serious complication in 400,000 cervical manipulations (no reported deaths), 1 complication in 518,000 manipulations, 1 stroke in 500,000 cervical manipulations, no serious incidence in >500,000 manipulations, 2-3 'more-or-less serious incidents' in one million treatments. From reports: no vertebral artery injury or stroke in 5 million cervical manipulations, no significant complications in 168,000 cervical manipulations. From a review: 1-2 strokes in one million manipulations.
CC case-control study, CES cauda equina syndrome, CMT cervical manipulative therapy, CVA cerebrovascular accident, LDH lumbar disc herniation, NSAID nonsteroidal anti-inflammatory drug, pCohort prospective cohort study, RCT randomized controlled trial, SAE serious adverse event, SMT spinal manipulative therapy, SR systematic review, VAD vertebral artery dissection, VBA vertebrobasilar accident

Conclusions
This overview has demonstrated how extensive the literature on SMT is. Unfortunately, the majority of reviews are non-systematic and of poor quality. The available evidence showed a broad range of communicated opinions and highly variable estimates of SAE incidence. Reviews with fewer methodological flaws typically communicated that SMT may be safe; however, the methodological quality was in general low and the included reviews were very heterogeneous. Furthermore, for the subset of reviews whose objective was to investigate safety, this could not be replicated. Research of high quality, with sufficient sample size and an appropriate comparison group, is needed to obtain reliable risk estimates. Furthermore, reviews suggested that a causal relationship between SMT and SAEs was

Funding
This work was supported by the Association of Danish Physiotherapists and by The Oak Foundation. The Parker Institute, Bispebjerg and Frederiksberg Hospital is supported by a core grant from the Oak Foundation (OCAY-13-309). The funders had no role in the study design; in the collection, analysis, and interpretation of data; in the writing of the report; or in the decision to submit the article for publication.

Availability of data and materials
The majority of the data generated and/or analysed during this study are included in this published article and its supplementary information files; the remaining data are available from the corresponding author on reasonable request.

Authors' contributions
SMN, ST, RC, HB and MH contributed to the design of this overview. SMN performed the study selection, data extraction, and risk of bias assessment, assisted by MH and LK. SMN, MH and RC analysed and interpreted the data. SMN wrote the first draft of the paper. All authors have read and approved the final manuscript.

Competing interests
MH is a member of the Association of Danish Physiotherapists, which could benefit from this publication; there are no other relationships or activities that could appear to have influenced the submitted work.

Consent for publication
Not applicable.

Ethics approval and consent to participate
Not applicable.