Comparative effectiveness of immunosuppressive drugs and corticosteroids for lupus nephritis: a systematic review and network meta-analysis

Background There is a lack of high-quality meta-analyses and network meta-analyses of immunosuppressive drugs for lupus nephritis. Our objective was to assess the comparative benefits and harms of immunosuppressive drugs and corticosteroids in lupus nephritis. Methods We conducted a systematic review and network meta-analysis (NMA) of trials of immunosuppressive drugs and corticosteroids in patients with lupus nephritis. We calculated odds ratios (OR) and 95 % credible intervals (CrI). Results Sixty-five studies met inclusion and exclusion criteria; data were analyzed for renal remission/response (37 trials; 2697 patients), renal relapse/flare (13 studies; 1108 patients), amenorrhea/ovarian failure (eight trials; 839 patients) and cytopenia (16 trials; 2257 patients). Cyclophosphamide [CYC] low dose (LD) and CYC high dose (HD) were less likely than mycophenolate mofetil [MMF] to achieve renal remission/response, and azathioprine [AZA], CYC LD, CYC HD and plasmapheresis were less likely than cyclosporine [CSA] to do so. Tacrolimus [TAC] was more likely than CYC LD to achieve renal remission/response. MMF and CYC were associated with lower odds of renal relapse/flare compared to prednisone [PRED], and MMF was associated with a lower rate of renal relapse/flare than AZA. CYC was more likely than MMF and PRED to be associated with amenorrhea/ovarian failure. Compared to MMF, CYC, AZA, CYC LD, and CYC HD were associated with a higher risk of cytopenia. Conclusions In this systematic review and NMA, we found important differences between immunosuppressives used for the treatment of lupus nephritis. Patients and physicians can use this information for detailed informed consent in a patient-centered approach. Study limitations of between-study clinical heterogeneity and small sample size with type II error must be considered when interpreting these findings.
Systematic review registration PROSPERO: CRD42016032965 Electronic supplementary material The online version of this article (doi:10.1186/s13643-016-0328-z) contains supplementary material, which is available to authorized users.


Background
Lupus is a chronic autoimmune disease that frequently involves the kidney. Lupus nephritis can lead to kidney failure, dialysis, and even premature death if not treated appropriately. Lupus primarily affects young women and is more common and more severe in racial/ethnic minorities, who experience worse outcomes [1][2][3][4][5].
Comparative effectiveness research (CER) of drugs used to treat lupus nephritis is an imperative [6]. Patients facing difficult treatment decisions, such as those arising in life- and/or organ-threatening clinical situations (active lupus nephritis, for example), need information about the possible harms and benefits of available treatment options in a format that allows multiple options to be compared.
The 2012 American College of Rheumatology (ACR) lupus nephritis treatment guidelines [7] and the Cochrane systematic review of interventions for lupus nephritis [8] assessed literature up to 2010 and 2012, respectively. However, indirect comparisons were not performed in either. Few lupus nephritis treatments have been compared directly in clinical trials. This leaves a large knowledge gap. Clinicians and patients have to choose between various immunosuppressive drugs in the absence of such knowledge. Therefore, we need evidence synthesis using valid methods to incorporate indirect and direct comparisons of efficacy/harms of these treatments.
A state-of-the-art network meta-analysis (NMA), with updated information, is a necessary precursor to the development of a clinical decision-making tool for physicians and their patients with lupus nephritis. This information can be very helpful to patients during the treatment decision-making process for new disease, disease flare or refractory disease. It is not surprising that for a rare condition such as lupus, with roughly 161,000 patients in the USA [9], most multicenter trials have <500 patients (often 50-200 patients). This makes many trials underpowered for assessing treatment-related differences in disease outcomes.
One useful approach is the use of composite outcomes, which have been widely used to address important clinical questions in obstetrics, cardiology, and other disciplines [10][11][12]. Assessments of treatment options using composite outcomes can help answer important questions in a timely fashion without requiring studies with large sample sizes. This study aimed to perform a systematic review and NMA to compare the benefits and harms of immunosuppressive drugs with each other and with corticosteroids, focusing on four composite benefit/harm outcomes: (1) renal remission/response; (2) renal relapse/flare; (3) ovarian failure/amenorrhea; and (4) bone marrow toxicity.

Methods
We used rigorous methods for the systematic review and NMA based on the Agency for Healthcare Research and Quality (AHRQ) recommendations [13], the Cochrane handbook [14], and the PRISMA guidelines. The Institutional Review Board at the University of Alabama at Birmingham (UAB) approved the study. The need for informed consent was waived for this systematic review, since no human subjects were involved. The study protocol was registered in PROSPERO, CRD42016032965 (http://www.crd.york.ac.uk/PROSPERO/). Belimumab studies could not be included in this systematic review since these studies enrolled patients with lupus in general, and only a small proportion had active lupus nephritis. A Cochrane systematic review of belimumab for lupus is underway [15]. There were no restrictions with regard to the medication dose or the duration of medication use.
Experienced librarians (JJ and TR) updated two systematic reviews [7,8]. Raw data abstracted for the ACR lupus nephritis guidelines systematic review [7] were obtained (courtesy Dr. Jennifer Grossman (JG), see acknowledgment section), or were abstracted from the RevMan tables of the Cochrane Systematic Review [8]. A librarian (CH) also performed a search for all lupus trials for harms (for conditions other than lupus nephritis) in PubMed and SCOPUS from inception to February 2014, based on an a priori assumption that treatment-related harms may not depend on whether the kidney is involved. Examination of data from this search revealed little additional data on harms, but added clinical heterogeneity related to differences in patient population. Therefore, after careful consideration of pros and cons, we decided not to use these data in analyses.
The PICO (patient, intervention, comparator, outcome) for our systematic review and NMA were defined as follows:
P: Adults 18 years or older, meeting the 1987 American College of Rheumatology Classification criteria for systemic lupus erythematosus (SLE) [16], who have lupus nephritis.
I: Immunosuppressant drug alone or in combination with other immunosuppressant drugs or biologics (such as rituximab) or corticosteroids. Medication doses were categorized as low, standard or high dose/duration (LD, SD and HD).
C: Placebo or another immunosuppressive with/without biologic.
O: Benefit and harm outcomes (renal remission/response, renal relapse/flare, fertility, bone marrow suppression), defined as follows.
Two trained abstractors (AO, AB) independently reviewed abstracts and titles, abstracted data in duplicate directly into Microsoft Excel sheets and assessed the risk of bias according to the Cochrane risk of bias tool [21]. We rated the following domains as low, high, or unclear risk of bias (unclear indicating a lack of information or uncertainty about the potential for bias): randomization sequence generation; allocation sequence concealment; blinding of participants, personnel and outcome assessors; incomplete outcome data (primary outcome data reporting, dropout rates and reasons for withdrawal, appropriate imputation of missing data, an overall completion rate ≥80 %); selective outcome reporting; and other potential threats to validity (considering external validity, e.g., relevant use of co-interventions, bias due to funding source). An adjudicator (JS) resolved any disagreements not resolved by consensus. Before evidence synthesis, an expert rheumatologist (JS) and an expert in lupus (JG) assessed the similarity of study populations and interventions across studies.

Bayesian network meta-analysis (NMA)
We used Bayesian mixed treatment comparison (MTC) meta-analyses [22][23][24] to assess the comparative benefits and harms of treatments [25,26]. We conducted random-effects NMA and assessed model fit and the choice of model (random vs. fixed effects) based on the deviance information criterion (DIC) and the comparison of residual deviance to the number of unconstrained data points [25,27]. We assigned vague priors, such as N(0, 100²), for basic parameters throughout [25] and informative priors for the variance parameter based on Turner et al. [28]. We evaluated model diagnostics, including trace plots and the Brooks-Gelman-Rubin statistic, to ensure model convergence [25,29]. We fit three chains in WinBUGS for each analysis, with 40,000 iterations and a burn-in of 40,000 iterations [29,30]. Both MTC and traditional meta-analysis require studies to be sufficiently similar in order to pool their results. We investigated heterogeneity, where warranted, with subgroup analyses and meta-regressions [26,31]. We examined consistency-inconsistency plots for evidence of inconsistency, and chose the appropriate model for our analyses. We obtained point estimates as odds ratios (OR) with 95 % credible intervals (CrI) using Markov Chain Monte Carlo (MCMC) methods. The OR was transformed to relative risk (RR) and risk difference for ease of interpretation by clinicians and patients. The quality of evidence was assessed as recommended in a recent study [32].
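The transformation from OR to RR and risk difference mentioned above can be illustrated with a minimal sketch. It assumes the standard identity linking an odds ratio to a relative risk through the comparator-arm (baseline) risk; the function names and the baseline risk in the example are hypothetical and are not taken from the study, which may have applied the transformation differently (for example, to posterior draws).

```python
def or_to_rr(odds_ratio: float, baseline_risk: float) -> float:
    """Convert an odds ratio to a relative risk.

    baseline_risk is the assumed event probability in the comparator arm.
    Identity used: RR = OR / (1 - p0 + p0 * OR).
    """
    if odds_ratio <= 0.0:
        raise ValueError("odds_ratio must be positive")
    if not 0.0 < baseline_risk < 1.0:
        raise ValueError("baseline_risk must lie strictly between 0 and 1")
    return odds_ratio / (1.0 - baseline_risk + baseline_risk * odds_ratio)


def risk_difference(odds_ratio: float, baseline_risk: float) -> float:
    """Absolute risk difference (treatment minus comparator) implied by an OR."""
    rr = or_to_rr(odds_ratio, baseline_risk)
    # Treatment-arm risk is p0 * RR, so the difference is p0 * (RR - 1).
    return baseline_risk * (rr - 1.0)


# Hypothetical illustration: an OR of 0.40 with an assumed 50 % comparator risk.
rr = or_to_rr(0.40, 0.50)
rd = risk_difference(0.40, 0.50)
print(f"RR = {rr:.2f}, RD = {rd:+.2f}")
```

Because the result depends on the baseline risk, the same OR maps to different RR and risk difference values for outcomes with different underlying event rates.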
Sensitivity analysis was performed by limiting analyses to partial/complete remission rather than combining this with renal response for the composite renal remission/response. We constructed staircase diagrams, another pictorial way to see comparisons of various treatments to each other. Rankograms were constructed to model the probabilities of the treatment rankings, representing the best to the last ranks.
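As a rough illustration of how rank probabilities of the kind plotted in a rankogram can be derived, the sketch below ranks treatments within each posterior draw and tabulates how often each treatment takes each rank. It assumes posterior draws are available as an array of shape (draws, treatments); the function name and the synthetic draws are hypothetical, and this is not the WinBUGS code used in the study.

```python
import numpy as np


def rank_probabilities(samples: np.ndarray, higher_is_better: bool = True) -> np.ndarray:
    """Estimate rank probabilities from posterior draws.

    samples: array of shape (n_draws, n_treatments), one effect value per
             treatment per draw.
    Returns an (n_treatments, n_treatments) array in which entry [t, r] is the
    share of draws where treatment t had rank r + 1 (rank 1 = best).
    """
    n_draws, n_treatments = samples.shape
    key = -samples if higher_is_better else samples
    # Applying argsort twice yields each treatment's rank within a draw (0 = best).
    ranks = np.argsort(np.argsort(key, axis=1), axis=1)
    probs = np.zeros((n_treatments, n_treatments))
    for t in range(n_treatments):
        probs[t] = np.bincount(ranks[:, t], minlength=n_treatments) / n_draws
    return probs


# Synthetic draws for three hypothetical treatments (not study data).
rng = np.random.default_rng(0)
draws = rng.normal(loc=[0.0, 0.5, 0.2], scale=0.3, size=(5000, 3))
print(rank_probabilities(draws).round(2))
```

Each row of the returned matrix corresponds to one treatment's curve in a rankogram. For an undesired outcome such as relapse, `higher_is_better=False` would place the treatment with the lowest value at rank 1.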

Study characteristics
Sixty-five studies met inclusion and exclusion criteria; these included CYC, MMF, AZA, calcineurin inhibitors (cyclosporine, tacrolimus), rituximab, corticosteroids, plasmapheresis, or leflunomide (Fig. 1). Additional file 2 shows the search strategy. An additional file shows the PRISMA checklist (see Additional file 3). Network diagrams for all outcomes are shown in Fig. 2. Most studies (88 %) compared induction or induction and maintenance regimens. An additional file shows this in more detail (see Additional file 4). The study sample size ranged from 10 to 370. Of these, 32 % of the studies were conducted in the USA and 43 % were multicenter.
A detailed risk of bias assessment using the Cochrane risk of bias tool is provided in Table 1. Randomization was low-risk in 56 %, unclear in 39 % and high-risk in 5 % of trials (Table 1). Most trials were low-risk for blinding of assessor (59 %), blinding of participant (54 %), and intention to treat (57 %). On the other hand, only 38 % of trials were low-risk for allocation concealment, and the risk was unclear in 59 %.
Although some clinical heterogeneity was detected between trials overall, we did not notice any clinically significant systematic differences in patient populations or disease stages between various medications.

Treatment efficacy: complete/partial renal remission/response
Thirty-seven trials with 2697 patients provided data for the composite outcome, partial or complete renal remission or renal response. Two trials were excluded since they had variable duration of treatments based on response to initial treatment; they were also associated with high standard errors and wide CrI, leading to problems with convergence of the model when included. There were 34 two-arm and three three-arm trials. Table 2 shows only the significant odds ratios, relative risks and risk differences, and an additional file shows all comparisons in more detail (see Additional file 5). CYC, MMF, CSA, and TAC were superior to corticosteroids alone in achieving renal remission/response (Table 2). CYC low dose (LD) was less likely than MMF, TAC, CSA, and CYC, and CYC HD was less likely than MMF and CSA, to achieve renal remission/response. CSA was more likely than plasmapheresis and azathioprine to achieve renal remission/response (Table 2). The quality of evidence was rated as moderate (downgraded for imprecision). Absolute event rates ranged from 28 to 75 % and are shown in more detail in an additional file (see Additional file 6).

Treatment failure: renal relapse/renal flare
Thirteen studies with 1108 patients provided data; 11 were two-arm and two were three-arm studies. MMF and CYC were associated with a lower rate of renal relapse/flare compared to PRED, and MMF was associated with a lower rate of renal relapse/flare than AZA (Table 3). The quality of evidence was rated as moderate (downgraded for imprecision). The event rates ranged from 14 to 49 % and are shown in more detail in an additional file (see Additional file 6).

Amenorrhea/ovarian failure
Eight RCTs with 839 patients provided data; seven were two-arm and one trial was a three-arm trial. CYC was more likely than MMF and PRED to be associated with amenorrhea/ovarian failure (Table 4). CYC LD was associated with higher risk of amenorrhea/ovarian failure than MMF. The quality of evidence was rated as moderate (downgraded for imprecision). Absolute event rates ranged from 8 to 61 % and are shown in more detail in an additional file (see Additional file 6).

Bone marrow toxicity: cytopenia (including leucopenia)
Sixteen trials provided data on 2257 patients: 15 were two-arm and one was a three-arm trial. Compared to MMF, several immunosuppressive drugs were associated with a higher risk of cytopenia: CYC, AZA, CYC LD, and CYC HD (Table 5). The quality of evidence was rated as moderate (downgraded for imprecision). Absolute event rates ranged from 7 to 30 % and are shown in more detail in an additional file (see Additional file 6).

Footnote to Table 3 (renal relapse/flare): Based on 13 RCTs with 1108 patients: 11 two-arm trials and two three-arm trials. Significant odds ratios are in italics. For absolute rates for events used for calculation of risk difference, please see Appendix 6. OR odds ratio, RR relative risk, RD risk difference, CrI credible interval, CYC cyclophosphamide, MMF mycophenolate mofetil, CSA cyclosporine, TAC tacrolimus, LEF leflunomide, PRED prednisone, prednisolone or methylprednisolone, AZA azathioprine, RTX rituximab, CYC + AZA CYC with AZA, MMF-AZA MMF followed by AZA. Doses were merged for PRED and CYC, comparing only between treatments. We did not lose any study, but it is a limitation of this analysis.

Sensitivity analyses limited to partial or complete renal remission only
An additional file shows in detail the odds ratios for comparisons of immunosuppressive drugs and corticosteroids in lupus nephritis for partial/complete remission, a sensitivity analysis (renal response excluded from the composite outcome; see Additional file 7). Results were similar to the main analyses, with minor exceptions. Thus, most observations from the main analysis were confirmed in this sensitivity analysis. Figure 3 shows the rankograms for various treatments for the outcomes of interest. Among the top ranked were CSA for renal remission/response, prednisone for renal relapse/flare, CYC for ovarian failure/amenorrhea and CYC for bone marrow toxicity (Fig. 3). An additional file shows in detail comparisons of various treatments to each other for each outcome, another way to visualize the key comparisons between treatments (see Additional file 8).

Discussion
In this systematic review and NMA of outcomes in patients with lupus nephritis, we made several important observations. Results of this study are of great value to both medical and patient communities, given the growing and renewed focus on patient-centered education and outcomes. We compared the benefits and risks of medications that are used to treat SLE nephritis. The information presented here served as the foundation for an easily understood decision-aid tool that is being tested in a randomized trial, currently underway. Given the novel NMA methodology and our ability to perform indirect comparisons using the NMA, several of our findings are new and merit further discussion. We noted differences between immunosuppressive drugs regarding renal remission/response. High-dose CYC was significantly less effective than MMF in leading to renal remission/response, with a relative risk of 0.65 and an odds ratio of 0.40. This finding is consistent with the trial data from Ginzler et al., who found that MMF was superior to IV CYC in inducing complete renal remission [37].
We also found that CYC LD (which includes the EUROlupus regimen) was inferior to MMF, CSA and CYC standard dose for renal remission/response. We realize that not all trials are the same; however, the superiority of these commonly used immunosuppressive drugs to CYC LD is worth some discussion. This efficacy advantage over low-dose CYC must be weighed against the potential harms of using MMF, CSA or CYC standard dose in a patient with lupus nephritis. Several other differences we noted might be of interest to clinicians and patients when making treatment decisions for lupus nephritis. We confirmed that immunosuppressive drugs were two to four times more effective than corticosteroids alone (most often prednisone or prednisolone) for renal remission/response; that is, CYC, MMF, CSA, and tacrolimus led to renal remission significantly more often than corticosteroids alone. The lower risk of renal relapse or flare with MMF and CYC compared to corticosteroids alone supports this finding and is not surprising. Other immunosuppressives (AZA, CSA, CYC-AZA, MMF-AZA) also appeared more effective than corticosteroids alone for preventing renal relapse/flare, but these differences did not reach significance.
Significant differences were found in the risk of amenorrhea/ovarian failure (fertility issues). These are important findings, since a disproportionate number of young women are affected by lupus. An increased risk of amenorrhea/ovarian failure was seen with CYC compared to MMF and PRED, which confirms a long-known clinical observation, but now provides estimates of the comparative risk. Ovarian failure is an important discussion point during lupus treatment decision-making in young women, especially when the use of CYC is considered.
We found that CYC SD, CYC LD, CYC HD, and AZA had a two to five times higher risk of cytopenia than MMF. Our comprehensive NMA provided robust treatment estimates for these comparisons that can be shared with patients in a more understandable format at the time of treatment decision-making for lupus nephritis in regular clinical care.
Our study has several limitations. Meta-analyses are observational studies and therefore can have the limitations of any observational study. Heterogeneity is an issue with all meta-analyses, including NMA. We assessed for clinical heterogeneity of studies prior to conducting the analyses, with the help of two clinicians (a rheumatologist and a lupus expert) who assessed for systematic differences between study populations and disease stage by medication regimen. We noted some clinical heterogeneity between trials, but did not find major systematic differences by the type of medication used. We acknowledge that no two clinical trials are alike. This applies to our NMA as well, and therefore, the results should be interpreted with some caution. Another study limitation may be that we searched two databases.

Footnote to Table 5 (cytopenia): Based on 16 RCTs with 2257 patients: 15 two-arm trials and one three-arm trial. Significant odds ratios are in italics. For absolute rates for events used for calculation of risk difference, please see Appendix 6. OR odds ratio, RR relative risk, RD risk difference, CrI credible interval, HD high dose, LD low dose; when not specified, it indicates standard dose. CYC cyclophosphamide, MMF mycophenolate mofetil, CSA cyclosporine, TAC tacrolimus, LEF leflunomide, PRED prednisone, prednisolone or methylprednisolone, AZA azathioprine, RTX rituximab, MMF-AZA MMF followed by AZA, RTX + MMF RTX combined with MMF. Estimates for LEF HD were obtained from data from only one study and were therefore imprecise. The odds ratios were transformed to relative risk (RR) and risk difference to allow ease of interpretation for clinicians and patients.

The NMA incorporates both direct and indirect comparisons. As can be seen from the network diagrams for these analyses, direct comparisons were fewer for some outcomes assessed in this study, which indicates that most evidence came from indirect comparisons; the addition of data from direct comparator trials in future NMA would increase the confidence in these findings. Because multiple NMA comparisons were performed, some findings may have resulted from chance. However, we think that type II error, i.e., missing important differences due to small sample size, is a bigger concern, since most included trials for this rare condition were small.
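To make the notion of an indirect comparison concrete, the sketch below shows a simple adjusted indirect comparison on the log odds ratio scale through a common comparator. This is a frequentist simplification offered for illustration only, not the Bayesian MTC model used in this study; the function names and all numbers are hypothetical.

```python
import math


def indirect_log_or(log_or_ab: float, se_ab: float,
                    log_or_cb: float, se_cb: float) -> tuple[float, float]:
    """Indirect estimate of A vs. C from A vs. B and C vs. B.

    On the log scale the estimates subtract, and because the two sources
    are independent trials their variances add.
    """
    log_or_ac = log_or_ab - log_or_cb
    se_ac = math.sqrt(se_ab ** 2 + se_cb ** 2)
    return log_or_ac, se_ac


def or_with_interval(log_or: float, se: float, z: float = 1.96) -> tuple[float, float, float]:
    """Back-transform to an odds ratio with an approximate 95 % interval."""
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)


# Hypothetical inputs: A vs. B has OR 2.0 (SE 0.30); C vs. B has OR 4.0 (SE 0.40).
log_or_ac, se_ac = indirect_log_or(math.log(2.0), 0.30, math.log(4.0), 0.40)
estimate, lower, upper = or_with_interval(log_or_ac, se_ac)
print(f"Indirect OR A vs. C = {estimate:.2f} ({lower:.2f} to {upper:.2f})")
```

The standard error of the indirect estimate is always larger than that of either direct estimate feeding into it, which is one reason comparisons informed mainly by indirect evidence tend to have wider intervals.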
For comparisons with large relative effects (odds or risk ratios), one should keep in mind the underlying rates, which varied between outcomes and were very low for some composite outcomes. In these cases, the absolute difference between treatments is still small, even though the relative effect may be five or ten times, or 0.05 times. Comparisons with wider credible intervals must be interpreted with caution, since these indicate small study numbers, or a lower confidence in the certainty of the estimate. It is possible (and even likely sometimes) that the addition of more data from future studies may change these estimates.

Fig. 3 Rankograms for composite study outcomes in lupus nephritis: renal remission or renal response (a), renal relapse or renal flare (b), fertility issues (c), and bone marrow toxicity (d). Legend: This two-dimensional plot shows on the x-axis (horizontal) the possible ranks of the treatment, from best to last, and on the y-axis (vertical) the probability of each of the treatments assuming those possible ranks for each outcome. For example, for renal relapse/flare (an undesired outcome), the highest probability was evident with corticosteroids alone, followed by MMF-AZA, followed by AZA and others