Methods comply fully with the PRISMA 2009 Checklist (Additional file 1) and with our published protocol [3], which does not have a PROSPERO registration number.
Search strategy, data sources and trial eligibility
We conducted a systematic literature search to identify RCTs that compared non-individualised homeopathy with a placebo, for any clinical condition [11]. Each of the following electronic databases was searched from its inception up to the end of 2011, with updated searches of the same databases up to the end of 2014: AMED; CAM-Quest®; CINAHL; Cochrane Central Register of Controlled Trials; Embase; Hom-Inform; LILACS; PubMed; Science Citation Index and Scopus. For the update, CORE-Hom® was also searched, using the term ‘randomised’ or ‘unknown’ in the Sequence Generation field.
The full electronic search strategy for PubMed (Cochrane Highly Sensitive Search Strategy) is given in our previous paper [11]: “((homeopath* or homoeopath*) and ((randomized controlled trial [pt]) or (controlled clinical trial [pt]) or (randomized [tiab]) or (placebo [tiab]) or (clinical trials as topic [mesh:noexp]) or (randomly [tiab]) or (trial [ti]))) not (animals [mh] not humans [mh])”.
As stated in our published protocol [3], we then excluded trials: of crossover design; of radionically prepared homeopathic medicines; of homeopathic prophylaxis; of homeopathy combined with other (complementary or conventional) intervention; for other specified reasons. The final explicit exclusion criterion was that there was obviously no blinding of participants and practitioners to the assigned intervention; for example, a trial described by the original authors as ‘single [i.e. patient-] blinded’ was automatically excluded. All remaining trials were eligible for systematic review.
Outcome definitions
For each trial, and for the purposes of risk-of-bias assessment and meta-analysis, we identified a single ‘main outcome measure’ using a refinement of the approaches adopted by Linde et al. [7] and by Shang et al. [9]. Each trial’s ‘main outcome measure’ was identified based on the following hierarchical ranking order (consistent with the WHO International Classification of Functioning (ICF) linked to health condition [12]):
- Mortality
- Morbidity
- Health impairment (loss/abnormality of function, incl. presence of pain)
- Limitation of activity (disability, incl. days off work/school because of ill health)
- Restriction of participation (quality of life)
- Surrogate outcome (e.g. blood test data, bone mineral density).
We followed the WHO ICF system regardless of what measure may have been identified by the investigators as their ‘primary outcome’. In cases where, in the judgment of the reviewers, there were two or more outcome measures of equal greatest importance within the WHO ICF rank order, the designated ‘main outcome measure’ was selected randomly from those two or more options using the toss of coins or dice.
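The selection rule can be sketched in a few lines of Python. This is an illustration only, not the reviewers' actual procedure: the category labels, the `(name, category)` encoding and the function name are hypothetical, and outcomes in the same category are treated as being 'of equal greatest importance', whereas in the review that was a matter of reviewer judgment. A pseudo-random choice stands in for the toss of coins or dice.

```python
import random

# Hierarchical ranking from the text (1 = most important).
ICF_RANK = {
    "mortality": 1,
    "morbidity": 2,
    "health impairment": 3,
    "limitation of activity": 4,
    "restriction of participation": 5,
    "surrogate outcome": 6,
}


def select_main_outcome(outcomes, rng=None):
    """Pick one outcome from a list of (name, category) pairs.

    The highest-ranked category wins, regardless of what the trial
    investigators called their primary outcome. Ties are broken at
    random (a stand-in for the coin toss or dice roll).
    """
    rng = rng or random.Random()
    best_rank = min(ICF_RANK[category] for _, category in outcomes)
    candidates = [o for o in outcomes if ICF_RANK[o[1]] == best_rank]
    return rng.choice(candidates)
```

For example, given a pain score (health impairment), days off work (limitation of activity) and a blood marker (surrogate outcome), the function returns the pain score.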
Unless otherwise indicated, the single end-point (measured from the start of the intervention) associated with the designated ‘main outcome measure’ was taken as the last follow-up at which data were reported for that outcome.
Data extraction
Two reviewers (RTM and either JC, JRTD, LL, SM, NR or C-MM) identified the main outcome measure and then independently extracted data for each trial using a standard recording approach [3]. The data extracted per trial included, as appropriate: demographics of participants (gender, age range, medical condition); study setting; potency or potencies of homeopathic medicines; whether it was a pilot trial; ‘main outcome measure’ (see above) and measured end-point; funding source/s. The statistical items noted were: whether a statistical power calculation was carried out; whether an intention-to-treat (ITT) analysis was performed; and the sample size and missing data for each intervention group. Discrepancies in the interpretation of data were discussed and resolved by consensus.
Assessment of risk of bias
We used the domains of assessment as per the Cochrane risk-of-bias appraisal tool [13]. The extracted information enabled appraisal of freedom from risk of bias per domain: ‘Yes’ (low risk), ‘Unclear’ risk or ‘No’ (high risk). We applied this approach to each of the seven domains: sequence generation (domain I); allocation concealment used to implement the random sequence (II); blinding of participants and study personnel (IIIa); blinding of outcome assessors (IIIb); incomplete outcome data (IV); selective outcome reporting (V); other sources of bias (VI). The source of any research sponsorship (i.e. potential for vested interest) was taken into account for sub-group analysis (see below), but not in risk-of-bias assessment per se.
Reflecting appropriately the designated main outcome measure, we rated risk of bias for each trial across all seven domains and using the following classification [3]:
- Rating A = Low risk of bias in all seven domains.
- Rating Bx = Uncertain risk of bias in x domains; low risk of bias in all other domains.
- Rating Cy.x = High risk of bias in y domains; uncertain risk of bias in x domains; low risk of bias in all other domains.
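The classification above maps directly onto a small function. The sketch below assumes a hypothetical encoding in which each of the seven domains is judged 'low', 'unclear' or 'high'; the function and variable names are illustrative and are not taken from the protocol.

```python
DOMAINS = ("I", "II", "IIIa", "IIIb", "IV", "V", "VI")
ALLOWED = {"low", "unclear", "high"}


def rate_trial(judgements):
    """Return the A / Bx / Cy.x label for one trial.

    `judgements` maps each of the seven domain labels to
    'low', 'unclear' or 'high' (hypothetical encoding).
    """
    if set(judgements) != set(DOMAINS):
        raise ValueError("expected exactly the seven domains")
    if not set(judgements.values()) <= ALLOWED:
        raise ValueError("judgements must be 'low', 'unclear' or 'high'")
    high = sum(1 for v in judgements.values() if v == "high")
    unclear = sum(1 for v in judgements.values() if v == "unclear")
    if high:
        return f"C{high}.{unclear}"
    if unclear:
        return f"B{unclear}"
    return "A"
```

A trial with high risk in two domains and unclear risk in three would thus be labelled `C2.3`.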
Designating an RCT as ‘reliable evidence’
An ‘A’-rated trial was designated reliable evidence. We also designated a ‘B1’-rated trial reliable evidence if the uncertainty in its risk of bias was for one of domains IV, V or VI only (i.e. it was required to be judged free of bias for each of domains I, II, IIIa and IIIb) [3]; in tabulations and text below, this rating is shown as ‘B1* (minimal risk of bias)’.
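As a minimal sketch of this rule, assuming the same hypothetical 'low' / 'unclear' / 'high' encoding per domain (the function name is illustrative):

```python
def is_reliable_evidence(judgements):
    """True for an 'A' trial, or a 'B1' trial whose single unclear
    domain is IV, V or VI (shown as 'B1*' in the text).

    `judgements` maps domain labels (I, II, IIIa, IIIb, IV, V, VI)
    to 'low', 'unclear' or 'high' (hypothetical encoding).
    """
    high = [d for d, v in judgements.items() if v == "high"]
    unclear = [d for d, v in judgements.items() if v == "unclear"]
    if high:
        return False  # any 'C' rating is excluded
    if not unclear:
        return True  # rating 'A'
    # Rating 'B1*': exactly one unclear domain, outside I, II, IIIa, IIIb.
    return len(unclear) == 1 and unclear[0] in {"IV", "V", "VI"}
```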
Study selection for meta-analysis
All RCTs that were included in the systematic review were potentially eligible for meta-analysis. If the original RCT paper did not provide adequate information on our selected main outcome measure to enable calculation of the SMD or the OR, we excluded the trial from the meta-analysis, and described the outcome as ‘not estimable’; consistent with Cochrane assessment criteria [13], such a trial was thus attributed high risk of bias in domain V.
Statistical analysis
Data preparation
For a continuous main outcome measure, the mean, standard deviation (SD) and number of subjects were extracted for homeopathy and placebo groups and the unbiased standardised mean difference (SMD) calculated, so that a negative SMD reflected a difference in favour of homeopathy. We did not adjust values to compensate for any inter-group differences at baseline. For a dichotomous main outcome measure, the number of subjects with a favourable outcome and the total number of subjects in each group were extracted to enable calculation of the odds ratio (OR), with values greater than 1 reflecting a difference in favour of homeopathy.
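The two effect measures can be illustrated as follows. This is a sketch under stated assumptions rather than the authors' R code: the small-sample correction shown is the usual Hedges approximation, the `higher_is_better` flag is a hypothetical device for keeping a negative SMD in favour of homeopathy, and trials with zero cells are not handled.

```python
import math


def hedges_g(mean_h, sd_h, n_h, mean_p, sd_p, n_p, higher_is_better=False):
    """Bias-corrected SMD (homeopathy minus placebo).

    With the default orientation (lower score = better), a negative
    value favours homeopathy. Set higher_is_better=True to flip the
    sign for scales where a higher score is the favourable direction.
    """
    df = n_h + n_p - 2
    pooled_sd = math.sqrt(((n_h - 1) * sd_h**2 + (n_p - 1) * sd_p**2) / df)
    d = (mean_h - mean_p) / pooled_sd
    correction = 1.0 - 3.0 / (4.0 * df - 1.0)  # small-sample correction
    g = correction * d
    return -g if higher_is_better else g


def odds_ratio(fav_h, n_h, fav_p, n_p):
    """OR of a favourable outcome; values above 1 favour homeopathy.

    Assumes no zero cells (no continuity correction is applied).
    """
    odds_h = fav_h / (n_h - fav_h)
    odds_p = fav_p / (n_p - fav_p)
    return odds_h / odds_p
```

For instance, 30 of 50 favourable outcomes on homeopathy against 20 of 50 on placebo gives an OR of 2.25.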
For a given trial comprising more than two study groups, only the data concerning comparisons between non-individualised homeopathy and placebo were extracted from the paper. For a trial in which there were two or more homeopathy groups, those groups’ data were combined in analysis where relevant and feasible: for a dichotomous measure, combining data merely required summing the events and sample sizes; for a continuous measure, combining data was feasible only where SD was derivable (Footnote 1).
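A sketch of how two homeopathy groups might be merged is given below. The continuous-data formula is the standard one for pooling two groups into a single group; the text does not state which formula the authors applied, so its use here is an assumption, and the function names are illustrative.

```python
import math


def combine_dichotomous(events_a, n_a, events_b, n_b):
    """Sum events and sample sizes across two groups."""
    return events_a + events_b, n_a + n_b


def combine_continuous(n1, mean1, sd1, n2, mean2, sd2):
    """Merge two groups into one (n, mean, SD).

    The combined SD includes both the within-group variances and
    the spread between the two group means.
    """
    n = n1 + n2
    mean = (n1 * mean1 + n2 * mean2) / n
    variance = (
        (n1 - 1) * sd1**2
        + (n2 - 1) * sd2**2
        + (n1 * n2 / n) * (mean1 - mean2) ** 2
    ) / (n - 1)
    return n, mean, math.sqrt(variance)
```

As a check, merging the values 1, 2, 3 (mean 2, SD 1) with 4, 5, 6, 7 reproduces the mean and SD of the seven values taken together.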
For the pooled meta-analysis, a single measure of effect size was required to enable pooling of all relevant trials: ORs were transformed to SMD using a recognised approximation method [14]. This is a deviation from the protocol, which stated that SMD would be transformed to OR, as in a previous paper [2]. SMD and OR are equally valid statistics. The reasoning behind using SMD instead of OR is that the latter is intuitively associated with a dichotomous outcome, whereas the former has a direct connection with ‘effect size’ and indicates that, for the meta-analysis, it has been derived via transformation from other measures (including OR). Whichever of these two metrics is used, their results are interchangeable and their interpretation is identical. ‘Effect size’ was interpreted as follows: SMD <0.40 = ‘small’; SMD 0.40 to 0.70 = ‘moderate’; SMD >0.70 = ‘large’ [14]. Via the SMD-to-OR transformation factor above [14], these values correspond, respectively, to: OR <2.10 = ‘small’; OR 2.10 to 3.60 = ‘moderate’; OR >3.60 = ‘large’, which we used for our previous paper [2].
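The transformation and the effect-size bands can be sketched as follows. The conversion shown is the commonly used logistic approximation, SMD = ln(OR) × √3/π; that this is the method of reference [14] is an assumption, though it reproduces the stated thresholds to within rounding. The sign flip (so that OR > 1 maps to a negative SMD) and the use of the absolute value when labelling are also assumptions made to match the direction conventions described in this section.

```python
import math

LOGIT_TO_SMD = math.sqrt(3) / math.pi  # about 0.5513


def or_to_smd(odds_ratio):
    """Convert an OR (>1 favours homeopathy) to an SMD
    (negative favours homeopathy)."""
    return -math.log(odds_ratio) * LOGIT_TO_SMD


def effect_size_label(smd):
    """Label the magnitude of an SMD using the bands in the text."""
    size = abs(smd)
    if size < 0.40:
        return "small"
    if size <= 0.70:
        return "moderate"
    return "large"
```

Under this approximation, OR = 2.10 corresponds to an SMD magnitude of about 0.41 and OR = 3.60 to about 0.71.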
Heterogeneity and publication bias
Due to the known clinical heterogeneity between studies, random-effects meta-analysis regression models [15] were used to derive pooled estimators and for sub-group / moderator analyses. Estimates were derived along with their 95% confidence intervals (CI) and p values. The I² statistic was used to assess the variability between studies: it gives the percentage of the total variability in the estimated effect size (which is composed of between-study heterogeneity plus sampling variability) that is attributable to heterogeneity. The I² statistic can take values between 0 and 100%: I² = 0% means that all of the variability is due to sampling error, and I² = 100% means that all variability is due to true heterogeneity between studies.
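To make the pooling and the I² statistic concrete, the following sketch uses the DerSimonian-Laird estimator of between-study variance. The text does not say which estimator was used, so this choice is an assumption, and the code is an illustration rather than a substitute for the R meta package.

```python
import math


def random_effects_pool(effects, variances):
    """DerSimonian-Laird random-effects pooling with Q-based I^2."""
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = sum_w - sum(w**2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # I^2: share of total variability attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1.0 / sum(re_weights))
    return {
        "pooled": pooled,
        "ci95": (pooled - 1.96 * se, pooled + 1.96 * se),
        "tau2": tau2,
        "i2": i2,
    }
```

With two equally precise trials reporting effects of 0.0 and 1.0 (variance 0.01 each), Q is 50 on one degree of freedom, so I² is 98%.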
Funnel plots and Egger’s test of asymmetry [16, 17] and the ‘trim-and-fill’ method [18, 19] were used to assess the impact of publication bias.
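Egger's test can be sketched as an ordinary least-squares regression of the standardised effect (effect divided by its standard error) on precision (the reciprocal of the standard error); an intercept far from zero suggests funnel-plot asymmetry. The version below is a plain-Python illustration with hypothetical names. It returns the intercept and its t statistic only, leaving the p value (from a t distribution with n − 2 degrees of freedom) to a statistics library.

```python
import math


def egger_regression(effects, std_errors):
    """Return (intercept, t_statistic) for Egger's regression.

    Requires at least three trials and a non-perfect fit
    (otherwise the intercept's standard error is zero).
    """
    n = len(effects)
    if n < 3:
        raise ValueError("need at least three trials")
    x = [1.0 / se for se in std_errors]                 # precision
    z = [y / se for y, se in zip(effects, std_errors)]  # standardised effect
    mean_x = sum(x) / n
    mean_z = sum(z) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxz = sum((xi - mean_x) * (zi - mean_z) for xi, zi in zip(x, z))
    slope = sxz / sxx
    intercept = mean_z - slope * mean_x
    ssr = sum((zi - intercept - slope * xi) ** 2 for xi, zi in zip(x, z))
    resid_var = ssr / (n - 2)
    se_intercept = math.sqrt(resid_var * (1.0 / n + mean_x**2 / sxx))
    return intercept, intercept / se_intercept
```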
All statistical analyses were carried out in R version 3.1.2 and using the meta package [20].
Sensitivity analysis
The sensitivity analysis aimed to ascertain the impact of trials’ risk-of-bias rating on the pooled SMD: we examined the effect of cumulatively removing data from the meta-analysis by each trial’s rating, beginning with the lowest ranked ‘C’-rated trial/s.
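The cumulative-removal procedure might be organised as in the sketch below. The ordering of ratings is an assumption (more high-risk domains is treated as worse, then more unclear domains), since the text specifies only that removal began with the lowest ranked 'C'-rated trials; the dictionary layout and function names are hypothetical. The pooled SMD would be recomputed on each returned subset.

```python
import re


def rating_key(rating):
    """Sort key for a rating; a larger tuple means a worse rating.

    A -> (0, 0); Bx -> (0, x); Cy.x -> (y, x). A trailing '*'
    (as in 'B1*') is ignored.
    """
    rating = rating.rstrip("*")
    if rating == "A":
        return (0, 0)
    match = re.fullmatch(r"B(\d+)", rating)
    if match:
        return (0, int(match.group(1)))
    match = re.fullmatch(r"C(\d+)\.(\d+)", rating)
    if match:
        return (int(match.group(1)), int(match.group(2)))
    raise ValueError(f"unrecognised rating: {rating}")


def cumulative_removal(trials):
    """Yield (removed_rating, remaining_trials), worst rating first.

    The best-rated group is never removed, so the last subset
    yielded is non-empty.
    """
    remaining = list(trials)
    levels = sorted({t["rating"] for t in trials}, key=rating_key, reverse=True)
    for level in levels[:-1]:
        remaining = [t for t in remaining if t["rating"] != level]
        yield level, remaining
```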
Sub-group analysis
Included in sub-group analysis was whether a trial: (a) had been included or not in previous meta-analysis [9]; (b) was a ‘pilot’ study; (c) necessitated our use of imputed data for the meta-analysis; (d) was free of vested interest; (e) investigated either an ‘acute’ or a ‘chronic’ clinical condition.
As was implicit in the study protocol [3], and as presented in a previous paper [2], we also included the following in sub-group analysis: (f) whether a trial had sample size that was greater or less than the median for those included in meta-analysis; (g) whether a trial used homeopathic medicine/s with potency ≥12C or <12C (12C denotes twelve serial 1:100 dilutions of the starting solution), a concentration sometimes regarded as equivalent to the ‘Avogadro limit’ for molecular dose [21]; potency was defined as ‘mixed’ if a combination medicine in a given trial comprised a mixture of ≥12C and <12C potencies.
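The arithmetic behind the 12C threshold, and the three-way potency classification, can be illustrated as follows. The calculation idealises the starting solution as containing one mole of substance, which is an assumption made only for illustration; the function names and the ASCII labels are hypothetical, and only centesimal (C) potencies are handled.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def dilution_factor(c_potency):
    """Fraction of the starting concentration left after
    `c_potency` serial 1:100 dilutions."""
    return 100.0 ** (-c_potency)


def classify_potency(c_potencies):
    """Classify a medicine by the C potencies of its components."""
    at_or_above = [p >= 12 for p in c_potencies]
    if all(at_or_above):
        return ">=12C"
    if not any(at_or_above):
        return "<12C"
    return "mixed"
```

Starting from one mole, the expected number of molecules remaining is about 60 at 11C and about 0.6 at 12C, which is why 12C is sometimes taken as the 'Avogadro limit'.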
As recognised by Cochrane, some issues suitable for such analysis are identified during the review process itself [22]. Thus, we additionally carried out sub-group analysis depending on whether (h) a trial had investigated a combination, an OTC complex, an isopathic or a single remedy.
Disease-specific treatment effect of non-individualised homeopathy
Analysis was carried out by clinical condition, in cases where there were ≥2 RCTs with extractable main outcome. Analysis was additionally carried out by category of clinical condition, including each category for which there were data from ≥2 RCTs. RCT nomenclature for clinical conditions and their categories was previously characterised [11] (Footnote 2).
All sub-group analyses were conducted before and after removal of ‘C’-rated trials [2].