Randomised, double-blind, placebo-controlled trials of non-individualised homeopathic treatment: systematic review and meta-analysis

Background A rigorous systematic review and meta-analysis focused on randomised controlled trials (RCTs) of non-individualised homeopathic treatment has not previously been reported. We tested the null hypothesis that the main outcome of treatment using a non-individualised (standardised) homeopathic medicine is indistinguishable from that of placebo. An additional aim was to quantify any condition-specific effects of non-individualised homeopathic treatment.
Methods Literature search strategy, data extraction and statistical analysis all followed the methods described in a pre-published protocol. A trial comprised ‘reliable evidence’ if its risk of bias was low or it was unclear in one specified domain of assessment. ‘Effect size’ was reported as standardised mean difference (SMD), with arithmetic transformation for dichotomous data carried out as required; a negative SMD indicated an effect favouring homeopathy.
Results Forty-eight different clinical conditions were represented in 75 eligible RCTs. Forty-nine trials were classed as ‘high risk of bias’ and 23 as ‘uncertain risk of bias’; the remaining three, clinically heterogeneous, trials displayed sufficiently low risk of bias to be designated reliable evidence. Fifty-four trials had extractable data: pooled SMD was –0.33 (95% confidence interval (CI) –0.44, –0.21), which was attenuated to –0.16 (95% CI –0.31, –0.02) after adjustment for publication bias. The three trials with reliable evidence yielded a non-significant pooled SMD: –0.18 (95% CI –0.46, 0.09). There was no single clinical condition for which meta-analysis included reliable evidence.
Conclusions The quality of the body of evidence is low. A meta-analysis of all extractable data leads to rejection of our null hypothesis, but analysis of a small sub-group of reliable evidence does not support that rejection. Reliable evidence is lacking in condition-specific meta-analyses, precluding relevant conclusions. Better designed and more rigorous RCTs are needed in order to develop an evidence base that can decisively provide reliable effect estimates of non-individualised homeopathic treatment.
Electronic supplementary material The online version of this article (doi:10.1186/s13643-017-0445-3) contains supplementary material, which is available to authorized users.


Background
Homeopathy is a system of medicine based fundamentally on the 'Principle of Similars': a substance capable of causing symptoms of illness in a healthy subject can be used as a medicine to treat similar patterns of symptoms experienced by an individual who is ill; homeopathic medicines are believed to stimulate a self-regulatory healing response in the patient [1]. There are several distinct forms of homeopathy, the main types being 'individualised homeopathy', 'clinical homeopathy' and 'isopathy'. In individualised homeopathy, typically a single homeopathic medicine is selected on the basis of the 'total symptom picture' of a patient, including his/her mental, general and constitutional type. In clinical homeopathy, one or more homeopathic medicines are administered for standard clinical situations or conventional diagnoses; where more than one medicine is used in a fixed preparation, it is referred to as a 'combination' (devised by researchers) or 'complex' homeopathic medicine (available as an over-the-counter [OTC] proprietary formulation). Isopathy is the use of homeopathic dilutions from the causative agent of the disease itself, or from a product of the disease process, to treat the condition [1]: isopathic medicines include organisms and allergens prescribed on a basis that is different from individualised homeopathic prescribing in the classical sense.
To inform appropriate research development in homeopathy, the nature of its existing research evidence needs to be examined with rigour, objectivity and transparency. In a previous systematic review of randomised controlled trials (RCTs) of individualised treatment, we concluded there was a small, statistically significant, effect of the individually prescribed homeopathic medicines that was robust to sensitivity analysis based on reliable evidence; however, the low or uncertain quality of the evidence prevented a decisive conclusion [2].
In contrast to individualised treatment, placebo-controlled RCTs of non-individualised homeopathic treatment evaluate interventions that have involved the same, standardised, medication allocated to each and every participant randomised to homeopathy in a given trial: single homeopathic medicine, combination or complex homeopathic medicine, or isopathy. In this RCT context, none of these approaches involves matching a patient with the 'total symptom picture' of an individually prescribed homeopathic medicine: a pre-selected medicine is applied to the typical symptoms of a clinical condition. In the analysis reported in the present paper, we therefore regard all trials of non-individualised homeopathic treatment as, in effect, testing the same intervention. A study protocol for this systematic review has been published [3].
Three of five prior comprehensive reviews of homeopathy RCTs, reflecting the broad spectrum of clinical conditions that has been researched, reached the guarded conclusion that the homeopathic intervention probably differs from placebo [4][5][6]. The fourth such review concluded, 'The results of our meta-analysis are not compatible with the hypothesis that the clinical effects of homeopathy are completely due to placebo' [7], though the same authors later published supplementary analysis that weakened this conclusion [8]. The fifth of these global systematic reviews concluded there was "weak evidence for a specific effect of homoeopathic remedies…compatible with the notion that the clinical effects of homoeopathy are placebo effects" [9]. In their approach, however, each of these 'global' reviews has assessed collectively the findings for individualised and non-individualised homeopathy, a method we regard as inappropriate due to the distinction between the two types of intervention in the RCT context. There have been two systematic reviews, with meta-analysis, of individualised homeopathy trials: the first was published in 1998 [10], the most recent in 2014 [2]. A focused meta-analysis of non-individualised homeopathy RCTs has not previously been reported.
In order to synthesise the findings from placebo-controlled RCTs of non-individualised homeopathy we conducted an up-to-date systematic review and meta-analysis, testing the following null hypothesis: across the entire range of clinical conditions that have been researched, the main outcome of treatment using a non-individualised homeopathic medicine cannot be distinguished from that using placebo. An additional aim, further informing future research, was to quantify any effect of non-individualised homeopathic treatment for each clinical condition for which there is more than a single eligible RCT.

Methods
Methods comply fully with the PRISMA 2009 Checklist (Additional file 1) and with our published protocol [3], which does not have a PROSPERO registration number.

Search strategy, data sources and trial eligibility
We conducted a systematic literature search to identify RCTs that compared non-individualised homeopathy with a placebo, for any clinical condition [11]. Each of the following electronic databases was searched from its inception up to the end of 2011, with updated searches of the same databases up to the end of 2014: AMED; CAM-Quest®; CINAHL; Cochrane Central Register of Controlled Trials; Embase; Hom-Inform; LILACS; PubMed; Science Citation Index and Scopus. For the update, CORE-Hom® was also searched, using the term 'randomised' or 'unknown' in the Sequence Generation field.
The full electronic search strategy for PubMed (Cochrane Highly Sensitive Search Strategy) is given in our previous paper [11]: "((homeopath* or homoeopath*) and ((randomized controlled trial [
As stated in our published protocol [3], we then excluded trials: of crossover design; of radionically prepared homeopathic medicines; of homeopathic prophylaxis; of homeopathy combined with other (complementary or conventional) intervention; for other specified reasons. The final explicit exclusion criterion was that there was obviously no blinding of participants and practitioners to the assigned intervention; for example, a trial described by the original authors as 'single [i.e. patient-] blinded' was automatically excluded. All remaining trials were eligible for systematic review.

Outcome definitions
For each trial, and for the purposes of risk-of-bias assessment and meta-analysis, we identified a single 'main outcome measure' using a refinement of the approaches adopted by Linde et al. [7] and by Shang et al. [9]. Each trial's 'main outcome measure' was identified based on the following hierarchical ranking order (consistent with the WHO International Classification of Functioning (ICF) linked to health condition [12]):

Mortality
Morbidity
○ Treatment failure
○ Pathology; symptoms of disease
Health impairment (loss/abnormality of function, incl. presence of pain)
Limitation of activity (disability, incl. days off work/school because of ill health)
Restriction of participation (quality of life)
Surrogate outcome (e.g. blood test data, bone mineral density).
We followed the WHO ICF system regardless of what measure may have been identified by the investigators as their 'primary outcome'. In cases where, in the judgment of the reviewers, there were two or more outcome measures of equal greatest importance within the WHO ICF rank order, the designated 'main outcome measure' was selected randomly from those two or more options using the toss of coins or dice.
Unless otherwise indicated, the single end-point (measured from the start of the intervention) associated with the designated 'main outcome measure' was taken as the last follow-up at which data were reported for that outcome.
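The selection rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the category labels in `ICF_RANK`, the function name `select_main_outcome` and the input format are hypothetical, and a pseudo-random choice stands in for the toss of coins or dice. In the review itself, the judgment that two measures are of 'equal greatest importance' was made by the reviewers, not by a lookup table.

```python
import random

# Rank order taken from the hierarchy in the text (index 0 = highest priority).
# The label strings are illustrative assumptions, not the authors' coding scheme.
ICF_RANK = [
    "mortality",
    "morbidity: treatment failure",
    "morbidity: pathology; symptoms of disease",
    "health impairment",
    "limitation of activity",
    "restriction of participation",
    "surrogate outcome",
]


def select_main_outcome(outcomes, rng=None):
    """Pick one 'main outcome measure' from (name, category) pairs.

    The highest-ranked category wins; ties within that category are broken
    at random (a stand-in for the coin or dice toss described in the text).
    """
    rng = rng or random.Random()
    best = min(ICF_RANK.index(category) for _, category in outcomes)
    tied = [name for name, category in outcomes
            if ICF_RANK.index(category) == best]
    return tied[0] if len(tied) == 1 else rng.choice(tied)
```

For example, a trial reporting a pain score (health impairment), days off work (limitation of activity) and a blood test (surrogate outcome) would have the pain score selected, whatever the investigators had named as their primary outcome.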

Data extraction
Two reviewers (RTM and either JC, JRTD, LL, SM, NR or C-MM) identified the main outcome measure and then independently extracted data for each trial using a standard recording approach [3]. The data extracted per trial included, as appropriate: demographics of participants (gender, age range, medical condition); study setting; potency or potencies of homeopathic medicines; whether a pilot trial; 'main outcome measure' (see above) and measured end-point; funding source/s. The statistical items noted were: whether a statistical power calculation was carried out; whether intention-to-treat (ITT) analysis was used; and the sample size and missing data for each intervention group. Discrepancies in the interpretation of data were discussed and resolved by consensus.

Assessment of risk of bias
We used the domains of assessment as per the Cochrane risk-of-bias appraisal tool [13]. The extracted information enabled appraisal of freedom from risk of bias per domain: 'Yes' (low risk), 'Unclear' risk or 'No' (high risk). We applied this approach to each of the seven domains: sequence generation (domain I); allocation concealment used to implement the random sequence (II); blinding of participants and study personnel (IIIa); blinding of outcome assessors (IIIb); incomplete outcome data (IV); selective outcome reporting (V); other sources of bias (VI). The source of any research sponsorship (i.e. potential for vested interest) was taken into account for sub-group analysis (see below), but not in risk-of-bias assessment per se.
Reflecting appropriately the designated main outcome measure, we rated risk of bias for each trial across all seven domains and using the following classification [3]: Rating A = Low risk of bias in all seven domains. Rating Bx = Uncertain risk of bias in x domains; low risk of bias in all other domains. Rating Cy.x = High risk of bias in y domains; uncertain risk of bias in x domains; low risk of bias in all other domains.
Designating an RCT as 'reliable evidence'
An 'A'-rated trial was designated reliable evidence. We also designated a 'B1'-rated trial reliable evidence if the uncertainty in its risk of bias was for one of domains IV, V or VI only (i.e. it was required to be judged free of bias for each of domains I, II, IIIa and IIIb) [3]; in tabulations and text below, this rating is shown as 'B1* (minimal risk of bias)'.
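The rating scheme and the 'reliable evidence' rule can be expressed compactly in code. The following is a minimal sketch, assuming each trial's seven domain judgments are available as a dictionary of `'low'`, `'unclear'` or `'high'` values; the function name `rate_trial` and this data layout are illustrative and do not come from the protocol.

```python
# The seven domains of assessment named in the text.
DOMAINS = ("I", "II", "IIIa", "IIIb", "IV", "V", "VI")


def rate_trial(judgments):
    """Return (rating, is_reliable) for one trial.

    judgments maps each domain to 'low', 'unclear' or 'high' risk of bias.
    """
    n_high = sum(judgments[d] == "high" for d in DOMAINS)
    unclear = [d for d in DOMAINS if judgments[d] == "unclear"]

    if n_high:
        rating = f"C{n_high}.{len(unclear)}"   # e.g. 'C2.0'
    elif unclear:
        rating = f"B{len(unclear)}"            # e.g. 'B3'
    else:
        rating = "A"

    # 'A' is reliable; 'B1' is reliable only if the single unclear
    # domain is IV, V or VI (shown as 'B1*' in the text).
    if rating == "B1" and unclear[0] in ("IV", "V", "VI"):
        return "B1*", True
    return rating, rating == "A"
```

Under this sketch a trial with one unclear judgment in domain II is rated 'B1' but is not reliable evidence, whereas the same single uncertainty in domain V yields 'B1*'.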

Study selection for meta-analysis
All RCTs that were included in the systematic review were potentially eligible for meta-analysis. If the original RCT paper did not provide adequate information on our selected main outcome measure to enable calculation of the SMD or the OR, we excluded the trial from the meta-analysis, and described the outcome as 'not estimable'; consistent with Cochrane assessment criteria [13], such a trial was thus attributed high risk of bias in domain V.

Statistical analysis
Data preparation
For a continuous main outcome measure, the mean, standard deviation (SD) and number of subjects were extracted for homeopathy and placebo groups and the unbiased standardised mean difference (SMD) calculated, so that a negative SMD reflected a difference in favour of homeopathy. We did not adjust values to compensate for any inter-group differences at baseline. For a dichotomous main outcome measure, the number of subjects with a favourable outcome and the total number of subjects in each group were extracted to enable calculation of the odds ratio (OR), with values greater than 1 reflecting a difference in favour of homeopathy.
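As an illustration of the two per-trial statistics, the sketch below computes a small-sample-corrected SMD and an odds ratio. It assumes that the 'unbiased' SMD is Hedges' g with the usual approximate correction factor, which the text does not state explicitly; the function names are hypothetical. The sign of the SMD here is simply homeopathy minus placebo, so a negative value favours homeopathy only when lower scores are better; outcomes scored in the other direction would need their sign reversed to follow the paper's convention.

```python
import math


def hedges_g(mean_h, sd_h, n_h, mean_p, sd_p, n_p):
    """Bias-corrected standardised mean difference (homeopathy minus placebo)."""
    df = n_h + n_p - 2
    sd_pooled = math.sqrt(((n_h - 1) * sd_h ** 2 + (n_p - 1) * sd_p ** 2) / df)
    d = (mean_h - mean_p) / sd_pooled
    correction = 1 - 3 / (4 * df - 1)   # approximate small-sample correction
    return correction * d


def odds_ratio(fav_h, n_h, fav_p, n_p):
    """Odds of a favourable outcome, homeopathy versus placebo.

    Values above 1 favour homeopathy. Zero cells are not handled here.
    """
    return (fav_h / (n_h - fav_h)) / (fav_p / (n_p - fav_p))
```

For instance, `odds_ratio(20, 40, 10, 40)` compares odds of 20:20 with odds of 10:30 and so returns 3.0.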
For a given trial comprising more than two study groups, only the data concerning comparisons between non-individualised homeopathy and placebo were extracted from the paper. For a trial in which there were two or more homeopathy groups, those groups' data were combined in analysis where relevant and feasible: for a dichotomous measure, combining data merely required summing the events and sample sizes; for a continuous measure, combining data was feasible only where SD was derivable.¹
For the pooled meta-analysis, a single measure of effect size was required to enable pooling of all relevant trials: ORs were transformed to SMD using a recognised approximation method [14]. This is a deviation from the protocol, which stated that SMD would be transformed to OR, as in a previous paper [2]. SMD and OR are equally valid statistics. The reasoning behind using SMD instead of OR is that the latter is intuitively associated with a dichotomous outcome, whereas the former has a direct connection with 'effect size' and indicates that, for the meta-analysis, it has been derived via transformation from other measures (including OR). Whichever of these two metrics is used, their results are interchangeable and their interpretation is identical. 'Effect size' was interpreted as follows: SMD <0.40 = 'small'; SMD 0.40 to 0.70 = 'moderate'; SMD >0.70 = 'large' [14]. Via the SMD-to-OR transformation factor above [14], these values correspond, respectively, to: OR <2.10 = 'small'; OR 2.10 to 3.60 = 'moderate'; OR >3.60 = 'large', which we used for our previous paper [2].
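The text does not name the approximation, so the sketch below assumes the widely used logistic relationship, ln(OR) = SMD × π/√3. That assumption is at least consistent with the paired values reported in this paper: it maps SMD values of 0.18 and 0.33 to ORs of about 1.39 and 1.82, and the 0.40 and 0.70 thresholds to roughly 2.1 and 3.6. The sign is flipped because an OR above 1 and a negative SMD both denote an effect favouring homeopathy; the function names are illustrative.

```python
import math

# Conversion factor between ln(OR) and SMD under the logistic approximation.
LOGIT_TO_SMD = math.sqrt(3) / math.pi


def or_to_smd(odds_ratio):
    """Convert an OR (>1 favours homeopathy) to an SMD (<0 favours homeopathy)."""
    return -math.log(odds_ratio) * LOGIT_TO_SMD


def smd_to_or(smd):
    """Inverse conversion, using the same sign convention."""
    return math.exp(-smd / LOGIT_TO_SMD)


if __name__ == "__main__":
    for smd in (-0.18, -0.33, -0.40, -0.70):
        print(f"SMD {smd:+.2f} -> OR {smd_to_or(smd):.2f}")
```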

Heterogeneity and publication bias
Due to the known clinical heterogeneity between studies, random-effects meta-analysis regression models [15] were used to derive pooled estimators and for sub-group / moderator analyses. Estimates were derived along with their 95% confidence intervals (CI) and p values. The I² statistic was used to assess the variability between studies: it gives the percentage of the total variability in the estimated effect size (which is composed of between-study heterogeneity plus sampling variability) that is attributable to heterogeneity. The I² statistic can take values between 0 and 100%: I² = 0% means that all of the variability is due to sampling error, and I² = 100% means that all variability is due to true heterogeneity between studies.
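To make the pooling step and the I² definition concrete, here is a minimal sketch of a random-effects meta-analysis. It assumes the DerSimonian-Laird estimator of between-study variance, which is a common default but is not confirmed by the text; the analyses in the paper were run in R, and the function name below is hypothetical.

```python
import math


def random_effects_pool(effects, variances):
    """Pool per-trial effects with a DerSimonian-Laird random-effects model.

    Returns (pooled effect, 95% CI, tau squared, I squared in percent).
    """
    weights = [1 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1

    # Between-study variance (truncated at zero).
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Share of total variability attributable to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1 / sum(re_weights))
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), tau2, i2
```

When every trial reports the same effect, Q is zero and I² is 0%; as the spread between trials grows relative to their sampling variances, I² approaches 100%.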
Funnel plots and Egger's test of asymmetry [16,17] and the 'trim-and-fill' method [18,19] were used to assess the impact of publication bias.
All statistical analyses were carried out in R version 3.1.2 and using the meta package [20].
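For readers unfamiliar with Egger's test, the sketch below shows the underlying regression: each trial's standardised effect (effect divided by its standard error) is regressed on its precision (the reciprocal of the standard error), and an intercept far from zero signals funnel-plot asymmetry. This is an illustrative re-implementation in Python rather than the R routine used for the paper; the function name is hypothetical, and the p value (from a t distribution with k − 2 degrees of freedom) is omitted to keep the example dependency-free.

```python
import math


def egger_intercept(effects, std_errors):
    """Return (intercept, t statistic) of Egger's regression.

    Needs at least three trials; t is NaN if the fit is exact.
    """
    x = [1 / se for se in std_errors]                    # precision
    y = [e / se for e, se in zip(effects, std_errors)]   # standardised effect
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Standard error of the intercept from the residual variance.
    resid_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    s2 = resid_ss / (n - 2)
    se_intercept = math.sqrt(s2 * (1 / n + mean_x ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, t_stat
```

With negative SMDs favouring homeopathy, a negative intercept corresponds to smaller, less precise trials reporting larger effects in favour of homeopathy.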

Sensitivity analysis
The sensitivity analysis aimed to ascertain the impact of trials' risk-of-bias rating on the pooled SMD: we examined the effect of cumulatively removing data from the meta-analysis by each trial's rating, beginning with the lowest ranked 'C'-rated trial/s.
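The cumulative-removal procedure can be sketched as a loop that repeatedly drops the worst-rated group of trials and re-pools what remains. The ordering within 'C' ratings (more high-risk domains first, then more unclear domains) is an assumption made for this illustration, as are the function names; `iv_mean` is a simple inverse-variance average used as a placeholder for the random-effects model.

```python
import re


def rating_key(rating):
    """Sort key for risk-of-bias ratings: a larger key means a worse rating."""
    if rating == "A":
        return (0, 0, 0)
    m = re.fullmatch(r"B(\d+)(\*?)", rating)
    if m:  # 'B1*' ranks above plain 'B1'
        return (1, int(m.group(1)), 0 if m.group(2) else 1)
    m = re.fullmatch(r"C(\d+)\.(\d+)", rating)
    if m:
        return (2, int(m.group(1)), int(m.group(2)))
    raise ValueError(f"unrecognised rating: {rating}")


def iv_mean(pairs):
    """Placeholder pooling: inverse-variance weighted mean of (effect, variance)."""
    weights = [1 / v for _, v in pairs]
    return sum(w * e for w, (e, _) in zip(weights, pairs)) / sum(weights)


def cumulative_removal(trials, pool=iv_mean):
    """trials: list of (rating, effect, variance).

    Returns a list of (worst rating still included, number of trials, pooled
    effect), starting with all trials and removing the worst rating each step.
    """
    remaining = sorted(trials, key=lambda t: rating_key(t[0]))
    steps = []
    while remaining:
        worst = remaining[-1][0]
        pooled = pool([(e, v) for _, e, v in remaining])
        steps.append((worst, len(remaining), pooled))
        remaining = [t for t in remaining if t[0] != worst]
    return steps
```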

Sub-group analysis
Included in sub-group analysis was whether a trial: (a) had been included or not in previous meta-analysis [9]; (b) was a 'pilot' study; (c) necessitated our use of imputed data for the meta-analysis; (d) was free of vested interest; (e) investigated either an 'acute' or a 'chronic' clinical condition.
As was implicit in the study protocol [3], and as presented in a previous paper [2], we also included the following in sub-group analysis: (f) whether a trial had sample size that was greater or less than the median for those included in meta-analysis; (g) whether a trial used homeopathic medicine/s with potency ≥12C or <12C (12-times serial dilution of 1:100 starting solution), a concentration sometimes regarded as equivalent to the 'Avogadro limit' for molecular dose [21]; potency was defined as 'mixed' if a combination medicine in a given trial comprised a mixture of ≥12C and <12C potencies.
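The arithmetic behind the 12C threshold, and the three-way potency grouping, can be illustrated as follows. Twelve serial 1:100 dilutions give an overall factor of 100¹² = 10²⁴, which slightly exceeds Avogadro's number (about 6.022 × 10²³), so one mole of starting substance would be expected to leave fewer than one molecule. The function names and group labels are illustrative assumptions.

```python
from fractions import Fraction

AVOGADRO = 6.02214076e23  # molecules per mole


def dilution_factor_c(potency_c):
    """Overall dilution after potency_c serial 1:100 steps (exact fraction)."""
    return Fraction(1, 100 ** potency_c)


def expected_molecules(moles_at_start, potency_c):
    """Expected number of starting molecules remaining at a given C potency."""
    return moles_at_start * AVOGADRO * float(dilution_factor_c(potency_c))


def potency_group(potencies_c):
    """Classify a medicine by the C potencies of its components."""
    at_or_above = [p >= 12 for p in potencies_c]
    if all(at_or_above):
        return ">=12C"
    if not any(at_or_above):
        return "<12C"
    return "mixed"
```

For example, `expected_molecules(1, 12)` is about 0.6, and a combination medicine containing 6C and 30C components falls in the 'mixed' group.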
As recognised by Cochrane, some issues suitable for such analysis are identified during the review process itself [22]. Thus, we additionally carried out sub-group analysis depending on whether (h) a trial had investigated a combination, an OTC complex, an isopathic or a single remedy.

Disease-specific treatment effect of non-individualised homeopathy
Analysis was carried out by clinical condition, in cases where there were ≥2 RCTs with extractable main outcome. Analysis was additionally carried out by category of clinical condition, including each category for which there were data from ≥2 RCTs. RCT nomenclature for clinical conditions and their categories was previously characterised [11].²
All sub-group analyses were conducted before and after removal of 'C'-rated trials [2].

Included studies
The PRISMA flowchart from the original comprehensive literature search (up to and including 2011) was published previously [11]. An updated PRISMA flowchart is given in Fig. 1, identifying a total of 553 records.³ After removal of duplicates, 454 remained. After excluding 95 due to type of record (book chapter, thesis, abstract and other minor article), 359 full-text records were assessed for eligibility. Of these, 287 were excluded for the general reasons summarised in Fig. 1; 38 of these same 287 were excluded from the present systematic review for the additionally specified reasons shown in Additional file 2.⁴ The remaining 72 records (75 RCTs) were thus included in this systematic review; data were not extractable from 21 of those, leaving 51 records (54 RCTs) available for meta-analysis (see Additional file 2 for details of the 21 records excluded from meta-analysis).

Characteristics of included studies
The 75 RCTs represent 48 different clinical conditions across 15 categories (Table 1). Each of 52 RCTs studied a condition that was acute in nature; each of 23 studied a chronic condition. Homeopathic potency was ≥12C in 29 trials, and not exclusively ≥12C for 7 trials (mix of ≥12C and <12C for 6 trials; unstated for 1 trial); potency was <12C in 39 trials. Seventeen trials were free of vested interest; 24 trials were not free of vested interest; 34 trials did not enable certainty in this assessment.

Summary of findings
For each trial, Table 2 includes details of the sample size, the identified main outcome measure (and whether dichotomous or continuous) and the study end-point. Seventeen trials were described in the original paper as a 'pilot' (or 'preliminary' or 'feasibility') study. A power calculation was carried out for 28 of the trials. ITT was the basis for analysis in 21 trials. Mean attrition rate was 14.6%. The main outcome variable was dichotomous in 25 studies and continuous in the other 50. The total sample size for the 54 meta-analysable trials was 5032; the median sample size was 62.5 (inter-quartile range, 36 to 107). Meta-analysable studies included 45 different main outcome measures, with end-points that ranged from 6 h to 6 months. Table 2 also indicates the 25 analysed trials in our study that we have in common with those included in the meta-analysis data reported by Shang et al. [9].

Risk of bias and reliable evidence
Table 3 provides the risk-of-bias details for each of the 75 trials, sub-divided into: (a) the 54 that could be included in meta-analysis; (b) the 21 that could not be included in meta-analysis. Domains IV (completeness of outcome data), V (selective outcome reporting) and VI (other sources of bias) presented the greatest methodological concerns. Sixteen of 30 trials that were high risk of bias for domain V were so because their data were not extractable for meta-analysis (see Study selection for meta-analysis above). Domain II (allocation concealment) presented the most uncertain methodological judgments, with 55 (73%) trials assessed unclear risk of bias and only 14 (19%) low risk of bias. There were three trials with reliable evidence (two 'A'-rated, one 'B1*'-rated), 23 with uncertain risk of bias ('B'-rated), and 49 with high risk of bias ('C'-rated). A summary risk-of-bias bar-graph is shown in Additional file 3.
Table 3a (54 trials included in meta-analysis): Two trials were 'A'-rated (low risk of bias), i.e. they fulfilled the criteria for all seven domains of assessment. Our criteria for reliable evidence were also satisfied for one 'B1*'-rated trial. Table 3a therefore includes three trials that were classed reliable evidence: Plumbum metallicum for lead poisoning (A103: Padilha); the OTC complex Acthéane for menopausal syndrome (A272: Colau); the OTC complex Traumeel S for post-operative pain (A120: Singer). Each of the other 51 trials had uncertain or high risk of bias in important methodological aspects, and may be regarded as non-reliable evidence: 23 trials were classed as uncertain risk of bias; 28 were classed as high risk of bias.
Table 3b (21 trials excluded from meta-analysis): All of these 21 trials are 'C'-rated (high risk of bias). Thirteen of the 21 were seriously flawed in more than one domain of assessment (i.e. rated 'C2.0' or worse).
Seven of the remaining eight trials were 'C'-rated solely because of data extraction issues: only one of those seven (A80: Jacobs) fulfilled 'low risk-of-bias' criteria for all other domains of assessment, and so would otherwise have been designated reliable evidence.
The original data extracted per trial (continuous or dichotomous), together with the correspondingly calculated SMD or OR, are illustrated in Additional files 4a and b.
Table 3 Risk-of-bias assessments for trials: (a) included in meta-analysis; (b) not included in meta-analysis. Table footnotes: except for domain V (data not extractable for meta-analysis), trial is otherwise uncertain risk of bias overall; (f) except for domain V (data not extractable for meta-analysis), trial is otherwise low risk of bias overall.
Of the 31 trials with continuous data, 9 had an effect statistically significantly favouring homeopathy (i.e. SMD < 0, with p ≤ 0.05); no trials had an effect significantly favouring placebo. The pooled effect estimate was SMD = -0.36 (95% CI -0.52, -0.19; p < 0.001). Of the 23 trials with dichotomous data, 6 had an effect statistically significantly favouring homeopathy (i.e. OR > 1, with p ≤ 0.05); no trials had an effect significantly favouring placebo. The pooled effect estimate was OR = 1.67 (95% CI 1.25, 2.23; p < 0.001).

Heterogeneity and publication bias
The statistical heterogeneity among the studies was high (I² = 65%; Fig. 2).
Evidence of publication bias, toward studies favouring homeopathy, was apparent from the funnel plot (Fig. 3a), which suggested a relative absence of studies favouring placebo. Egger's test of asymmetry confirmed significant evidence of asymmetry in the funnel plot, p = 0.002. The estimated number of 'missing' studies was 11 (p for at least one 'missing' study was <0.001; Fig. 3b). The effect estimate was attenuated when using the 'trim-and-fill' method to adjust for publication bias: after adjustment for 'missing' studies, the pooled effect estimate was -0.16 (95% CI -0.31, -0.02; p = 0.023); the statistical heterogeneity among the studies remained high (I² = 79%).
From this risk-of-bias analysis, no significant difference was detected between the three pooled effect estimates (p = 0.417); meta-regression confirmed this finding (p = 0.617). There was thus no statistical evidence that effect estimates significantly differed depending on whether the body of evidence for a meta-analysis consisted of 'low', 'uncertain' or 'high' risk-of-bias studies.
Figure 5 shows the effect of cumulatively removing data by trials' risk-of-bias rating. The pooled SMD showed a statistically significant effect in favour of homeopathy for all trials collectively, through to and including those rated 'B3'; for the highest-rated trials collectively ('B2', 'B1' and 'reliable evidence'), the pooled SMD still favoured homeopathy but was no longer statistically significant.

Sub-group analyses
The pooled SMD favoured homeopathy for all subgroups, though it was statistically non-significant for two of the 18 (data imputed; combination medicine): Fig. 6a. A meta-regression was performed to test specifically for within-group differences for each sub-group. The results showed that there were no significant differences between studies that were and were not: included in previous meta-analyses (p = 0.447); pilot studies (p = 0.316); greater than the median sample (p = 0.298); potency ≥ 12C (p = 0.221); imputed for meta-analysis (p = 0.384); free from vested interest (p = 0.391); acute/chronic (p = 0.796); different types of homeopathy (p = 0.217).
After removal of 'C'-rated trials (Fig. 6b), the pooled SMD still favoured homeopathy for all sub-groups, but was statistically non-significant for 10 of the 18 (included in previous meta-analysis; pilot study; sample size > median; potency ≥12C; data imputed; free of vested interest; not free of vested interest; combination medicine; single medicine; chronic condition). There remained no significant differences between sub-groups, with the exception of the analysis for sample size > median (p = 0.028).

Analysis by clinical condition
Clinical conditions
Meta-analysis was possible for eight clinical conditions, each analysis comprising two to five trials (Fig. 7a). A statistically significant pooled SMD, favouring homeopathy, was observed for influenza (N = 2), irritable bowel syndrome (N = 2), and seasonal allergic rhinitis (N = 5). Each of the other five clinical conditions (allergic asthma, arsenic toxicity, infertility due to amenorrhoea, muscle soreness, post-operative pain) showed non-significant findings. Removal of 'C'-rated trials negated the statistically significant effect for seasonal allergic rhinitis and left the non-significant effect for post-operative pain unchanged (Fig. 7b); no higher-rated trials were available for additional analysis of arsenic toxicity, infertility due to amenorrhoea or irritable bowel syndrome. There were no 'C'-rated trials to remove for allergic asthma, influenza, or muscle soreness. Thus, influenza was the only clinical condition for which higher-rated trials indicated a statistically significant effect; neither of its contributing trials, however, comprised reliable evidence.

Categories of clinical condition
Meta-analysis was possible for 11 categories of clinical condition, each analysis comprising two to ten trials (Fig. 8a). A statistically significant pooled SMD,  favouring homeopathy, was observed for five categories: allergy and asthma (N = 10); cardiovascular (N = 2); dermatology (N = 2); ear nose and throat (N = 3); gastroenterology (N = 2). None of the trials designated reliable evidence featured in any of these five categories. Each of the other six categories showed non-significant findings.
Removal of 'C'-rated trials limited each analysis to two to five trials (Fig. 8b): statistically significant effects were marginally retained for allergy and asthma (N = 5) and dermatology (N = 2), and more clearly retained for ear nose and throat (N = 2). No higher-rated trials were available for additional analysis in the cardiovascular and gastroenterology categories. After removal of 'C'-rated trials, there was no change in the non-significance of the statistical findings for each of the other six categories.

Discussion
Seventy-two of the 75 eligible trials had uncertain or high risk of bias. Due to poor reporting or other deficiencies in 21 of the original papers, data extraction for our meta-analysis was possible from only 54 of the 75 trials. Trials with high and with uncertain risk of bias each featured similarly in our 54-trial analysis; the quality of the body of analysed evidence is therefore low. As previously recognised [2,7,9], the pooling of data from diverse clinical conditions, outcome measures and end-points has obvious limitations: thus, a given pooled effect estimate here does not have a clear numerical meaning or relative clinical value, but provides a reasonable summary measure in evaluating the average effect of a medical intervention. Our null hypothesis that regards each trial of non-individualised homeopathy as testing the same intervention also has its limitations, for it makes the debatable assumption that each homeopathic medicine has similar lack of efficacy for the relevant symptoms of every clinical condition. Nevertheless, our separate focus on individualised [2] and non-individualised homeopathy marks a clear and appropriate step forward.
For our previous meta-analysis of RCTs (on individualised homeopathy [2]), the three most highly ranked trials had minimal risk of bias and were designated reliable evidence. In the current study, we have identified two trials with the highest-quality ranking ('A' = low risk of bias), plus one with minimal risk of bias ('B1*'), which we have examined collectively as the reliable evidence of RCTs of non-individualised homeopathic treatment. Analysis of these three highest-quality trials showed a statistically non-significant pooled SMD of -0.18 (95% CI -0.46, 0.09) (equivalent to pooled OR = 1.39, using the standard conversion [14]). This effect estimate of -0.18 contrasts with that for all 54 analysable trials of -0.33 (equivalent to OR = 1.82): the latter represents a small and statistically significant treatment effect favouring homeopathy, akin to our pooled findings for the individualised trials [2]. We therefore reject the null hypothesis (non-individualised homeopathy is indistinguishable from placebo) on the basis of pooling all studies, but fail to reject the null hypothesis on the basis of the reliable evidence only. Our risk-of-bias analysis and the meta-regression, however, indicate that effect estimates do not significantly differ depending on whether the meta-analysis consists of 'low', 'uncertain' or 'high' risk-of-bias studies.
Lack of clear conclusion above might simply be due to there being too few high-quality trials. With only three Since the completion of our defined literature search, we are aware of recently published and potentially eligible RCT papers, whose findings we have yet to explore [26][27][28][29]. The limit of detecting an effect of non-individualised homeopathy across all trials may be related to a medicine's degree of dilution, since trials using potency ≥12C failed to show a statistically significant pooled effect that favoured homeopathy (see Fig. 6b).
In attempting to formulate a reasonable overarching conclusion, it is important also to highlight other findings from our quality-based analyses. For example, the sensitivity analysis that consecutively excluded the lowest-quality trials showed that studies with lower quality tended to report greater benefits of non-individualised homeopathic intervention than studies with higher quality. That RCTs with a higher risk of bias showed a greater benefit for the homeopathy group supports some previous (though not our own [2]) meta-analysis findings [4,7,10]. Our funnel plot finding of larger effect estimates (in favour of homeopathy) in trials with lower sample size is consistent with observations from RCTs in medicine more widely [30]. A further perspective, based on our trim-and-fill analysis, is that the true pooled effect estimate is likely to be smaller than initially appreciated: we found evidence of publication bias, with an estimated 11 'missing' studies whose results would favour placebo, adjustment for which yielded an attenuated but still-significant pooled effect estimate of -0.16 for the 54 analysable trials.
We are also aware that our analysis reflects per-protocol outcome data rather than the potentially more robust (but less available) intention-to-treat (ITT) outcome data, which might have slightly magnified our pooled effect estimate; however, we have addressed the possible impact of incomplete data in rigorous risk-of-bias assessments, as recommended by Cochrane [31]. The sum of these comments supports a generalised conclusion that a non-individualised homeopathic medicine is indistinguishable from a placebo, but the quality of the evidence is low.
A small and erratic treatment effect in this context may be consistent with the notion that a pre-selected homeopathic medicine, aiming to treat the typical symptoms of a clinical condition, and given to all of the relevant trial participants, may match sub-optimally the 'total symptom picture' for an important number of them, leading potentially to diminished efficacy. The quality of the clinical intervention and the suitability of the main outcome measure are the key facets of a trial's model validity, i.e. the extent to which a study reflects best clinical practice in that intervention [32]. Thus, to complete the quality evaluation of homeopathy trials, it is important to accommodate also the assessment of their model validity, emphasising in this case the three trials comprising reliable evidence in non-individualised homeopathic treatment.
We report separately our model validity assessments of these trials (footnote 5), evaluating consequently their overall quality based on a GRADE-like principle of 'downgrading' [14]: two trials [23,25] rated here as reliable evidence were downgraded to 'low quality' overall due to the inadequacy of their model validity; the remaining trial with reliable evidence [24] was judged to have adequate model validity. The latter study [24] thus comprises the sole RCT that can be designated 'high quality' overall by our approach (footnote 5), a stark finding that reveals further important aspects of the preponderantly low quality of the current body of evidence in non-individualised homeopathy.
Analysis by clinical condition, following removal of 'C'-rated studies, showed a statistically significant treatment effect in RCTs of non-individualised homeopathy for influenza, and in the categories allergy and asthma, dermatology, and ear, nose and throat. None of these analyses included any reliable evidence, however. While these clinical categories do not provide compelling evidence for non-individualised homeopathic treatment, they may contain the most promising targets for future research.

Conclusions
There was a small, statistically significant effect of non-individualised homeopathic treatment. However, the finding was not robust to sensitivity analysis based solely on the three trials that comprised reliable evidence: the effect size estimate collectively for those three trials was not statistically significant. There was significant evidence of publication bias in favour of homeopathy. Our meta-analysis of the current reliable evidence base therefore fails to reject the null hypothesis that the outcome of treatment using a non-individualised homeopathic medicine is not distinguishable from that using placebo. Nevertheless, the risk-of-bias analysis and the meta-regression, together with the large preponderance of low-quality evidence, challenge the inference that effect size estimates differ significantly depending on risk-of-bias rating. The assessment of a trial's model validity should also be taken into account in an evaluation of overall study quality in homeopathy. Reliable evidence is lacking for all clinical conditions whose data have enabled separate meta-analysis. Higher-quality RCT research on specified homeopathic medicines is required to enable more decisive interpretation regarding efficacy for given clinical symptoms or conditions. Future trialists need to minimise their studies' risk of bias in all domains, and to improve the clarity of their reporting. Such research might wisely focus on trial designs in which only patients who match the relevant 'symptom picture', or the indications of the selected homeopathic product, are eligible to participate: large trials are therefore indicated.