A survey of methodologies on causal inference methods in meta-analyses of randomized controlled trials
Systematic Reviews volume 10, Article number: 170 (2021)
Abstract
Background
Meta-analyses of randomized controlled trials (RCTs) are considered the highest level of evidence in the pyramid of evidence-based medicine. However, the causal interpretation of their results is seldom studied.
Methods
We systematically searched for methodologies pertaining to the implementation of a causally explicit framework for meta-analysis of randomized controlled trials and discussed the interpretation and scientific relevance of the resulting causal estimands. We performed a systematic search in four databases to identify relevant methodologies, supplemented with hand-searching. We included methodologies that described causality under the counterfactual and potential outcomes frameworks.
Results
We identified only three efforts explicitly describing a causal framework for meta-analysis of RCTs. Two approaches required individual participant data, while the third required only summary data. All three approaches presented a sufficient framework under which a meta-analytical estimate is identifiable and estimable. However, several conceptual limitations remain, mainly with regard to the data generation process under which the selected RCTs arise.
Conclusions
We undertook a review of methodologies on causal inference methods in meta-analyses. Although all identified methodologies provide valid causal estimates, there are limitations in the assumptions regarding the data generation process and sampling of the potential RCTs to be included in the meta-analysis, which pose challenges to the interpretation and scientific relevance of the identified causal effects. Although both causal inference and meta-analysis have been extensively studied in the literature, there have been few efforts to combine the two frameworks.
Background
Evidence-based medicine is an approach to medical practice defined as the conscientious, explicit, and judicious use of current best evidence in making decisions about the care of individual patients in the light of their personal values and beliefs [1]. At the clinical research level and for pertinent research questions, randomized controlled trials (RCTs) undoubtedly offer the highest level of evidence compared to other study designs. An RCT’s primary objective is the minimization of biases, such as selection or allocation biases, by randomizing participants into the study groups in an unbiased fashion. If randomization is successful, participant characteristics are expected to be balanced across the groups, making the groups exchangeable. Under complete protocol adherence and no loss to follow-up, this property of RCTs essentially justifies the interpretation of the studied associations as the best available proxy of causal relationships [2]. When a randomized design is not feasible, data from observational study designs can be used to emulate a randomized experiment, based on a causal inference approach, to obtain a valid causal estimate [3, 4]. Under a causal inference framework, the goal is to identify and compute the effect estimate that has a causally relevant interpretation for the population the trial samples from.
Meta-analysis is a quantitative procedure for assessing and combining data from multiple studies. By combining evidence from RCTs using meta-analytical approaches, one can potentially achieve higher levels of evidence. One caveat of this approach is that while each study’s estimate can potentially have a causal interpretation, their aggregation may lose this capacity, mainly due to differences in inherent study characteristics including (but not restricted to) differences in populations, in treatments, and/or in the definition of the outcome across studies.
In the present effort, we aim to identify and review methodologies relevant to the implementation of a causally explicit framework for meta-analysis and to discuss the interpretation and scientific relevance of the resulting causal estimand. Therefore, in the first part, we focus on the published methodologies that address the identification and estimation of causal effects derived from meta-analyses of RCTs, along with the underlying assumptions. In the second part, we take a step back to discuss the plausibility of the assumptions and issues concerning the generalizability and scientific relevance of the derived estimands.
Methods
Definitions
We briefly present the causal inference framework known as the Rubin causal model [3, 5]. Let Y denote the outcome of interest and let T denote the treatment in a randomized trial. For simplicity, let the treatment take the values 1 for treated and 0 for untreated. Then, Y_{i}^{1} denotes the potential outcome (counterfactual) of unit i under treatment, while Y_{i}^{0} denotes the potential outcome of unit i under no treatment. The quantity Y_{i}^{1} – Y_{i}^{0} denotes the difference in potential outcomes for unit i, the individual treatment effect (ITE). The quantities Y_{i}^{1} and Y_{i}^{0} can never be simultaneously observed for the same individual. The fact that, for each i, one of Y_{i}^{1} and Y_{i}^{0} is always missing prohibits us from estimating the ITE. This problem is known as the “fundamental problem of causal inference” [5]. While ITEs are never observable, one can estimate the average treatment effect (ATE). Under the assumptions presented in Table 1, one can use the observed quantity E[Y_{i}^{1} ∣ T = 1] – E[Y_{i}^{0} ∣ T = 0] as an estimator of the ATE = E[Y_{i}^{1}] – E[Y_{i}^{0}].
If these assumptions hold, one can make valid causal inferences. The positivity and ignorability assumptions are often considered together and are referenced as the strong ignorability assumption.
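The role of randomization in the definitions above can be illustrated with a small simulation. The following is a minimal sketch, not part of any reviewed methodology: the data-generating model, the sample size, and the function names `simulate_trial` and `difference_in_means` are hypothetical choices made for illustration. Because the simulation generates both potential outcomes, the sample ATE can be computed directly and compared with the difference in observed arm means.

```python
import numpy as np


def simulate_trial(n=20000, seed=0):
    """Simulate one two-arm RCT in which both potential outcomes are known.

    All numeric values are hypothetical and chosen only for illustration.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)                    # baseline covariate
    y0 = 1.0 + 0.5 * x + rng.normal(size=n)   # potential outcome under T = 0
    y1 = y0 + 2.0 + 0.3 * x                   # potential outcome under T = 1
    t = rng.integers(0, 2, size=n)            # randomization, independent of (y0, y1)
    y = np.where(t == 1, y1, y0)              # only one potential outcome is observed
    return y0, y1, t, y


def difference_in_means(y, t):
    """Observed contrast E[Y | T = 1] - E[Y | T = 0]."""
    return y[t == 1].mean() - y[t == 0].mean()


y0, y1, t, y = simulate_trial()
sample_ate = float(np.mean(y1 - y0))          # uses the (normally unobservable) ITEs
estimate = float(difference_in_means(y, t))   # uses observed data only
print(f"sample ATE: {sample_ate:.3f}, difference in means: {estimate:.3f}")
```

In this sketch the individual treatment effects vary with the covariate, yet the difference in means recovers their average, because treatment assignment is independent of the potential outcomes.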
Search algorithm, inclusion, and exclusion criteria
We performed a systematic search in four databases (Web of Science, PubMed, arXiv, and Google Scholar) using the search algorithm: “(causal* OR causat*) AND (meta–analysis OR metaanalysis OR “meta analysis” OR multi-level OR multilevel OR “multi level” OR hierarchical OR meta–synthesis OR “meta synthesis” OR metasynthesis)” from inception to April 2020 to identify relevant methodologies. The search was supplemented with manual searches and reference screening of all relevant studies. We focused on identifying studies that presented a causally explicit framework under a meta-analysis model for randomized controlled trials. We included methodologies that described causality under the counterfactual and potential outcomes frameworks. Studies that claimed causality using only the Bradford–Hill criteria were excluded. We also excluded studies on Granger causality, which is more pertinent to prediction than causation. Studies describing causal inference approaches that are not pertinent to evidence synthesis, as well as application studies, were excluded.
Results
Systematic methodology review
The search algorithm yielded a total of 17,280 titles. After initial screening, a total of 256 articles were assessed in full text for eligibility. Finally, only three distinct methodologies from four publications [6,7,8,9] describing a causal inference framework in a meta-analysis setting were included in this review (Fig. 1). Two methodologies were based on a meta-analysis setting using individual participant data and the third on a network meta-analysis setting using summary data. Below, we provide a brief description of these methodologies.
Causal inference for meta-analysis using IPD from independent RCTs
Sobel et al. [6] described a framework where causal estimates can be derived from a meta-analysis of RCTs when individual participant data (IPD) are available. The authors focus their work on identifying and accounting for possible sources of heterogeneity across trials. They restrict their focus to four possible sources of heterogeneity across trials: response inconsistency, nonequivalent treatments, nonignorable treatment assignment, and variability in the composition of units in different studies or settings. The identifiability conditions taken into account in the approach of Sobel et al. are presented in Table 2.
Notation
Let t denote the treatment with t ∈ T = (1,…, L) where T is a finite set of treatments. Let s denote the trial with s ∈ S = (1,…, m) where S is a finite set of trials. Then, T_{s} denotes the set of treatments in study s. Let X_{i} be a set of observed covariates, including both subject-level and trial-level covariates for subject i. Y_{i} ≡ Y_{i}(s_{i}; t_{i}) is the observed outcome for subject i in study s_{i} under treatment t_{i}.
The authors extend the potential outcome framework to multiple studies by considering the potential outcomes a subject would have had if they had participated in a different trial. Let t = (t_{1},…, t_{n}) and s = (s_{1},…,s_{n}), where t_{i} ∈ T, s_{i} ∈ S, i = 1,…,n, and let Y_{i}(s; t) denote the response subject i would have under the allocation s to studies and assignment t to treatments.
Assumption A1 is the extension of SUTVA to multiple trials and treatments. A2 states that each trial includes a random sample from its respective population. A3a and A3b state that the effect of a treatment is the same across studies, unconditionally or conditional on covariates. A4 further weakens these assumptions, stating that the relative effect of treatment t versus treatment t’ is the same across studies, unconditionally or conditional on covariates. A5a and A5b state that although different versions of a treatment may exist, their effect on the potential outcomes is equivalent. This allows several treatments to be grouped together. A6 is the classic unconfoundedness assumption. If all studies are randomized trials, this assumption is expected to hold unconditionally and conditional on X_{1}. Based on A7, the authors explicitly acknowledge that different trials may sample their subjects from different populations, but assume that, given a set of covariates X_{2}, subject assignment into trials is unconfounded. The authors comment that some of these assumptions are untestable by themselves, but if some of them are assumed to hold, one can test the plausibility of the others. Overall, this framework does not use a complex analytical approach; rather, it relies on the plausibility of the aforementioned assumptions and on correct model specification using study-level covariates and possibly treatment, study, and covariate interactions. The authors applied a standard Cox model to estimate the causal effect, which they justified based on A7.
Dahabreh et al. [7, 8] proposed a causal inference framework under which meta-analysis estimates are causally interpretable ATEs that are transportable to a target population. This approach requires IPD from the randomized trials along with baseline covariate data from a random sample of the target population, in order to account for differences in covariate distributions. They provide a set of identifiability conditions and also propose an estimand that takes into account the distributional differences between the trials and the target population (Table 3). This framework assumes that the observed data are obtained by random sampling from an infinite superpopulation of individuals which is stratified by study S. The authors describe this sampling scheme as “biased” sampling, since the sampled proportions are not expected to equal those in the superpopulation, owing to convenience sampling in the majority of RCTs. This framework assumes complete adherence to the trial protocol and no loss to follow-up, leading to the intention-to-treat effect being equal to the per-protocol effect.
Notation
Let t denote the treatment with t ∈ T = (1,…, L) where T is a finite set of treatments. Let s denote the trial with s ∈ S = (1,…, m) where S is a finite set of trials. Let also S = 0 denote the nonrandomized target population. Let X be a set of observed baseline covariates. E[Y^{t} − Y^{t’}∣ S = 0] denotes ATE in the target population for the treatments t and t’.
B1 implies that the treatment effect is consistent irrespective of trial participation. Conditions B2 and B3 are expected to hold within trials due to randomization. B4 implies that there is no trial effect on the ATE conditional on the baseline covariates X. Finally, B5 implies that the probability of observing the covariate patterns on which B4 relies should be nonzero. Under B1–B5, inferences are transportable from each trial to the target population. Specifically, under B4 and B5, the ATE is independent of study participation in S, within strata of baseline covariates. When multiple trials are pooled, the positivity assumption B5 can be relaxed by assuming that the trial-specific conditional ATE is equal to the conditional ATE of the target population given a subset of the baseline covariates X and that the probability of the covariate patterns occurring in different trials is nonzero.
All conditions B1–B5 assume perfect adherence and compliance in all trials and no trial attrition. The authors also provided extensions to the conditions in Table 3 for specific settings where this assumption does not hold.
Based on the above set of identification conditions, Dahabreh et al. [7, 8] provided two estimation approaches. The first approach models the conditional ATE directly from the pooled trial data and baseline covariates. The second estimation method is based on a weighting estimator, that is, applying trial-specific weights based on the probability of trial participation and treatment (see [7, 8] for details). In both cases, the authors suggest that Wald-type or bootstrap-based confidence intervals can be derived. The authors provided R code implementing the above estimators.
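To make the role of each condition concrete, the identification argument for a single trial s can be sketched as follows. This is a reconstruction from the prose description only, since Table 3 is not reproduced here: it assumes that B1 is consistency, that B2 and B3 are within-trial exchangeability and positivity of treatment, that B4 is conditional transportability of the treatment effect, and that B5 is positivity of trial participation. The symbol A for the assigned treatment is introduced here to avoid a clash with the treatment set T.

```latex
\begin{align*}
E[Y^{t} - Y^{t'} \mid S = 0]
  &= E\big[\, E[Y^{t} - Y^{t'} \mid X, S = 0] \,\big|\, S = 0 \big]
     && \text{(iterated expectations)} \\
  &= E\big[\, E[Y^{t} - Y^{t'} \mid X, S = s] \,\big|\, S = 0 \big]
     && \text{(B4, B5)} \\
  &= E\big[\, E[Y^{t} \mid X, S = s, A = t] - E[Y^{t'} \mid X, S = s, A = t'] \,\big|\, S = 0 \big]
     && \text{(B2, B3)} \\
  &= E\big[\, E[Y \mid X, S = s, A = t] - E[Y \mid X, S = s, A = t'] \,\big|\, S = 0 \big]
     && \text{(B1)}
\end{align*}
```

The last line contains only observable quantities: conditional outcome means within the trial, averaged over the covariate distribution of the target population.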
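The first (outcome-model) estimation approach can be sketched on simulated data. This is an illustrative sketch under stated assumptions and not the authors' R implementation: the linear outcome model, the three simulated trials, the covariate shift in the target sample, and all names (`simulate_trial`, `design`, `x_target`) are hypothetical. The conditional outcome mean is fitted on the pooled trial data and then averaged over the covariates of the target-population sample.

```python
import numpy as np

rng = np.random.default_rng(1)


def simulate_trial(n, x_mean):
    """One randomized trial; the treatment effect is 1.0 + 0.5 * x (hypothetical)."""
    x = rng.normal(loc=x_mean, size=n)
    t = rng.integers(0, 2, size=n)
    y = 0.5 + 1.0 * x + t * (1.0 + 0.5 * x) + rng.normal(size=n)
    return x, t, y


def design(x, t):
    """Design matrix with intercept, covariate, treatment, and their interaction."""
    return np.column_stack([np.ones_like(x), x, t, t * x])


# Three trials whose covariate distributions differ from the target population.
trials = [simulate_trial(5000, m) for m in (-0.5, 0.0, 0.5)]
x = np.concatenate([d[0] for d in trials])
t = np.concatenate([d[1] for d in trials])
y = np.concatenate([d[2] for d in trials])

# Baseline covariates only (no treatment, no outcome) from the target population.
x_target = rng.normal(loc=1.0, size=5000)

# Fit the outcome model on the pooled trials by ordinary least squares.
beta, *_ = np.linalg.lstsq(design(x, t), y, rcond=None)

# Predict both potential outcome means for the target sample and average.
mu1 = design(x_target, np.ones_like(x_target)) @ beta
mu0 = design(x_target, np.zeros_like(x_target)) @ beta
transported_ate = float(np.mean(mu1 - mu0))

pooled_trial_difference = float(y[t == 1].mean() - y[t == 0].mean())
print(f"pooled trial contrast: {pooled_trial_difference:.2f}")
print(f"transported ATE:       {transported_ate:.2f}")
```

With these simulated values the ATE in the target population is 1.0 + 0.5 × 1.0 = 1.5, whereas the pooled trial contrast is close to 1.0, because the trial participants have a lower average covariate value than the target population. The weighting estimator and the confidence intervals described above are not shown.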
Causal inference for network meta-analysis using summary data from independent RCTs
Schnitzer et al. [9] described a framework where causal estimates can be derived in a network meta-analysis setting by using aggregate data from multiple RCTs. This approach focuses on estimating an average treatment effect in the presence of heterogeneity arising from differences in study-level characteristics. The authors define a marginal and model-independent causal estimand and outline the key assumptions that are required for this estimand to be identifiable under measured study-level confounding. An arm-based network meta-analysis approach is adopted throughout the paper, which estimates the arm-specific effects, in contrast to the study-based approach that estimates the study-specific effects.
In the arm-based approach, the authors assume that each trial samples randomly from its respective population and that, due to randomization, each trial arm is representative of that population. Then, the authors define their superpopulation (coining the term metapopulation) as the union of the trials’ populations. Due to differences across trials, the authors assume that each trial may not be representative of the superpopulation and that each trial estimates its own effect. Therefore, in order to account for differences across trials, one would have to adjust for variables that contribute to differential treatment selection and to the outcome distribution.
Regarding the computational part, three estimation methods (G-computation, inverse probability of treatment weighting (IPTW), and targeted minimum loss-based estimation (TMLE)) are presented and compared in a simulation study. Briefly, in the G-computation method, a maximum likelihood substitution estimator of the G-formula [10], the authors start by fitting a regression model that estimates the outcome of each arm i in study j using all arms regardless of treatment assignment. In the next step, the predicted study effect under each treatment is estimated and a mean effect across studies is derived. Finally, the standard error of the G-computation estimate is derived by bootstrap methods. The disadvantage of this approach is that correct model specification is challenging. The second method is based on IPTW, where a propensity score that estimates the probability of each arm receiving the treatment is computed and the inverse of this probability is used as a weight to create a “pseudo-population” of study arms that is free from confounding bias from the study-level confounders. The disadvantage of this approach is that the number of study arms must be sufficiently large. IPTW using propensity scores also depends on correct model specification. TMLE is a doubly robust method that involves the estimation of arm-based effects by fitting a model for the expected value of the arm-based means and by obtaining the predictions of study-specific effects under treatment. In the next step, these predictions are updated by a no-intercept logistic regression using only the arms under treatment and a single covariate corresponding to the estimated probability from the IPTW method. Under a correct model specification for the propensity score, this method provides a consistent estimate. Based on a simulation study, under a correctly specified model, the G-computation method had the best performance, followed by TMLE. The G-computation and TMLE methods were more sensitive to model misspecification than IPTW; however, the latter was more biased and had a larger variance when the number of studies was small.
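The first two estimators can be sketched on simulated arm-level data. This is a simplified illustration under stated assumptions, not the implementation of Schnitzer et al.: it uses two treatments, a single study-level confounder `w`, a linear outcome model, and a deliberately large, hypothetical number of arms so that both estimators are stable. The helper `fit_logistic` is a plain Newton-Raphson routine written for this sketch. TMLE and the bootstrap standard errors are omitted.

```python
import numpy as np


def expit(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic(X, a, n_iter=25):
    """Unregularized logistic regression fitted by Newton-Raphson."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = expit(X @ beta)
        gradient = X.T @ (a - p)
        hessian = X.T @ (X * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(hessian, gradient)
    return beta


rng = np.random.default_rng(2)
n_arms = 4000                                   # hypothetical, far larger than a real network
w = rng.normal(size=n_arms)                     # study-level confounder attached to each arm
a = rng.binomial(1, expit(0.8 * w))             # which treatment an arm receives depends on w
y = 1.0 + 2.0 * a + 1.5 * w + rng.normal(scale=0.5, size=n_arms)  # arm-level mean outcome

# Unadjusted contrast: confounded by w.
naive = float(y[a == 1].mean() - y[a == 0].mean())

# G-computation: fit the outcome model on all arms, predict under each treatment, average.
ones = np.ones(n_arms)
coef, *_ = np.linalg.lstsq(np.column_stack([ones, a, w]), y, rcond=None)
pred1 = np.column_stack([ones, ones, w]) @ coef
pred0 = np.column_stack([ones, np.zeros(n_arms), w]) @ coef
gcomp = float(np.mean(pred1 - pred0))

# IPTW: weight each arm by the inverse probability of the treatment it received.
X_ps = np.column_stack([ones, w])
ps = expit(X_ps @ fit_logistic(X_ps, a))
weights = a / ps + (1 - a) / (1 - ps)
mean_treated = np.sum(weights * a * y) / np.sum(weights * a)
mean_control = np.sum(weights * (1 - a) * y) / np.sum(weights * (1 - a))
iptw = float(mean_treated - mean_control)

print(f"naive: {naive:.2f}, G-computation: {gcomp:.2f}, IPTW: {iptw:.2f}")
```

In this simulation the true effect is 2.0. The unadjusted contrast is biased upward because arms with larger `w` are both more likely to receive the treatment and have higher outcomes, while the two adjusted estimators should be close to 2.0 when their respective models are correctly specified, as they are here by construction.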
Conceptual framework considerations
Data generation process and sampling for common, fixed, and random effects
As shown above, approaches to the identification of causal effects under a meta-analytical framework, although scarce, do exist in the literature. However, there is a more fundamental problem that does not seem to have attracted enough attention. This problem pertains to the actual scientific relevance and/or clinical applicability of an otherwise valid causal estimate, and it translates directly to the specification of the data generation process of the studies and participants.
In the medical literature, meta-analysis has been a useful tool for summarizing the plethora of evidence on any specific topic. The two prevalent approaches to undertaking a meta-analysis are commonly known as the fixed-effect and the random-effects models. What is usually overlooked is that there are in fact two distinct sets of assumptions that lead to the same estimator derived from a fixed-effect model [11], which are denoted as the common-effect model and the fixed-effects model. The common-effect model is well known in the literature; it is the “classic” fixed-effect meta-analysis model. In this model, we assume that all identified studies are trying to estimate one common effect θ and that all differences between studies are attributed exclusively to sampling error. The fundamental assumption of this model is that all studies use data from their own populations, which in turn are random samples from the same superpopulation. Therefore, under the common-effect assumption, the estimator \( \hat{\theta} \) represents the weighted average estimate of θ from the several studies. The exact same estimator, however, can arise from an entirely different set of assumptions, denoted here as the fixed-effects model. Under this particular model, much as in the random-effects model, each study’s effect is an estimate of its own θ_{i}. The difference from the random-effects model is that under the fixed-effects model, we assume that the θ_{i}s are unrelated. This model merely states that each study estimates its own effect irrespective of the other study effects. In contrast, under the random-effects model, we assume that all study effects are a sample from a distribution of study effects. This means that each study’s effect, albeit different from the other studies’ effects, arises from the same distribution of effects, governed by parameters or characteristics of the mixture of distributions. The difference from the common-effect and fixed-effects models is that by using random effects, we shift the focus from describing the intervention effect on the underlying superpopulation to describing the characteristics of the distribution of the effect sizes. These three meta-analytical approaches in fact assume three distinct data generation processes and we argue that, depending on which approach one assumes, the pooled estimate for the causal treatment effect may be biased.
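The distinction between the estimators can be made concrete with a short sketch. The inverse-variance weighted average below is the estimator shared by the common-effect and fixed-effects models; the two models differ only in how it is interpreted. For the random-effects model, the between-study variance is estimated here with the DerSimonian-Laird method of moments, which is one common choice and is an assumption of this sketch rather than something prescribed by the reviewed methodologies. The study effects and variances are made-up numbers.

```python
import numpy as np


def inverse_variance_pool(effects, variances):
    """Weighted average with weights 1 / variance; returns (estimate, variance)."""
    effects = np.asarray(effects, dtype=float)
    variances = np.asarray(variances, dtype=float)
    w = 1.0 / variances
    theta = np.sum(w * effects) / np.sum(w)
    return float(theta), float(1.0 / np.sum(w))


def dersimonian_laird(effects, variances):
    """Random-effects pooling; returns (estimate, variance, tau2)."""
    effects = np.asarray(effects, dtype=float)
    variances = np.asarray(variances, dtype=float)
    w = 1.0 / variances
    theta_fixed, _ = inverse_variance_pool(effects, variances)
    q = np.sum(w * (effects - theta_fixed) ** 2)          # Cochran's Q
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)         # between-study variance
    theta_random, var_random = inverse_variance_pool(effects, variances + tau2)
    return theta_random, var_random, float(tau2)


# Hypothetical study-level effect estimates and their sampling variances.
effects = [0.10, 0.30, 0.35, 0.65, 0.45]
variances = [0.03, 0.02, 0.05, 0.01, 0.04]

theta_fixed, var_fixed = inverse_variance_pool(effects, variances)
theta_random, var_random, tau2 = dersimonian_laird(effects, variances)
print(f"common/fixed-effects estimate: {theta_fixed:.3f} (variance {var_fixed:.4f})")
print(f"random-effects estimate:       {theta_random:.3f} (variance {var_random:.4f}, tau2 {tau2:.4f})")
```

When the estimated between-study variance is zero, the random-effects weights reduce to the inverse-variance weights and the two pooled estimates coincide; otherwise the random-effects weights are more equal across studies and the pooled variance is larger.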
Interpretation and scientific relevance of the identified causal effects
Sobel et al. [6] made no remark on the choice of trials included in their meta-analysis. However, based on A7, they consider that each trial samples from its distinct population, which implies that the superpopulation is a mixture of each trial’s population. Schnitzer et al. [9] explicitly stated that the subjects in each trial are assumed to be random samples of their own populations and furthermore defined their superpopulation (metapopulation) as the union of each trial’s population. Essentially, these two methodologies assumed that each trial includes a random sample of its specific population and implicitly or explicitly assumed that the underlying population of all trials is the union of the trial-specific populations. These descriptions of the superpopulations are in line with the fixed-effects and the random-effects meta-analysis models but not with the common-effect model. This does not necessarily mean that the methodologies are completely incompatible with the common-effect model. For example, in Sobel et al. (assuming that assumptions A1, A2, and A6 hold), in the special case where assumptions A3a and A5a hold and all studies sample randomly from the same superpopulation (i.e., assumption A7 holds unconditionally on X_{2}), the framework essentially collapses to the common-effect model, irrespective of the model of the effect size H specified in assumption A4. A more realistic scenario, however, is when assumption A7 holds conditionally, as described in the original paper, where the sample populations are not necessarily equivalent to the superpopulation. Since the set of covariates X_{2} may include both individual-level and study-level covariates, it seems even more unlikely for A7 to hold unconditionally, since this would mean at least equivalence of the distributions of the individual-level covariates across studies. On the contrary, the approach of Dahabreh et al. [7, 8] explicitly assumed that the data were obtained by random sampling from an infinite superpopulation (stratified by study S). Dahabreh et al. noted that this procedure would lead to a “biased” sample, i.e., the probability of individuals being included in a study differs between the infinite superpopulation and the sampled data, but stated that the identifiability of the causal effects is unaffected. The authors acknowledge that such a hypothetical “infinite superpopulation” is (as in every frequentist approach to statistical inference) more of a convenience than a likely existing population of individuals. The same holds for the superpopulation/metapopulation defined in the other methods. Even then, such a superpopulation of individuals seems more plausible than an infinite population of effect sizes from which the observed study effects are sampled under the random-effects model [7, 8, 12]. Not only is it unlikely that such a population of effect sizes exists, but the interpretation of the summary effect also pertains more to the characterization of the said distribution than to the description of the effect on the superpopulation.
Sobel et al. did not explicitly address how their approach corresponds to common, fixed, or random effects. Instead, the model to be used for the estimation of the effect size was represented by a generalized function H. Depending on the statistical model described by the function H, all three meta-analysis models can be applied. In the provided example, the model used is similar to a one-stage IPD meta-analysis and equivalent to the common-effect and fixed-effects models. Schnitzer et al. also did not make any explicit assumptions regarding the compatibility of the proposed approach with the three meta-analysis models. However, the simulation study that they performed to compare the efficacy of the estimators only focused on random-effects methods. Finally, the focus of the methodological approach described by Dahabreh et al. is the identification and estimation of a valid causal effect that is applicable to a specific target population with specific characteristics. Therefore, it is not directly equivalent to any of the three usual meta-analysis models, whose primary focus leans towards the generalizability rather than the transportability of the effects.
All methodologies provide a valid approach to estimate an effect that has a causal interpretation in the respective superpopulations. However, it is also important to consider whether these superpopulations are actually plausible populations that exist naturally. To that end, one has to consider the data generation process of the trials. While it is often reasonable to assume that each trial samples randomly from its respective population, we have to acknowledge that in most cases it seems implausible that these same trials are a random sample of a population of trials. Had we had a random sample of trials, this sample would allow the covariate distribution of the superpopulation to be consistent with the covariate distribution of a naturally existing population. However, this is rarely the case, as conducting a trial is largely a function of very specific motives and aims [13, 14], and one may argue that a random sample of trials may not naturally occur. And while this amalgamation of trials does not directly affect the underlying assumptions invoked by a fixed-effects or a random-effects meta-analysis, which pertain to the sampling of the effect estimates rather than the sampling of trials [13, 15], we argue that, without taking into account any potential differences between the structure of the superpopulation and that of a naturally occurring population, the generalizability of the produced results (which would have a pertinent causal interpretation for the superpopulation) would be hindered. Therefore, one must be very careful regarding the generalizability of the meta-analytical causal estimates. One approach would be to ignore this by implicitly or explicitly assuming that the meta-analytical causal estimate for the superpopulation is the same as, or sufficiently close to, the actual estimate for the natural population. Alternatively, one could acknowledge this caveat regarding generalizability but describe the estimate nonetheless. The obtained estimate would still be the best available description of the effect. The approach of Dahabreh et al. partially ameliorates this by following an alternative approach which focuses on assumptions about the trial populations rather than assumptions about the trial effects, as discussed earlier.
Discussion
In this work, we reviewed published methodologies pertaining to a causally explicit description of meta-analyses of RCTs and other similar evidence synthesis frameworks, such as multilevel or hierarchical frameworks, with regard to obtaining a causally interpretable meta-analysis estimate. Overall, we identified three methodologies directly pertinent to a causally explicit description of a meta-analytical estimate.
The first methodology [6] provided a set of seven causal inference assumptions under which a meta-analytical causal effect is identifiable and estimable. This methodology requires individual participant data from all trials, similar to a one-stage meta-analysis. However, it differs computationally from a “classic” one-stage meta-analysis in that it only fits a regression model, whereas in a one-stage meta-analysis it is standard to use a hierarchical model for estimation. The second methodology [7, 8] was also based on individual participant data from trials, together with baseline data from a target population, using a well-defined causal inference framework. A drawback of this approach is that it requires baseline data from the target population to which the causal effects are to be transported. As it is often unlikely for data from the target population to be readily available, the actual applicability of this approach seems somewhat limited. Finally, the last methodology [9] focused on summary data from network meta-analysis. This approach tries to account for differences across trials in order to estimate a marginal causal effect that refers to a superpopulation defined as the union of the populations the trials sample from. Overall, the methodologies presented in this review address two distinct aspects of statistical inference: the generalizability of the effects to the population from which the data are sampled, and the transportability of the effects, which allows for inferences to a new target population. While both are equally important, our review focused more on generalizability, which addresses the internal validity of the estimates and is more often the focus in the literature on meta-analyses of RCTs.
An inherent limitation of such approaches is the data generation process for the trial selection. Although all methodologies provide valid causal estimates, these are restricted to their respective superpopulations. As it is rarely the case that the actual trials are random samples from a population of trials that in turn sample randomly from a naturally occurring population, it is evident that these superpopulations may differ substantially from a naturally occurring population. While this problem may not hamper the interpretation of the results from a “classic” meta-analysis where no causal interpretation is made, that is, the description of the distribution of the effect estimates (for random-effects meta-analysis), it would severely hinder any efforts for a causal interpretation of the said results. Therefore, although such estimates are helpful in summarizing and providing the best description available for this evidence, caution is needed when trying to make inferences. One would have to consider the differences in structure between the two (naturally occurring and theoretically constructed) superpopulations and consider whether the estimated causal effect is relevant. Only the approach of Dahabreh et al. [7, 8] recognized this limitation in the data generation process and provided a limited solution under certain assumptions.
There is an extensive literature on extensions of causal inference focusing on complex single-trial settings [16, 17], on surrogate endpoints [18], or on non-randomized data (observational or quasi-experimental settings) using multilevel or hierarchical frameworks [19]. A comprehensive review of multilevel models for causal inference focusing on randomized experiments in education was recently published [20]. Finally, efforts to synthesize data from multiple sources (observational and experimental) also exist [21]. However, to our knowledge, this is the first systematic effort to survey relevant methodologies which explicitly expand the causal inference framework to incorporate data from multiple trials by using an evidence synthesis approach.
Other approaches exist in evidence synthesis that can be used to investigate causality. A causal inference approach in meta-analysis of RCTs focusing on a slightly different scientific question from the one presented in our review was described by Zhou et al. [22], who focused on the estimation of the complier average causal effect (CACE) based on meta-analysis of RCTs with non-compliance. Under the principal stratification framework [23], which takes into account non-compliance in trials, CACE is an alternative to ATE for the estimation of causal effects. Zhou et al. [22] presented a novel approach to estimating CACE based on a Bayesian hierarchical model, using study-specific random effects to account for heterogeneity across trials. Although this study provided insights into CACE estimation from meta-analysis, it did not provide an explicit causal inference framework, but only reflected upon the single-trial assumptions of the principal stratification framework, leaving aside key components, such as the exchangeability across trials. Mendelian randomization studies are also alternative approaches based on evidence synthesis which can be used to investigate causality [24]. However, the key difference between these methodologies and the ones presented in this paper is that in Mendelian randomization studies, one starts from the key assumption that the studied association is causal and then proceeds to synthesize the available data. In contrast, causal inference approaches, such as the ones presented in this paper, aim to identify potential causal relationships in an observed association.
Conclusions
Although causal inference methodology and meta-analysis of randomized controlled trials are regarded as two of the most useful tools for refining the evidence-based hierarchy, there has been only limited effort in the literature to combine these approaches in order to attain higher levels of causally interpretable evidence. To date, only a limited number of methodological frameworks have addressed this issue, providing ways to obtain causal estimates from meta-analyses of randomized controlled trials. Although all three identified methodologies would produce a valid causal estimate, the interpretation and generalizability of the causal estimands may prove challenging because of potential violations of study protocols and limitations in the assumptions regarding the data generation process of the RCTs that could be included in the meta-analysis.
Availability of data and materials
Data sharing is not applicable to this article as no datasets were generated or analyzed during the current study.
Abbreviations
ATE: Average treatment effect
CACE: Complier average causal effect
ITE: Individual treatment effect
RCT: Randomized controlled trial
IPD: Individual participant data
IPTW: Inverse probability of treatment weighting
SUTVA: Stable unit treatment value assumption
TMLE: Targeted minimum loss-based estimation
References
1. Sackett DL, Rosenberg WM, Gray JA, Haynes RB, Richardson WS. Evidence based medicine: what it is and what it isn’t. BMJ. 1996;312(7023):71–2. https://doi.org/10.1136/bmj.312.7023.71.
2. Hernán MA. A definition of causal effect for epidemiological research. J Epidemiol Community Health. 2004;58(4):265–71. https://doi.org/10.1136/jech.2002.006361.
3. Rubin DB. Estimating causal effects of treatments in randomized and nonrandomized studies. J Educ Psychol. 1974;66(5):688–701. https://doi.org/10.1037/h0037350.
4. Hernán MA, Robins JM. Using big data to emulate a target trial when a randomized trial is not available. Am J Epidemiol. 2016;183(8):758–64. https://doi.org/10.1093/aje/kwv254.
5. Holland PW. Statistics and causal inference. J Am Stat Assoc. 1986;81(396):945–60. https://doi.org/10.1080/01621459.1986.10478354.
6. Sobel M, Madigan D, Wang W. Causal inference for meta-analysis and multi-level data structures, with application to randomized studies of Vioxx. Psychometrika. 2017;82(2):459–74. https://doi.org/10.1007/s11336-016-9507-z.
7. Dahabreh IJ, Steingrimsson JA, Robertson SE, Petito LC, Hernán MA. Efficient and robust methods for causally interpretable meta-analysis: transporting inferences from multiple randomized trials to a target population. arXiv e-prints. 2019:arXiv:1908.09230. Available from: https://ui.adsabs.harvard.edu/abs/2019arXiv190809230D.
8. Dahabreh IJ, Petito LC, Robertson SE, Hernán MA, Steingrimsson JA. Toward causally interpretable meta-analysis: transporting inferences from multiple randomized trials to a new target population. Epidemiology. 2020;31(3):334–44. https://doi.org/10.1097/EDE.0000000000001177.
9. Schnitzer M, Steele R, Bally M, Shrier I. A causal inference approach to network meta-analysis. J Causal Inference. 2016;4(2):20160014. https://doi.org/10.1515/jci-2016-0014.
10. Robins J. A new approach to causal inference in mortality studies with a sustained exposure period - application to control of the healthy worker survivor effect. Math Model. 1986;7(9-12):1393–512. https://doi.org/10.1016/0270-0255(86)90088-6.
11. Rice K, Higgins JPT, Lumley T. A re-evaluation of fixed effect(s) meta-analysis. J R Stat Soc Ser A Stat Soc. 2018;181(1):205–27. https://doi.org/10.1111/rssa.12275.
12. Hernán MA, Robins JM. Causal inference: what if. Boca Raton: Chapman & Hall/CRC; 2020.
13. Colditz GA, Burdick E, Mosteller F. Heterogeneity in meta-analysis of data from epidemiologic studies: a commentary. Am J Epidemiol. 1995;142(4):371–82. https://doi.org/10.1093/oxfordjournals.aje.a117644.
14. Peto R. Why do we need systematic overviews of randomized trials? Stat Med. 1987;6(3):233–44. https://doi.org/10.1002/sim.4780060306.
15. Higgins JP, Thompson SG, Spiegelhalter DJ. A re-evaluation of random-effects meta-analysis. J R Stat Soc Ser A Stat Soc. 2009;172(1):137–59. https://doi.org/10.1111/j.1467-985X.2008.00552.x.
16. Balzer LB, Zheng W, van der Laan MJ, Petersen ML. A new approach to hierarchical data analysis: targeted maximum likelihood estimation for the causal effect of a cluster-level exposure. arXiv e-prints. 2017:arXiv:1706.02675. Available from: https://ui.adsabs.harvard.edu/abs/2017arXiv170602675B.
17. Forastiere L, Mealli F, VanderWeele TJ. Identification and estimation of causal mechanisms in clustered encouragement designs: disentangling bed nets using Bayesian principal stratification. J Am Stat Assoc. 2016;111(514):510–25. https://doi.org/10.1080/01621459.2015.1125788.
18. Van der Elst W, Molenberghs G, Alonso A. Exploring the relationship between the causal-inference and meta-analytic paradigms for the evaluation of surrogate endpoints. Stat Med. 2016;35(8):1281–98. https://doi.org/10.1002/sim.6807.
19. Verbitsky N. Associational and causal inference in spatial hierarchical settings: theory and applications; 2007.
20. Raudenbush SW, Schwartz D. Randomized experiments in education, with implications for multilevel causal inference. In: Reid N, Stigler SM, Louis TA, editors. Annual review of statistics and its application, vol. 7. Palo Alto: Annual Reviews; 2020. p. 177–208.
21. Wang C, Rosner GL. A Bayesian nonparametric causal inference model for synthesizing randomized clinical trial and real-world evidence. Stat Med. 2019;38(14):2573–88. https://doi.org/10.1002/sim.8134.
22. Zhou J, Hodges JS, Suri MFK, Chu H. A Bayesian hierarchical model estimating CACE in meta-analysis of randomized clinical trials with noncompliance. Biometrics. 2019;75(3):978–87. https://doi.org/10.1111/biom.13028.
23. Frangakis CE, Rubin DB. Principal stratification in causal inference. Biometrics. 2002;58(1):21–9. https://doi.org/10.1111/j.0006-341X.2002.00021.x.
24. Smith GD, Ebrahim S. ‘Mendelian randomization’: can genetic epidemiology contribute to understanding environmental determinants of disease? Int J Epidemiol. 2003;32(1):1–22. https://doi.org/10.1093/ije/dyg070.
Acknowledgements
Not applicable.
Funding
Not applicable.
Author information
Contributions
GM and EEN designed this study. GM and EEN performed the literature search and the selection of studies. GM wrote the manuscript. GV and EEN provided comments on the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Markozannes, G., Vourli, G. & Ntzani, E. A survey of methodologies on causal inference methods in meta-analyses of randomized controlled trials. Syst Rev 10, 170 (2021). https://doi.org/10.1186/s13643-021-01726-1