A comparison of meta-analytic methods for synthesizing evidence from explanatory and pragmatic trials
Systematic Reviews volume 7, Article number: 19 (2018)
The pragmatic–explanatory continuum indicator summary version 2 (PRECIS-2) tool has recently been developed to classify randomized clinical trials (RCTs) as pragmatic or explanatory based on their design characteristics. Given that treatment effects in explanatory trials may be greater than those obtained in pragmatic trials, conventional meta-analytic approaches may not accurately account for the heterogeneity among studies and may result in biased treatment effect estimates. This study investigates whether incorporating the PRECIS-2 classification of published trials can improve the estimation of overall intervention effects in meta-analysis.
Using data from 31 published trials of interventions aimed at reducing obesity in children, we evaluated the utility of incorporating PRECIS-2 ratings of published trials into meta-analyses of intervention effects in clinical trials. Specifically, we compared random-effects meta-analysis, stratified meta-analysis, random-effects meta-regression, and mixture random-effects meta-regression methods for estimating overall pooled intervention effects.
Our analyses revealed that mixture meta-regression models that incorporate the PRECIS-2 classification as a covariate resulted in a larger pooled effect size (ES) estimate (ES = − 1.01, 95%CI = [− 1.52, − 0.43]) than conventional random-effects meta-analysis (ES = − 0.15, 95%CI = [− 0.23, − 0.08]).
In addition to the original intent of the PRECIS-2 tool of aiding researchers in their choice of trial design, the tool is useful for explaining between-study variation in systematic reviews and meta-analyses of published trials. We recommend that researchers adopt mixture meta-regression methods when synthesizing evidence from explanatory and pragmatic trials.
Randomized controlled trials (RCTs) are cited as the highest level of evidence that can inform clinical and policy decisions about the efficacy and/or effectiveness of an intervention [1,2,3]. However, RCTs are generally costly, with many stringent inclusion and exclusion criteria that limit the generalizability of results and their relevance to routine clinical practice. Consequently, there is increased interest in designing RCTs that show the real-world effectiveness of an intervention in broad patient populations [4,5,6,7]. Schwartz and Lellouch proposed a distinction between explanatory trials, which confirm a physiological or clinical hypothesis, and pragmatic trials, which inform a clinical or policy decision by providing evidence for adoption of the intervention into real-world clinical practice. Since their seminal paper, several papers have investigated the strengths and limitations of pragmatic trials [4,5,6,7,8,9,10,11,12]. Thorpe et al. [13, 14] proposed the original PRECIS (pragmatic–explanatory continuum indicator summary) tool, which further clarified the concept and features of pragmatism and provided a scoring system and graphical representation of the pragmatic features of a trial. Loudon et al. [15, 16] later proposed a revision of PRECIS, called PRECIS-2, a 9-item tool to assess the characteristics of a pragmatic design. Features of the PRECIS-2 tool include the recruitment of investigators and participants, the intervention and its delivery, follow-up, and the determination and analysis of outcomes. Many trials could be deemed pragmatic with regard to at least one of these dimensions, but few are truly pragmatic on all dimensions.
A number of studies have explored the use of PRECIS instruments when synthesizing evidence from published trials. For example, Patsopoulos suggests that “systematic reviews and meta-analyses could incorporate a PRECIS score for synthesized trials and help the systematic mapping of the pragmatism in published research”. Yoong et al. investigated the impact of pragmatic–explanatory study design characterization on the conclusions of systematic reviews of public health interventions in obesity trials. They observed no differences among the intervention effects across classifications of the synthesized studies based on PRECIS ratings. Koppenaal et al. applied a modified version of PRECIS, called the PRECIS review tool, to judge the applicability of studies to daily clinical practice in two systematic reviews. Tosh et al. proposed the pragmascope, an adapted version of the PRECIS tool that uses a 5-point scale to assess the degree of pragmatism when designing RCTs in mental health. Witt et al. conducted a systematic analysis of trials of acupuncture for lower back pain with the intention of applying the PRECIS tool. Glasgow et al. also used the PRECIS tool to describe the design features of three effectiveness trials investigating weight loss in obese patients with comorbid conditions. More recently, Jordan et al. demonstrated the potential benefit of using the PRECIS-2 instrument for aiding systematic review and meta-analysis of studies in hepatitis C virus care. Louma et al. used PRECIS-2 to identify interventions that effectively increased physical activity and glycemic control among patients with type 2 diabetes and to assess the potential use of PRECIS-2 for implementing physical activity interventions in clinical practice settings.
While the uptake of PRECIS instruments (i.e., PRECIS and PRECIS-2) in systematic reviews is increasing, these instruments are mostly used descriptively; their impact in explaining heterogeneity in meta-analytic investigations has not been investigated. Given the variations in study designs, differences in the study characteristics of explanatory and pragmatic trials are likely to influence both statistical heterogeneity and intervention effect estimates in meta-analyses. Aves et al. argue that “… if heterogeneity is substantial, due to the degree of pragmatism, it might not be appropriate to pool data from pragmatic and explanatory trials …”. Although modern meta-analytic methods such as mixture meta-regression and robust meta-analytic methods have been developed to pool evidence from heterogeneous populations [26,27,28], there is limited application of these methods, and of PRECIS ratings, in synthesizing evidence from explanatory and pragmatic trials.
This study aimed to assess whether the incorporation of the PRECIS classification could improve the modeling of heterogeneity among published trials in meta-analytic investigations. Using data from a Cochrane systematic review of 31 trials of community-based obesity interventions in children, we compared the performance of random-effects, stratified random-effects, random-effects meta-regression, and mixture random-effects meta-regression techniques that accounted for differences between explanatory and pragmatic trials when synthesizing evidence from published trials.
The pragmatic–explanatory continuum indicator summary (PRECIS-2)
PRECIS was developed by a group of international researchers and methodologists to assist trialists in distinguishing between pragmatic and explanatory trial designs [13, 14]. PRECIS requires trialists to indicate on a visual scale (in the shape of a wheel) where a trial falls along the pragmatic–explanatory continuum. More recently, a revision of the PRECIS tool, PRECIS-2, was developed. It consists of nine domains: eligibility, recruitment, setting, organization, flexibility in intervention delivery, flexibility in adherence, follow-up, primary outcome, and primary analysis. Each domain is rated on a 5-point Likert scale from 1 (completely explanatory) to 5 (completely pragmatic).
Systematic review of obesity prevention trials
Data were from the Cochrane systematic review of trials that investigated the efficacy or effectiveness of community-based obesity prevention interventions in children. The systematic review included all RCTs published between 1990 and March 2010. Similar to the previous work by Yoong et al., we used an adapted version of the PRECIS-2 tool to conduct an audit of all 31 trials of children aged 6–12 years included in the Cochrane review of obesity trials, to assess the pragmatic–explanatory design features of these studies.
Raters and rating procedures
Before rating the trials in this systematic review, three study co-authors (TTS, OA, MW) first read and discussed relevant papers on PRECIS [13,14,15,16] and piloted their knowledge of the PRECIS tool on 5 randomly selected published trials. The raters then independently rated each of the 31 trials on the 9 domains of the PRECIS-2 tool. Each domain was scored on a 5-point scale ranging from 1 (completely explanatory) to 5 (completely pragmatic), using the broad definitions provided by the tool developers [15, 16]. The authors then met to discuss variations in scoring and reached a consensus where there were discrepancies. For each investigator and each trial, an overall summary score was derived by averaging the ratings of the 9 items. Higher scores indicated a more pragmatic trial, while lower scores indicated a more explanatory trial. Since no cut-off scores were provided by the original authors, we applied a scoring method for categorizing the trials as explanatory or pragmatic. Specifically, we classified a trial as explanatory if its average score was less than 3.0 and as pragmatic if its average PRECIS-2 score was at least 3.0.
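As a concrete illustration, the averaging and cut-off rule above can be sketched in a few lines of Python (the paper's analyses were conducted in R; the domain scores below are hypothetical, and missing domains are skipped in line with the handling of incomplete trial reports described later):

```python
def classify_trial(domain_scores, cutoff=3.0):
    """Average the available PRECIS-2 domain scores (1-5) and classify.

    domain_scores: one rating per PRECIS-2 domain; None marks a domain
    that could not be scored from the published report and is skipped.
    Returns (mean score, "pragmatic" or "explanatory") using the paper's
    cut-off of 3.0.
    """
    scored = [s for s in domain_scores if s is not None]
    mean = sum(scored) / len(scored)
    return mean, ("pragmatic" if mean >= cutoff else "explanatory")
```

For example, a trial rated [4, 5, 3, 4, 4, 5, 4, 3, 4] averages 4.0 and would be classified as pragmatic, while a trial averaging below 3.0 would be classified as explanatory.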
Descriptive statistics were used to summarize the average domain-specific and overall PRECIS-2 scores across the 31 studies included in this analysis. The Fleiss kappa statistic was used to assess inter-rater reliability among the domain-specific and overall ratings of each trial on the PRECIS-2 scale. Four meta-analytic methods were used to assess changes in conclusions about the overall intervention effect on the body mass index of these children. These included (i) conventional random-effects meta-analysis; (ii) stratified random-effects meta-analysis, in which effect sizes from pragmatic trials and explanatory trials were independently pooled; (iii) random-effects meta-regression adjusted for PRECIS-2 rating (explanatory vs pragmatic); and (iv) mixture random-effects meta-regression adjusted for PRECIS-2 rating (pragmatic vs explanatory). For each model, we report the pooled effect size (ES), 95% confidence interval, between-study variance (τ2), and Bayesian information criterion. All analyses, including kappa estimation, were conducted using R software.
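For readers unfamiliar with model (i), the standard DerSimonian–Laird moment estimator is one common way to fit a random-effects meta-analysis (in R this is provided by packages such as metafor). A minimal self-contained Python sketch, not the authors' actual code, is:

```python
import math

def dersimonian_laird(effects, variances):
    """DerSimonian-Laird random-effects pooling.

    effects   : per-study effect sizes (e.g., standardized mean differences)
    variances : per-study sampling variances
    Returns (pooled ES, 95% CI lower, 95% CI upper, tau-squared).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]                   # fixed-effect weights
    sw = sum(w)
    y_fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q statistic measures observed heterogeneity
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)                 # between-study variance
    w_star = [1.0 / (v + tau2) for v in variances]     # random-effects weights
    sws = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sws
    se = math.sqrt(1.0 / sws)
    return pooled, pooled - 1.96 * se, pooled + 1.96 * se, tau2
```

When the studies are heterogeneous, τ² is positive and the random-effects weights down-weight precise studies relative to fixed-effect pooling, widening the confidence interval accordingly.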
Of the 31 studies included in our analysis, 12 trials focused on physical activity interventions only, 5 on dietary interventions only, and 14 on a combination of physical activity and dietary interventions. As reported in the Cochrane review, the standardized mean differences in body mass index between the intervention and control groups for these 31 trials ranged between − 0.36 and 0.45 (Fig. 1). The inter-rater reliability among our independent raters, as measured by the Fleiss kappa, ranged between 0.41 and 0.86, indicating moderate to substantial agreement across all domains, with the flexibility (delivery) and follow-up domains showing the lowest agreement (κ = 0.41 and 0.48, respectively). The average overall ratings ranged between 2.44 and 4.56. Using a cut-off of 3.0, five studies were classified as explanatory while the remaining 26 studies were classified as pragmatic (see Table 1 and Fig. 2).
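The Fleiss kappa values reported above can be reproduced from first principles. A short sketch (with hypothetical data; each row gives, for one rated item, how many of the m raters chose each score category) is:

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa for m raters assigning n items to categories.

    ratings: list of per-item category counts, e.g. [[3, 0], [1, 2], ...];
    each row must sum to the number of raters m.
    """
    n = len(ratings)
    m = sum(ratings[0])
    n_cat = len(ratings[0])
    # mean observed agreement across items
    p_bar = sum((sum(c * c for c in row) - m) / (m * (m - 1))
                for row in ratings) / n
    # chance agreement from the marginal category proportions
    totals = [sum(row[j] for row in ratings) for j in range(n_cat)]
    p_e = sum((t / (n * m)) ** 2 for t in totals)
    return (p_bar - p_e) / (1 - p_e)
```

Perfect agreement among raters yields κ = 1, while values of 0.41–0.86, as observed here, fall in the moderate-to-substantial range on the Landis and Koch scale.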
Table 2 and Fig. 3 describe the estimates of pooled intervention effect size based on the random-effects meta-analytic methods for the 31 published trials. Conventional pooling of the intervention effects across all studies based on random-effects meta-analysis suggested a statistically significant pooled ES of − 0.15 (95%CI = [− 0.23, − 0.08]). When we fitted stratified meta-analyses independently for pragmatic and explanatory trials, the meta-analysis of the explanatory trials revealed no significant pooled intervention effect (ES = − 0.32; 95%CI = [− 0.88, 0.33]), but the pooled intervention effect for the pragmatic trials was statistically significant (ES = − 0.12; 95%CI = [− 0.19, − 0.06]). Notably, the pooled intervention effect from the explanatory trials was about 2.5 times the pooled effect size obtained from the pragmatic trials. Meta-regression methods that adjusted for overall PRECIS-2 ratings revealed significantly larger pooled effect sizes and smaller τ2 than the conventional overall pooled effect size. Specifically, the random-effects meta-regression model that adjusted for PRECIS-2 rating revealed a statistically significant pooled effect size (ES = − 0.79, 95%CI = [− 1.26, − 0.31]) that is more than five times larger than the estimated pooled effect size obtained from the conventional meta-analysis model. The mixture random-effects meta-regression model that controlled for PRECIS-2 rating (pragmatic vs explanatory) revealed an even larger, statistically significant pooled effect size (ES = − 1.05, 95%CI = [− 1.53, − 0.54]) (Fig. 3).
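At its core, a meta-regression with a binary trial-type moderator amounts to weighted least squares on the effect sizes. The simplified sketch below takes τ² as given rather than estimating it jointly with the coefficients (a full fit, as in R's metafor, would estimate both); the data are hypothetical:

```python
def wls_meta_regression(effects, variances, moderator, tau2):
    """Weighted least-squares meta-regression with one moderator.

    effects   : per-study effect sizes
    variances : per-study sampling variances
    moderator : per-study covariate, e.g. 0 = explanatory, 1 = pragmatic
    tau2      : assumed between-study variance (simplification)
    Returns (intercept, slope): the intercept is the fitted ES for
    explanatory trials; the slope is the pragmatic-explanatory difference.
    """
    w = [1.0 / (v + tau2) for v in variances]          # inverse-variance weights
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, moderator)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar)
              for wi, xi, yi in zip(w, moderator, effects))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, moderator))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    return intercept, slope
```

With a binary moderator this recovers, in effect, separate weighted means for the two trial types, which is why the adjusted pooled estimates can differ so sharply from the unadjusted overall pool when the strata diverge.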
The incorporation of the PRECIS-2 classification of trials as a covariate in meta-regression models results in larger estimates of pooled intervention effects in published trials than previously reported pooled effect sizes [19, 29]. Meta-regression methods, such as mixture regression models that control for PRECIS-2 rating as a covariate, are particularly advantageous in that they can account for heterogeneity among the studies by modeling the heterogeneity attributable to pragmatism using a mixture distribution. This finding supports results from previous research that recognizes the importance of accounting for between-study heterogeneity attributable to the degree of pragmatism in systematic reviews of published trials [18, 27]. It highlights the utility of the PRECIS-2 tool for aiding the synthesis of evidence from published trials and the impact that information about design characteristics can have in explaining between-study heterogeneity when conducting meta-analyses of published studies. We recommend that researchers use PRECIS-2 rating information not only descriptively in meta-analyses of published studies but also inferentially, to model heterogeneity when estimating pooled intervention effects.
We also found that stratified meta-analytic methods, in which explanatory and pragmatic trials are defined with explicit criteria and independently synthesized, supported the notion that there was an overall significant effect of community-based obesity interventions in children in pragmatic trials but not in explanatory trials. The estimated pooled effect size obtained from the synthesis of pragmatic trials was significantly smaller than that obtained from the synthesis of explanatory trials. This finding is in line with previous research showing that explanatory trials often report larger effect sizes than pragmatic trials. One main advantage of the stratified meta-analysis methodology is that it can help researchers and policy makers understand the strength of evidence for the efficacy and/or effectiveness of an intervention in a population. It can also aid policy decision making about an intervention when the pooled effect sizes in pragmatic and explanatory trials are in the same direction. However, policy decision making based on this stratified approach may not always be straightforward, especially when the estimated pooled intervention effect sizes in the explanatory and pragmatic trials are in opposite directions. A few studies have recommended that policy decision making should be based on evidence from pragmatic trials only, since they confirm the real-world effectiveness of an intervention [23, 24, 32]. However, this recommendation may not be valid when only a limited number of pragmatic trials are included in the systematic review, owing to low statistical power.
Our comparisons of the meta-analytic methods for synthesizing evidence from published trials rely on observed data only. Future research will explore the use of Monte Carlo methods to examine the statistical properties of these methods, including their statistical power, bias, mean square error, and coverage, under a variety of data-analytic conditions. Importantly, the accuracy of the pooled intervention effects obtained from meta-regression analysis hinges on the accuracy of the PRECIS-2 ratings used to assess the degree of pragmatism in each trial. While the ratings obtained from the three reviewers in our study had good overall inter-rater agreement, the flexibility (delivery) and follow-up domains of PRECIS-2 exhibited only moderate agreement. This is consistent with previous studies that report high variability or poor agreement on the flexibility and/or follow-up domains [18, 19, 23, 24]. Additionally, we had missing scores on some of the PRECIS domains for some studies because the original articles lacked this information. Although Yoong et al. recommend that investigators contact the primary authors of each published trial with incomplete information when using PRECIS-2, we did not contact the authors of the original articles but instead derived the mean scores on the PRECIS domains and the overall score from all available scores [18, 19]. Future research will use sensitivity analysis to assess the impact of missing data on estimates of pooled effect sizes from meta-analytic investigations. Moreover, while we have analyzed the overall PRECIS-2 scores for these published trials, the component domains that constitute this overall score may have ratings that vary along the explanatory–pragmatic continuum. The PRECIS domain-specific information about these trials might provide policy makers with relevant information (e.g., about implementation of interventions).
This study shows that incorporating information about the type of trial (explanatory or pragmatic), assessed using the PRECIS-2 tool, can influence the estimation of pooled intervention effects in meta-analyses of published trials when there is substantial heterogeneity attributable to pragmatism. Second, this study reveals the need for meta-regression methods that adjust for PRECIS information as a covariate when estimating pooled intervention effects in meta-analytic investigations. This ensures that valid conclusions are derived from systematic reviews and meta-analyses of published trials. We recommend that meta-analytic investigations in systematic reviews incorporate information about design characteristics using PRECIS-2 when synthesizing evidence from published studies.
CI: 95% confidence interval
PRECIS-2: Pragmatic–explanatory continuum indicator summary version 2
RCT: Randomized controlled trial
Barton S. Which clinical studies provide the best evidence: the best RCT still trumps the best observational study. BMJ. 2000;321(7256):255–6.
Akobeng AK. Understanding randomized controlled trials. Arch Dis Child. 2005;90:840–4.
Evans D. Hierarchy of evidence: a framework for ranking evidence evaluating healthcare interventions. J Clin Nurs. 2003;12:77–84.
Rothwell PM. External validity of randomized controlled trials: “to whom do the results of this trial apply?”. Lancet. 2005;365:82–93.
Treweek S, Zwarenstein M. Making trials matter: pragmatic and explanatory trials and the problem of applicability. Trials. 2009;10:37–10.
Ware JH, Hamel MB. Pragmatic trials—guides to better patient care? N Engl J Med. 2011;364:1685–7.
Chalkidou K, Tunis S, Whicher D, Fowler R, Zwarenstein M. The role of pragmatic randomized controlled trials (pRCTs) in comparative effectiveness research. Clin Trials. 2012;9:436.
Schwartz D, Lellouch J. Explanatory and pragmatic attitudes in therapeutical trials. J Chronic Dis. 1967;20:637–48.
Godwin M, Ruhland L, Casson I, et al. Pragmatic controlled clinical trials in primary care: the struggle between external and internal validity. BMC Med Res Methodol. 2003;3:28.
Elridge S. Pragmatic trials in primary healthcare: what, when, and how? Fam Pract. 2010;27:591–2.
Mitka M. FDA advisory decision highlights some problems inherent in pragmatic trials. JAMA. 2011;306:1851–2.
Sugarman J, Califf RM. Ethics and regulatory complexities for pragmatic clinical trials. JAMA. 2014;311:2381–2.
Thorpe KE, Zwarenstein M, Oxman AD, Treweek S, Furberg CD, Altman DG, Tunis S, Bergel E, Harvey I, Magid DJ, Chalkidou K. A pragmatic-explanatory continuum indicator summary (PRECIS): a tool to help trial designers. J Clin Epidemiol. 2009;62:464–75.
Thorpe KE, Zwarenstein M, Oxman AD, Treweek S, Furberg CD, Altman DG, Tunis S, Bergel E, Harvey I, Magid DJ, Chalkidou K. A pragmatic-explanatory continuum indicator summary (PRECIS): a tool to help trial designers. CMAJ. 2009;180:E47–57.
Loudon K, Treweek S, Sullivan F, Donnan P, Thorpe KE, Zwarenstein M. The PRECIS-2 tool: designing trials that are fit for purpose. BMJ. 2015;350:h2147.
Loudon K, Zwarenstein M, Sullivan F, Donnan P, Treweek S. Making clinical trials more relevant: improving and validating the PRECIS tool for matching trial design decisions to trial purpose. Trials. 2013;14:115.
Patsopoulos NA. A pragmatic view of pragmatic trials. Dialogues in Neurosci. 2011;13(2):217–24.
Yoong S, Wolfenden L, Clinton-McHarg T, et al. Exploring the pragmatic and explanatory study design on outcomes of systematic reviews of public health interventions: a case study on obesity prevention trials. J Public Health (Oxf). 2014;36(1):170–6.
Koppenaal T, Linmans J, Knottnerus JA, Spigt M. Pragmatic vs. explanatory: an adaptation of the PRECIS tool helps to judge the applicability of systematic reviews for daily practice. J Clin Epidemiol. 2011;64(10):1095–101.
Tosh G, Soares-Weiser K, Adams CE. Pragmatic vs explanatory trials: the pragmascope tool to help measure differences in protocols of mental health randomized controlled trials. Dialogues Clin Neurosci. 2011;13(2):209–15.
Glasgow RE, Gaglio B, Bennett G, Jerome GJ, Yeh H, Sarwer DB, et al. Applying the PRECIS criteria to describe three effectiveness trials of weight loss in obese patients with comorbid conditions. Health Serv Res. 2012;47(3):1051–67.
Witt CM, Manheimer E, Hammerschlag R, et al. How well do randomized trials inform decision making: systematic review using comparative effectiveness research measures on acupuncture for back pain. PLoS One. 2012;7:e32399.
Jordan AE, Perlman DC, Smith DJ, Reed JR, Hagan H. Use of the PRECIS-II instrument to categorize reports along the efficacy-effectiveness spectrum in an hepatitis C virus care continuum systematic review and meta-analysis. J Clin Epidemiol. 2017; Epub ahead of print.
Louma KA, Leavitt IM, Marrs JC, Nederveld AL, Regensteiner JG, Dunn AL, et al. How can clinical practices pragmatically increase physical activity for patients with type 2 diabetes? A systematic review. Transl Behav Med. 2017;7(4):751–72.
Aves T, Allan KS, Lawson D, Nieuwlaat R, Beyene J, Mbuagbaw L. The role of pragmatism in explaining heterogeneity in meta-analyses of randomized trials: a protocol for a cross-sectional methodological review. BMJ Open. 2017;7(9):e017887.
Baker R, Jackson D. A new approach to outliers in meta-analysis. Health Care Manag Sci. 2008;11(2):121–31.
Beath KJ. A finite mixture method for outlier detection and robustness in meta-analysis. Res Synth Methods. 2014;5:285–93.
Lee KJ, Thompson SG. Flexible parametric models for random effects distributions. Stat Med. 2008;27:418–34.
Waters E, de Silva-Sanigorski A, Burford BJ, et al. Interventions for preventing obesity in children. Cochrane Database Syst Rev. 2011;Issue 12. Art. No.:CD001871. DOI: 10.1002
Landis JR, Koch G. An application of hierarchical kappa-type statistics in the assessment of majority agreement among multiple observers. Biometrics. 1977;33(1):363–74.
R Core Team. R: A language and environment for statistical computing. Vienna: R Foundation for Statistical Computing; 2013.
Maclure M. Explaining pragmatic trials to pragmatic policy-makers. CMAJ. 2009;180(10):1001–3.
This research is supported by the O’Brien Institute for Public Health and Hotchkiss Brain Institute at the University of Calgary.
Availability of data and materials
The data that support the findings of this study are available from the Cochrane Systematic Review. PRECIS-2 review ratings obtained from the published trials are obtainable from the authors on request.
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Sajobi, T.T., Li, G., Awosoga, O. et al. A comparison of meta-analytic methods for synthesizing evidence from explanatory and pragmatic trials. Syst Rev 7, 19 (2018). https://doi.org/10.1186/s13643-017-0668-3