
A comparison of meta-analytic methods for synthesizing evidence from explanatory and pragmatic trials



The pragmatic–explanatory continuum indicator summary version 2 (PRECIS-2) tool has recently been developed to classify randomized clinical trials (RCTs) as pragmatic or explanatory based on their design characteristics. Given that treatment effects in explanatory trials may be greater than those obtained in pragmatic trials, conventional meta-analytic approaches may not accurately account for the heterogeneity among the studies and may result in biased treatment effect estimates. This study investigates whether incorporating the PRECIS-2 classification of published trials can improve the estimation of overall intervention effects in meta-analysis.


Using data from 31 published trials of interventions aimed at reducing obesity in children, we evaluated the utility of incorporating PRECIS-2 ratings of published trials into meta-analyses of intervention effects in clinical trials. Specifically, we compared random-effects meta-analysis, stratified random-effects meta-analysis, random-effects meta-regression, and mixture random-effects meta-regression methods for estimating overall pooled intervention effects.


Our analyses revealed that mixture meta-regression models that incorporate PRECIS-2 classification as a covariate resulted in a larger pooled effect size (ES) estimate (ES = − 1.01, 95%CI = [− 1.52, − 0.43]) than conventional random-effects meta-analysis (ES = − 0.15, 95%CI = [− 0.23, − 0.08]).


In addition to its original purpose of aiding researchers in their choice of trial design, the PRECIS-2 tool is useful for explaining between-study variation in systematic reviews and meta-analyses of published trials. We recommend that researchers adopt mixture meta-regression methods when synthesizing evidence from explanatory and pragmatic trials.



Randomized controlled trials (RCTs) are cited as the highest level of evidence that can inform clinical and policy decisions about the efficacy and/or effectiveness of an intervention [1,2,3]. However, RCTs are generally costly, with many stringent inclusion and exclusion criteria that limit the generalizability of results and their relevance to routine clinical practice. Consequently, there is increased interest in designing RCTs that show real-world effectiveness of an intervention in broad patient populations [4,5,6,7]. Schwartz and Lellouch [8] proposed a distinction between explanatory trials, which confirm a physiological or clinical hypothesis, and pragmatic trials, which inform a clinical or policy decision by providing evidence for adoption of the intervention into real-world clinical practice. Since their seminal paper, several papers have investigated the strengths and limitations of pragmatic trials [4,5,6,7,8,9,10,11,12]. Thorpe et al. [13, 14] proposed the original PRECIS (pragmatic–explanatory continuum indicator summary) tool, which further clarified the concept and features of pragmatism and provided a scoring system and graphical representation of the pragmatic features of a trial. Loudon et al. [15, 16] later proposed a revision of the PRECIS, called PRECIS-2, a 9-item tool to assess the characteristics of a pragmatic design. Features of the PRECIS-2 tool include the recruitment of investigators and participants, the intervention and its delivery, follow-up, and the determination and analysis of outcomes. Many trials could be deemed pragmatic with regard to at least one of these dimensions, but few are truly pragmatic on all of them.

A number of studies have explored the use of PRECIS instruments when synthesizing evidence from published trials. For example, Patsopoulos [17] suggests that “systematic reviews and meta-analyses could incorporate a PRECIS score for synthesized trials and help the systematic mapping of the pragmatism in published research”. Yoong et al. [18] investigated the impact of pragmatic–explanatory study design characterization on the conclusions of systematic reviews of public health interventions in obesity trials. They observed no differences among the intervention effects across classifications of the synthesized studies based on PRECIS ratings. Koppenaal et al. [19] applied a modified version of PRECIS, called the PRECIS review tool, to judge the applicability of studies in two systematic reviews for daily clinical practice. Tosh et al. [20] proposed the pragmascope, an adapted version of the PRECIS tool that uses a 5-point scale to assess the degree of pragmatism when designing RCTs in mental health. Witt et al. [22] conducted a systematic analysis of trials of acupuncture for lower back pain with the intention of applying the PRECIS tool. Glasgow et al. [21] also used the PRECIS tool to describe the design features of three effectiveness trials investigating weight loss in obese patients with comorbid conditions. More recently, Jordan et al. [23] demonstrated the potential benefit of using the PRECIS-2 instrument for aiding systematic review and meta-analysis of studies in hepatitis C virus care. Louma et al. [24] used PRECIS-2 to identify interventions that effectively increased physical activity and glycemic control among patients with type 2 diabetes and to assess the potential of PRECIS-2 for implementing physical activity interventions in clinical practice settings.

While the uptake of PRECIS instruments (i.e., PRECIS and PRECIS-2) in systematic reviews is increasing, these instruments are mostly used descriptively, and their impact on explaining heterogeneity in meta-analytic investigations has not been investigated. Given the variations in study designs, differences in the characteristics of explanatory and pragmatic trials are likely to influence both statistical heterogeneity and intervention effect estimates in meta-analyses. Aves et al. [25] argue that “… if heterogeneity is substantial, due to the degree of pragmatism, it might not be appropriate to pool data from pragmatic and explanatory trials …”. Although modern meta-analytic methods such as mixture meta-regression and robust meta-analytic methods have been developed to pool evidence from heterogeneous populations [26,27,28], there has been limited application of these methods, and of PRECIS ratings, in synthesizing evidence from explanatory and pragmatic trials.

This study aimed to assess whether incorporating PRECIS classification could improve the modeling of heterogeneity among published trials in meta-analytic investigations. Using data from a Cochrane systematic review of 31 trials of community-based obesity interventions in children [29], we compared the performance of random-effects meta-analysis, stratified random-effects meta-analysis, random-effects meta-regression, and mixture random-effects meta-regression techniques that account for differences between explanatory and pragmatic trials when synthesizing evidence from published trials.


The pragmatic–explanatory continuum indicator summary (PRECIS-2)

PRECIS was developed by a group of international researchers and methodologists to assist trialists in distinguishing between pragmatic and explanatory trial designs [13, 14]. PRECIS requires trialists to indicate on a visual scale (in the shape of a wheel) where a trial falls along the pragmatic–explanatory continuum. More recently, a revision of the PRECIS tool, PRECIS-2, was developed [15]. It consists of nine domains: eligibility, recruitment, setting, organization, flexibility in intervention delivery, flexibility in adherence, follow-up, primary outcome, and primary analysis. Each domain is rated on a 5-point Likert scale from 1 (completely explanatory) to 5 (completely pragmatic) [16].

Systematic review of obesity prevention trials

Data were from the Cochrane systematic review of trials that investigated the efficacy or effectiveness of community-based obesity prevention interventions in children [29]. The systematic review included all RCTs published between 1990 and March 2010. Similar to the previous work by Yoong et al. [18], we used an adapted version of the PRECIS-2 tool to conduct an audit of all 31 trials of children aged 6–12 years included in the Cochrane review of obesity trials, in order to assess the pragmatic–explanatory design features of these studies.

Raters and rating procedures

Before rating the trials in this systematic review, three study co-authors (TTS, OA, MW) first read and discussed relevant papers on PRECIS [13,14,15,16] and piloted the PRECIS tool on 5 randomly selected published trials. The raters then independently rated each of the 31 trials on the 9 domains of the PRECIS-2 tool. Each domain was scored on a 5-point scale ranging from 1 (completely explanatory) to 5 (completely pragmatic), using the broad definitions provided by the tool developers [15, 16]. The authors then met to discuss variations in scoring and reached a consensus where there were discrepancies. For each investigator and each trial, an overall summary score was derived by averaging the ratings of the 9 items. Higher scores indicated a more pragmatic trial, while lower scores indicated a more explanatory trial. Since no cut-off scores were provided by the original authors, we applied a scoring rule for categorizing the trials: a trial was classified as explanatory if its average score was less than 3.0 and as pragmatic if its average PRECIS-2 score was at least 3.0.
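The averaging-and-threshold rule above can be sketched in a few lines. The code below is an illustrative Python sketch (the paper's analyses were conducted in R, and the trial ratings shown are hypothetical):

```python
def classify_trial(domain_scores):
    """Classify a trial as pragmatic or explanatory from its PRECIS-2 ratings.

    domain_scores: ratings on the PRECIS-2 domains (1 = completely
    explanatory, 5 = completely pragmatic). Domains with missing ratings
    may simply be omitted, in which case the mean is taken over the
    available scores, as done in the paper.
    """
    mean_score = sum(domain_scores) / len(domain_scores)
    label = "pragmatic" if mean_score >= 3.0 else "explanatory"
    return mean_score, label

# Hypothetical trial rated on all 9 domains:
score, label = classify_trial([4, 5, 3, 4, 4, 5, 3, 4, 5])
```

For a trial with mostly high domain ratings, as in the example, the mean exceeds 3.0 and the trial is labeled pragmatic.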

Statistical analyses

Descriptive statistics were used to summarize the average domain-specific and overall PRECIS-2 scores across the 31 studies included in this analysis. The Fleiss kappa statistic was used to assess inter-rater reliability of the domain-specific and overall ratings of each trial on the PRECIS-2 scale [30]. Four meta-analytic methods were used to assess changes in conclusions about the overall intervention effect on the body mass index of these children: (i) conventional random-effects meta-analysis; (ii) stratified random-effects meta-analysis, in which effect sizes from pragmatic trials and explanatory trials were pooled independently; (iii) random-effects meta-regression adjusting for PRECIS-2 rating (explanatory vs pragmatic); and (iv) mixture random-effects meta-regression adjusting for PRECIS-2 rating (pragmatic vs explanatory). For each model, we report the pooled effect size (ES), 95% confidence interval, between-study variance (τ2), and Bayesian information criterion. All analyses, including kappa estimation, were conducted using R software [31].
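Method (i), conventional random-effects meta-analysis, is commonly fitted with the DerSimonian–Laird moment estimator of τ2. The sketch below is an assumption on our part (the paper does not state which τ2 estimator its R analysis used), but it illustrates the pooling step:

```python
import math

def random_effects_pool(effects, variances):
    """DerSimonian-Laird random-effects meta-analysis.

    effects: per-study effect sizes (e.g., standardized mean differences)
    variances: per-study sampling variances
    Returns (pooled_effect, ci_low, ci_high, tau2).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]                      # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q and the method-of-moments between-study variance tau^2
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)
    # Re-weight each study with tau^2 added to its sampling variance
    w_re = [1.0 / (v + tau2) for v in variances]
    sw_re = sum(w_re)
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sw_re
    se = math.sqrt(1.0 / sw_re)
    return pooled, pooled - 1.96 * se, pooled + 1.96 * se, tau2
```

Stratified meta-analysis, method (ii), amounts to applying this same pooling separately within the pragmatic and the explanatory stratum.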


Of the 31 studies included in our analysis, 12 trials focused on physical activity interventions only, 5 focused on dietary interventions only, and 14 adopted a combination of physical activity and dietary interventions [29]. As reported in the Cochrane review [29], the standardized mean difference in body mass index between intervention and control groups across these 31 trials ranged between − 0.36 and 0.45 (Fig. 1). The inter-rater reliability among our independent raters, as measured by the Fleiss kappa, ranged between 0.41 and 0.86, indicating moderate to substantial agreement across the domains [30], with the flexibility (delivery) and follow-up domains showing the lowest agreement (κ = 0.41 and 0.48, respectively). The average overall ratings ranged between 2.44 and 4.56. Using a cut-off of 3.0, five studies were classified as explanatory while the remaining 26 studies were classified as pragmatic (see Table 1 and Fig. 2).
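The agreement figures above are Fleiss kappa values. A minimal Python sketch of the statistic (illustrative only; the paper's computations were done in R, and the rating data below are hypothetical):

```python
def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for agreement among a fixed number of raters.

    ratings: list of per-subject rating lists (one rating per rater),
             e.g. each trial's three raters' scores on one PRECIS-2 domain.
    categories: the possible rating values (e.g. 1..5).
    """
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    # Count table: how many raters assigned each category to each subject
    counts = [[row.count(c) for c in categories] for row in ratings]
    # Per-subject agreement P_i
    p_i = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
           for row in counts]
    p_bar = sum(p_i) / n_subjects
    # Chance agreement from the marginal category proportions
    p_j = [sum(row[j] for row in counts) / (n_subjects * n_raters)
           for j in range(len(categories))]
    p_e = sum(p * p for p in p_j)
    return (p_bar - p_e) / (1 - p_e)
```

Perfect agreement among raters yields κ = 1, while agreement no better than chance yields κ ≤ 0, matching the Landis and Koch interpretation scale cited above [30].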

Fig. 1 Forest plot of obesity trials in children aged 6–12

Table 1 PRECIS-2 ratings and characteristics of 31 published trials of interventions to reduce obesity in children aged 6–12 years
Fig. 2 Average PRECIS-2 wheel domain scores for pragmatic and explanatory trial classification of 31 obesity trials in children aged 6–12 years

Table 2 and Fig. 3 present the estimates of pooled intervention effect size based on random-effects meta-analytic methods for the 31 published trials. Conventional pooling of the intervention effects across all studies using random-effects meta-analysis suggests a statistically significant pooled ES of − 0.15 (95%CI = [− 0.23, − 0.08]). When we fitted stratified meta-analyses independently for pragmatic and explanatory trials, the meta-analysis of the explanatory trials revealed no significant pooled intervention effect (ES = − 0.32; 95%CI = [− 0.88, 0.33]), but there was a statistically significant pooled intervention effect for the pragmatic trials (ES = − 0.12; 95%CI = [− 0.19, − 0.06]). Nevertheless, the pooled intervention effect from the explanatory trials was about 2.5 times the pooled effect size obtained from the pragmatic trials. Meta-regression methods that adjusted for overall PRECIS-2 ratings revealed substantially larger pooled effect sizes and smaller τ2 than the conventional overall pooled effect size. Specifically, the random-effects meta-regression model that adjusted for PRECIS-2 rating revealed a statistically significant pooled effect size (ES = − 0.79, 95%CI = [− 1.26, − 0.31]) that is more than five times larger than the estimated pooled effect size obtained from the conventional meta-analysis model. The mixture random-effects meta-regression model that controlled for PRECIS rating (pragmatic vs explanatory) revealed an even larger and statistically significant pooled effect size (ES = − 1.05, 95%CI = [− 1.53, − 0.54]) (Fig. 3).
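Method (iii), random-effects meta-regression on a pragmatic-vs-explanatory indicator, can be sketched as weighted least squares with a method-of-moments estimate of residual τ2 (the standard extension of DerSimonian–Laird to meta-regression). This is an illustrative Python sketch, not the authors' R code, and the data passed in are hypothetical:

```python
import numpy as np

def meta_regression(effects, variances, pragmatic):
    """Random-effects meta-regression of effect size on a PRECIS-2
    indicator (1 = pragmatic, 0 = explanatory), using the method-of-moments
    estimator of tau^2 extended to meta-regression.

    Returns (coefficients, standard_errors, tau2); coefficients[1] is the
    pragmatic-vs-explanatory difference in pooled effect.
    """
    y = np.asarray(effects, dtype=float)
    v = np.asarray(variances, dtype=float)
    X = np.column_stack([np.ones_like(y), np.asarray(pragmatic, dtype=float)])
    k, p = X.shape
    # Fixed-effect weights and the weighted residual-projection matrix
    W = np.diag(1.0 / v)
    P = W - W @ X @ np.linalg.inv(X.T @ W @ X) @ X.T @ W
    q_res = float(y @ P @ y)           # residual heterogeneity statistic
    tau2 = max(0.0, (q_res - (k - p)) / float(np.trace(P)))
    # Refit with the between-study variance added to each sampling variance
    W_re = np.diag(1.0 / (v + tau2))
    cov = np.linalg.inv(X.T @ W_re @ X)
    beta = cov @ X.T @ W_re @ y
    se = np.sqrt(np.diag(cov))
    return beta, se, tau2
```

With this parameterization, the intercept is the pooled effect in explanatory trials and the slope is the shift associated with a pragmatic classification.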

Table 2 Comparison of meta-analytic methods for estimating overall intervention effect in trials to reduce obesity in children aged 6–12 years
Fig. 3 Comparison of meta-analytic methods for estimating overall intervention effect in trials for reducing obesity in children aged 6–12 years


The incorporation of the PRECIS-2 classification of trials as a covariate in meta-regression models results in larger estimates of pooled intervention effects than previously reported pooled effect sizes [19, 29]. Meta-regression methods, such as mixture regression models that control for PRECIS-2 rating as a covariate, are particularly advantageous in that they can account for heterogeneity among the studies by modeling the heterogeneity attributable to pragmatism with a mixture distribution. This finding supports previous research recognizing the importance of accounting for between-study heterogeneity attributable to the degree of pragmatism in systematic reviews of published trials [18, 27]. It highlights the utility of the PRECIS-2 tool for aiding the synthesis of evidence from published trials and the impact that information about design characteristics can have in explaining between-study heterogeneity in meta-analyses of published studies. We recommend that researchers use PRECIS-2 rating information not only descriptively in meta-analyses of published studies but also inferentially, for modeling heterogeneity when estimating pooled intervention effects.
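The mixture idea can be illustrated with a stripped-down EM algorithm for a two-component mixture of study effects. This sketch holds the within-class between-study variance at zero for brevity, whereas the finite-mixture meta-analysis method cited in the paper [27] also estimates that variance; the input data are hypothetical:

```python
import math

def mixture_pool(effects, variances, n_iter=200):
    """EM for a two-component mixture of pooled effects.

    Each study effect y_i is modeled as drawn from one of two latent
    classes (e.g. explanatory-like vs pragmatic-like) with class-specific
    pooled means, using the known sampling variances. For brevity the
    within-class between-study variance is held at zero; the full mixture
    random-effects model would estimate it as well.
    Returns (mu1, mu2, mixing_proportion_of_class_1).
    """
    y, v = list(effects), list(variances)
    mu1, mu2 = min(y), max(y)          # spread-out starting values
    pi1 = 0.5
    for _ in range(n_iter):
        # E-step: responsibility of class 1 for each study
        r = []
        for yi, vi in zip(y, v):
            d1 = math.exp(-0.5 * (yi - mu1) ** 2 / vi) / math.sqrt(vi)
            d2 = math.exp(-0.5 * (yi - mu2) ** 2 / vi) / math.sqrt(vi)
            r.append(pi1 * d1 / (pi1 * d1 + (1 - pi1) * d2))
        # M-step: inverse-variance weighted means within each class
        pi1 = sum(r) / len(r)
        mu1 = sum(ri * yi / vi for ri, yi, vi in zip(r, y, v)) / \
              sum(ri / vi for ri, vi in zip(r, v))
        mu2 = sum((1 - ri) * yi / vi for ri, yi, vi in zip(r, y, v)) / \
              sum((1 - ri) / vi for ri, vi in zip(r, v))
    return mu1, mu2, pi1
```

When the two latent classes are well separated, the responsibilities converge to near 0/1 assignments and each class mean reduces to an inverse-variance weighted pooled effect within that class.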

Also, we found that stratified meta-analytic methods, in which explanatory and pragmatic trials are defined with explicit criteria and independently synthesized, supported the notion that there was an overall significant effect of the community-based obesity intervention in children in pragmatic trials but not in explanatory trials. The estimated pooled effect size obtained from synthesis of the pragmatic trials was significantly smaller than that obtained from synthesis of the explanatory trials. This finding is in line with previous research showing that explanatory trials often report larger effect sizes than pragmatic trials [18]. One main advantage of the stratified meta-analysis methodology is that it can help researchers and policy makers understand the strength of evidence for the efficacy and/or effectiveness of an intervention in a population. It can also aid policy decision making about an intervention when the pooled effect sizes in pragmatic and explanatory trials point in the same direction. However, policy decision making based on this stratified approach may not always be straightforward, especially when the estimated pooled intervention effect sizes in explanatory and pragmatic trials are in opposite directions. A few studies have recommended that policy decision making should be based on evidence from pragmatic trials only, since they confirm the real-world effectiveness of an intervention [23, 24, 32]. But this recommendation may not be valid when a limited number of pragmatic trials is included in the systematic review, owing to low statistical power.

Our comparisons of the meta-analytic methods for synthesizing evidence from published trials rely on observed data only. Future research will explore the use of Monte Carlo methods to examine the statistical properties of these methods, including their statistical power, bias, mean square error, and coverage, under a variety of data analytic conditions. Importantly, the accuracy of the pooled intervention effects obtained from meta-regression analysis hinges on the accuracy of the PRECIS-2 ratings of the degree of pragmatism in each trial. While the ratings obtained from the three reviewers in our study had good overall inter-rater agreement, the flexibility (delivery) and follow-up domains of PRECIS-2 exhibited only moderate agreement. This is consistent with previous studies that report high variability or poor agreement on the flexibility and/or follow-up domains [18, 19, 23, 24]. Additionally, we had missing scores on some of the PRECIS domains for some studies because the original articles lacked this information. Although Yoong et al. [18] recommend that investigators endeavor to contact the primary authors of each published trial with incomplete information when using PRECIS-2, we did not contact the authors of the original articles but instead derived the mean domain and overall scores from all available ratings [18, 19]. Future research will use sensitivity analysis to assess the impact of missing data on estimates of pooled effect sizes from meta-analytic investigations. Moreover, while we analyzed the overall PRECIS-2 scores for these published trials, the component domains that constitute this overall score may have ratings that vary along the explanatory–pragmatic continuum. The PRECIS domain-specific information about these trials might provide policy makers with relevant information (e.g., about implementation of interventions).


This study shows that incorporating information about the type of trial (explanatory or pragmatic), assessed using the PRECIS-2 tool, can influence the estimation of pooled intervention effects in meta-analyses of published trials when there is substantial heterogeneity attributable to pragmatism. It also reveals the need for meta-regression methods that adjust for PRECIS information as a covariate when estimating pooled intervention effects, which helps ensure that valid conclusions are drawn from systematic reviews and meta-analyses of published trials. We recommend that meta-analytic investigations in systematic reviews incorporate information about design characteristics using PRECIS-2 when synthesizing evidence from published studies.



CI: 95% confidence interval

ES: Effect size

PRECIS-2: Pragmatic–explanatory continuum indicator summary version 2

RCT: Randomized controlled trial

SD: Standard deviation


  1. Barton S. Which clinical studies provide the best evidence: the best RCT still trumps the best observational study. BMJ. 2000;321(7256):255–6.
  2. Akobeng AK. Understanding randomized controlled trials. Arch Dis Child. 2005;90:840–4.
  3. Evans D. Hierarchy of evidence: a framework for ranking evidence evaluating healthcare interventions. J Clin Nurs. 2003;12:77–84.
  4. Rothwell PM. External validity of randomized controlled trials: “to whom do the results of this trial apply?”. Lancet. 2005;365:82–93.
  5. Treweek S, Zwarenstein M. Making trials matter: pragmatic and explanatory trials and the problem of applicability. Trials. 2009;10:37.
  6. Ware JH, Hamel MB. Pragmatic trials—guides to better patient care? N Engl J Med. 2011;364:1685–7.
  7. Chalkidou K, Tunis S, Whicher D, Fowler R, Zwarenstein M. The role of pragmatic randomized controlled trials (pRCTs) in comparative effectiveness research. Clin Trials. 2012;9:436.
  8. Schwartz D, Lellouch J. Explanatory and pragmatic attitudes in therapeutical trials. J Chronic Dis. 1967;20:637–48.
  9. Godwin M, Ruhland L, Casson I, et al. Pragmatic controlled clinical trials in primary care: the struggle between external and internal validity. BMC Med Res Methodol. 2003;3:28.
  10. Eldridge S. Pragmatic trials in primary healthcare: what, when, and how? Fam Pract. 2010;27:591–2.
  11. Mitka M. FDA advisory decision highlights some problems inherent in pragmatic trials. JAMA. 2011;306:1851–2.
  12. Sugarman J, Califf RM. Ethics and regulatory complexities for pragmatic clinical trials. JAMA. 2014;311:2381–2.
  13. Thorpe KE, Zwarenstein M, Oxman AD, Treweek S, Furberg CD, Altman DG, Tunis S, Bergel E, Harvey I, Magid DJ, Chalkidou K. A pragmatic-explanatory continuum indicator summary (PRECIS): a tool to help trial designers. J Clin Epidemiol. 2009;62:464–75.
  14. Thorpe KE, Zwarenstein M, Oxman AD, Treweek S, Furberg CD, Altman DG, Tunis S, Bergel E, Harvey I, Magid DJ, Chalkidou K. A pragmatic-explanatory continuum indicator summary (PRECIS): a tool to help trial designers. CMAJ. 2009;180:E47–57.
  15. Loudon K, Treweek S, Sullivan F, Donnan P, Thorpe KE, Zwarenstein M. The PRECIS-2 tool: designing trials that are fit for purpose. BMJ. 2015;350:h2147.
  16. Loudon K, Zwarenstein M, Sullivan F, Donnan P, Treweek S. Making clinical trials more relevant: improving and validating the PRECIS tool for matching trial design decisions to trial purpose. Trials. 2013;14:115.
  17. Patsopoulos NA. A pragmatic view of pragmatic trials. Dialogues Clin Neurosci. 2011;13(2):217–24.
  18. Yoong S, Wolfenden L, Clinton-McHarg T, et al. Exploring the pragmatic and explanatory study design on outcomes of systematic reviews of public health interventions: a case study on obesity prevention trials. J Public Health (Oxf). 2014;36(1):170–6.
  19. Koppenaal T, Linmans J, Knottnerus JA, Spigt M. Pragmatic vs. explanatory: an adaptation of the PRECIS tool helps to judge the applicability of systematic reviews for daily practice. J Clin Epidemiol. 2011;64(10):1095–101.
  20. Tosh G, Soares-Weiser K, Adams CE. Pragmatic vs explanatory trials: the pragmascope tool to help measure differences in protocols of mental health randomized controlled trials. Dialogues Clin Neurosci. 2011;13(2):209–15.
  21. Glasgow RE, Gaglio B, Bennett G, Jerome GJ, Yeh H, Sarwer DB, et al. Applying the PRECIS criteria to describe three effectiveness trials of weight loss in obese patients with comorbid conditions. Health Serv Res. 2012;47(3):1051–67.
  22. Witt CM, Manheimer E, Hammerschlag R, et al. How well do randomized trials inform decision making: systematic review using comparative effectiveness research measures on acupuncture for back pain. PLoS One. 2012;7:e32399.
  23. Jordan AE, Perlman DC, Smith DJ, Reed JR, Hagan H. Use of the PRECIS-II instrument to categorize reports along the efficacy-effectiveness spectrum in an hepatitis C virus care continuum systematic review and meta-analysis. J Clin Epidemiol. 2017; Epub ahead of print.
  24. Louma KA, Leavitt IM, Marrs JC, Nederveld AL, Regensteiner JG, Dunn AL, et al. How can clinical practices pragmatically increase physical activity for patients with type 2 diabetes? A systematic review. Transl Behav Med. 2017;7(4):751–72.
  25. Aves T, Allan KS, Lawson D, Nieuwlaat R, Beyene J, Mbuagbaw L. The role of pragmatism in explaining heterogeneity in meta-analyses of randomized trials: a protocol for a cross-sectional methodological review. BMJ Open. 2017;7(9):e017887.
  26. Baker R, Jackson D. A new approach to outliers in meta-analysis. Health Care Manag Sci. 2008;11(2):121–31.
  27. Beath KJ. A finite mixture method for outlier detection and robustness in meta-analysis. Res Synth Methods. 2014;5:285–93.
  28. Lee KJ, Thompson SG. Flexible parametric models for random effects distributions. Stat Med. 2008;27:418–34.
  29. Waters E, de Silva-Sanigorski A, Burford BJ, et al. Interventions for preventing obesity in children. Cochrane Database Syst Rev. 2011;(12):CD001871.
  30. Landis JR, Koch G. An application of hierarchical kappa-type statistics in the assessment of majority agreement among multiple observers. Biometrics. 1977;33(1):363–74.
  31. R Core Team. R: A language and environment for statistical computing. Vienna: R Foundation for Statistical Computing; 2013.
  32. Maclure M. Explaining pragmatic trials to pragmatic policy-makers. CMAJ. 2009;180(10):1001–3.



This research is supported by the O’Brien Institute for Public Health and Hotchkiss Brain Institute at the University of Calgary.

Availability of data and materials

The data that support the findings of this study are available from the Cochrane systematic review. The PRECIS-2 ratings of the published trials are available from the authors on request.

Author information

TTS and LT conceptualized and designed this study. OA, MW, and TTS participated in the review and ratings of the published trials. TTS and OA performed all statistical analyses. All authors read, revised, and approved the final manuscript.

Corresponding author

Correspondence to Tolulope T. Sajobi.

Ethics declarations

Ethics approval and consent to participate

Not applicable

Consent for publication

Not applicable

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.


About this article


Cite this article

Sajobi, T.T., Li, G., Awosoga, O. et al. A comparison of meta-analytic methods for synthesizing evidence from explanatory and pragmatic trials. Syst Rev 7, 19 (2018).
