Many continuous variables should be analyzed using the relative scale: a case study of β_{2}-agonists for preventing exercise-induced bronchoconstriction
Systematic Reviews volume 8, Article number: 282 (2019)
Abstract
Background
The relative scale adjusts for baseline variability and therefore may lead to findings that can be generalized more widely. It is routinely used for the analysis of binary outcomes but only rarely for continuous outcomes. Our objective was to compare relative vs absolute scale pooled outcomes using data from a recently published Cochrane systematic review that reported only absolute effects of inhaled β_{2}-agonists on exercise-induced decline in forced expiratory volume in 1 s (FEV_{1}).
Methods
From the Cochrane review, we selected placebo-controlled crossover studies that reported individual participant data (IPD). Reversal in FEV_{1} decline after exercise was modeled either as a mean uniform percentage point (pp) change (absolute effect), using intercept-only linear mixed-effects models, or as an average percent change (relative effect), using slope-only linear mixed-effects models. We also calculated the pooled relative effect estimates using standard random-effects, inverse-variance-weighted meta-analysis of study-level mean effects.
Results
Fourteen studies with 187 participants were identified for the IPD analysis. On the absolute scale, β_{2}-agonists decreased the exercise-induced FEV_{1} decline by 28 pp, and on the relative scale, they decreased the FEV_{1} decline by 90%. The fit of the statistical model was significantly better with the relative 90% estimate than with the absolute 28 pp estimate. Furthermore, the median residuals were substantially smaller in the relative effect model than in the absolute effect model (5.8 vs. 10.8 pp). Using standard study-level meta-analysis of the same 14 studies, β_{2}-agonists reduced exercise-induced FEV_{1} decline on the relative scale by a similar amount: 83% or 90%, depending on the method of calculating the relative effect.
Conclusions
Compared with the absolute scale, the relative scale captures more effectively the variation in the effects of β_{2}-agonists on exercise-induced FEV_{1} declines. The absolute scale has been used in the analysis of FEV_{1} changes and may have led to suboptimal statistical analysis in some cases. The choice between the absolute and relative scale should be determined based on biological reasoning and empirical testing to identify the scale that leads to lower heterogeneity.
Background
The relative scale has been used for decades for estimating the effects on binary outcomes, such as calculating that heavy alcohol consumption increases the occurrence of liver cirrhosis by the rate ratio (RR) of over 10 [1]. It is also standard for survival analysis, using hazard ratios, and for comparisons of incidence rates, using incidence rate ratios. Metaanalyses have shown that the relative scale leads to less heterogeneity in the analysis of binary outcomes compared with the absolute scale (i.e. rate differences), which indicates that the relative scale better captures the biological effects [2]. In contrast, the relative scale has rarely been used in the metaanalysis of continuous outcomes and it is not available as an option in popular metaanalysis software such as the RevMan program of the Cochrane collaboration [3]. Instead, metaanalyses of continuous outcomes typically use the absolute scale, i.e., the original measurement units (mean difference, MD), or the standardized mean difference (SMD) scale in which the mean difference is expressed in the pooled standard deviation units. Both of these approaches (MD and SMD) are available as options in popular metaanalysis software [3].
The selection of scale for continuous outcomes is relevant in the analysis of a single trial and in the metaanalysis of several trials. In a single trial, the scale influences the interpretation of the findings and the communication between researchers, clinicians and patients [4]. In the case of a metaanalysis, the scale additionally influences the comparability of the trials, namely, the relative scale adjusts for the baseline variability in continuous outcomes in the same sense as the pooled RR adjusts for the baseline variability in risk between different studies in the analysis of binary outcomes. In metaanalyses that pooled diverse research topics of continuous outcomes, heterogeneity was less on the relative scale, than on the absolute scale [5–7]. This suggests that the relative scale may better capture also many biological effects that are measured using continuous outcomes. As one illustration, the relative scale was demonstrated to be more informative in the analysis of disease duration compared with using the MD scale [8–10].
The current study was motivated by the Cochrane review by Bonini et al., which examined the effects of β_{2}-agonists on exercise-induced bronchoconstriction (EIB) [11]. The usual limit for classifying a person as having EIB is a ≥ 10% decline in forced expiratory volume in 1 s (FEV_{1}) in a standardized exercise test [12]. Based on 72 comparisons from 44 studies, Bonini et al. calculated that β_{2}-agonists reduced the exercise-induced FEV_{1} decline by 17.67 percentage points (pp) (95% CI: 15.84 to 19.51 pp) [11]. However, one person may suffer an 11% decline in FEV_{1} from exercise and another person an 80% decline, yet both are classified as cases of EIB. The Cochrane review implies that the expected 17.67 pp reduction in exercise-induced FEV_{1} decline applies to both persons. It seems likely, however, that the effect of a β_{2}-agonist is much smaller than the overall mean of 17.67 pp for the former person and may be much greater than that mean for the latter.
The β_{2}agonists were invented in the middle of the 1900s and their efficacy against EIB was demonstrated in numerous clinical trials starting from the 1970s [12–16]. Thus, it is not relevant to ask the null hypothesis type of question whether β_{2}agonists differ from placebo in their influences on EIB. Instead, the important question is to estimate the average size of the effect and the variation in effect size between individuals.
The goal of this study was to compare the usefulness of the relative and the absolute scales in the estimation of the effects of β_{2}agonists on exerciseinduced FEV_{1} decline. If the relative scale better captures the effects of interventions on FEV_{1} changes, then the metaanalyses that have used an absolute scale such as MD for analyzing the effects on FEV_{1} changes [11] may have led to suboptimal estimates.
Methods
Selection of the β_{2}agonist trials on EIB
No new literature search was done for this analysis, since Bonini et al. [11] recently carried out a thorough search of the literature on controlled trials of β_{2}-agonists for EIB.
For the individual participant data (IPD) analysis, we systematically reviewed all the studies included and excluded by Bonini et al. [11], together with their reference lists, and included all placebo-controlled randomized crossover trials of inhaled β_{2}-agonists that reported IPD, which gave 13 trials [17–29]. Bonini et al. excluded trials for a few reasons, one being “no clear diagnosis of exercise-induced bronchoconstriction”. We did not exclude such trials for the following reasons. A clear dichotomous definition of EIB, such as a ≥ 10% or a ≥ 15% FEV_{1} decline in an exercise test [12], is relevant in certain contexts such as top-level athletics; however, such a cut-off level is arbitrary and has no biological basis, and dichotomization of continuous variables decreases statistical power. Moreover, if participants with small FEV_{1} declines are included in the analysis, the range of FEV_{1} declines becomes wider and the comparison of the absolute scale (intercept) and the relative scale (slope) becomes statistically more powerful. One trial that Bonini et al. excluded on the basis that there was no clear diagnosis of EIB reported IPD for exercise-induced FEV_{1} decline and was included in our IPD analysis [22]. Another trial with IPD, which had not been identified by Bonini et al. [11], was found through perusal of the reference lists of the included RCTs and was included in our analysis [30]. Thus, a total of 14 trials reporting IPD suitable for this analysis were identified (Table 1).
For the studylevel linear mixedeffect models, we included all the 44 crossover trials that were included in the Analysis 1.1 of Bonini [11]. The characteristics of the 44 trials were described previously [11], and are not summarized in this report. For the standard studylevel metaanalyses, we included the 14 trials that reported IPD.
Extraction of data
When several β_{2}agonists were investigated in the same report, we selected salbutamol if that was used; and if not, salmeterol, in an attempt to decrease the heterogeneity of the comparisons. When exercise tests were repeated several times after the administration of a β_{2}agonist, we selected the shortest delay between the β_{2}agonist administration and the exercise test. In some trials, β_{2}agonists were administered for several days or weeks, and we selected the shortest administration before the exercise test. There have been discussions about whether the most appropriate baseline in an EIB study is before drug administration (predrug) or after drug administration (postdrug) [31, 32]. In cases when both levels were available, we selected the predrug level as the baseline.
For the IPD analysis, individual exerciseinduced FEV_{1} declines were extracted from the 14 trial reports. Two studies reported IPD results as figures and the FEV_{1} declines were measured from them [25, 26]. For the studylevel mixedeffect model analysis of the 44 trials, we extracted the numerical values for FEV_{1} declines from the reports, or measured the mean FEV_{1} declines from published figures when numerical data were not published. See Additional file 1: Table S1 and Table S2 for description of the details in the selections of the IPD and study means, and see Additional file 2 for the extracted data. Some inaccuracies and errors in data extraction in Bonini et al. [11] were identified and corrected if required, see Additional file 1: Table S4.
Statistical methods
The absolute effect of a β_{2}agonist for a single participant was measured as the percentage point (pp) difference in the maximal exerciseinduced FEV_{1} percent decline after β_{2}agonist treatment minus the maximal exerciseinduced FEV_{1} percent decline after placebo treatment, see Fig. 1 as an illustration.
At the individual level, the relative effect was measured as the percent of the exerciseinduced FEV_{1} percent decline prevented by β_{2}agonist treatment. It is calculated as the absolute effect divided by the maximal exerciseinduced FEV_{1} percent decline after placebo (Fig. 1).
The usefulness of the absolute scale (intercept) and the relative scale (slope) in explaining the variation in β_{2}-agonist effects on FEV_{1} decline after exercise was analyzed with linear mixed-effects models. The type of β_{2}-agonist and the identity of the trial were used as clustering variables for the participants. The lmer function of the lme4 package of the statistical software R was used for the mixed-effects modeling [33, 34].
First, the intercept, corresponding to the mean effect of a β_{2}-agonist on the absolute scale, was included in the statistical model of the IPD, see formula (1) below. Thereafter, the slope was added, which explains the variation in the β_{2}-agonist absolute effects on FEV_{1} decline by the variation in FEV_{1} declines after placebo administration (i.e., untreated FEV_{1} decline), formula (2) below. Finally, the intercept was removed, so that the remaining slope alone describes the mean effect of β_{2}-agonists on the relative scale, formula (3) below. The models were compared with the anova test and the Akaike Information Criterion (AIC). For the printouts of the calculations, see Additional file 1.
Definitions (see Fig. 1):
X = FEV_{1} decline in the placebo test,
Y = absolute difference in FEV_{1} declines in the β_{2}agonist and placebo tests
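Formulas (1) to (3) referred to above are not reproduced in this text. From the description of the three models and the definitions of X and Y, their fixed-effect parts can be sketched as follows. The symbols a, b and ε are notational assumptions, and the random-effect terms for trial and type of β_{2}-agonist are omitted, so this is an outline rather than the authors' exact specification.

```latex
\begin{align}
Y &= a + \varepsilon        \tag{1} \\
Y &= a + bX + \varepsilon   \tag{2} \\
Y &= bX + \varepsilon       \tag{3}
\end{align}
```

In (1), a is the mean effect on the absolute scale in percentage points. In (3), b is the proportion of the placebo-test FEV_{1} decline that is reversed by the β_{2}-agonist, i.e., the mean effect on the relative scale.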
The median and interquartile levels for the relative effects of β_{2}-agonists were calculated with the rq function of the quantreg package in R, without adding an intercept [33, 35].
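To illustrate what a median regression without an intercept estimates, the following sketch finds the slope b that minimizes the sum of absolute residuals |Y − bX|. For a single covariate and no intercept, this minimizer is a weighted median of the ratios Y/X with weights |X|. This is not the algorithm used by rq, only an equivalent objective for the median case; the function name and the data are hypothetical.

```python
def median_slope_through_origin(x, y):
    """Slope b minimizing sum(|y_i - b * x_i|), i.e. median regression with no intercept.

    Uses the identity |y - b*x| = |x| * |y/x - b|, so the minimizer is the
    weighted median of the ratios y/x with weights |x|. Points with x == 0
    do not depend on b and are skipped.
    """
    pairs = sorted((yi / xi, abs(xi)) for xi, yi in zip(x, y) if xi != 0)
    if not pairs:
        raise ValueError("at least one point with non-zero x is required")
    half_weight = sum(w for _, w in pairs) / 2.0
    cumulative = 0.0
    for ratio, weight in pairs:
        cumulative += weight
        if cumulative >= half_weight:
            return ratio  # lower weighted median


# Hypothetical data: X = FEV1 change after placebo (negative = decline),
# Y = difference between the agonist test and the placebo test (pp).
x_demo = [-10.0, -20.0, -40.0]
y_demo = [9.0, 16.0, 40.0]
slope_demo = median_slope_through_origin(x_demo, y_demo)
```

With these made-up values, the participant with the largest placebo-test decline carries the most weight, which is why the slope is pulled toward that participant's ratio.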
The study-level mixed-effects models were fitted with the lmer function, with the type of β_{2}-agonist as the clustering variable for studies and the number of participants as the weight for the study means. As in the analysis of the IPD, first the intercept was included, then the slope was added, and finally the intercept was removed.
Standard metaanalysis comparing the relative and absolute scales was performed using the generic inverse variance and random effects options of the RevMan program [3]. The metaanalyses were restricted to the 14 trials with the IPD, since the absolute mean effect on the FEV_{1} decline and the standard error (SE) for the difference could be calculated accurately from the individual paired differences from the IPD.
For the standard meta-analysis, the relative effects of individual trials were calculated in two ways, see Additional file 1: Table S3. First, the absolute mean effect and its SE (see above) were divided by the exercise-induced FEV_{1} decline after the placebo. If there was no absolute mean effect, this leads to a 0% relative effect; and if all FEV_{1} decline was fully prevented, this leads to a 100% effect. Second, the relative effect and its SE were derived from the slopes of the linear regression models of the 14 trials, see Additional file 1. We used the χ^{2} test and the I^{2} statistic to assess statistical heterogeneity among the trials in each meta-analysis [36]. A value of I^{2} greater than about 70% indicates a high level of heterogeneity.
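The pooling and heterogeneity statistics described above can be sketched as follows. The source does not state which between-study variance estimator was used, so the DerSimonian-Laird estimator is assumed here; the function name and the input values are hypothetical and are not data from the trials.

```python
import math


def random_effects_pool(effects, ses):
    """Inverse-variance random-effects pooling (DerSimonian-Laird assumed).

    Returns (pooled estimate, SE of pooled estimate, Cochran's Q, I^2 in percent).
    """
    weights = [1.0 / se ** 2 for se in ses]
    sum_w = sum(weights)
    fixed = sum(w * e for w, e in zip(weights, effects)) / sum_w
    q = sum(w * (e - fixed) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-study variance
    re_weights = [1.0 / (se ** 2 + tau2) for se in ses]
    pooled = sum(w * e for w, e in zip(re_weights, effects)) / sum(re_weights)
    se_pooled = math.sqrt(1.0 / sum(re_weights))
    i2 = max(0.0, 100.0 * (q - df) / q) if q > 0 else 0.0
    return pooled, se_pooled, q, i2


# Hypothetical study-level relative effects (%) and their standard errors.
pooled_demo, se_demo, q_demo, i2_demo = random_effects_pool([0.0, 10.0], [1.0, 1.0])
```

The Q statistic returned here is the quantity referred to a χ^{2} distribution with k − 1 degrees of freedom in the heterogeneity test.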
To estimate the potential role of the regression to the mean phenomenon as a cause of the slope between the effect of β_{2}-agonists and the placebo-test FEV_{1} decline in Fig. 2, we used three approaches. The slope generated by regression to the mean depends on the within-subject SD of the placebo-test FEV_{1} decline and the β_{2}-agonist-test FEV_{1} decline, and on the between-subject SD of the placebo-test FEV_{1} decline [37]. Thus, regression to the mean is independent of the size of the treatment effect. Therefore, we first estimated the slope generated by regression to the mean by comparing two different placebo-test FEV_{1} declines of 45 participants from four studies [20, 23, 24, 30]. Second, we used the Blomqvist formula to calculate the corrected slope [37]. Third, we calculated from formula (1) in the paper by Hayes [37] the magnitude of within-subject SD that would be needed to generate the observed slope by the regression to the mean phenomenon.
Two-tailed P-values were used.
Results
Analyses of the IPD
Fourteen placebocontrolled crossover comparisons were identified that reported the IPD of the effects of β_{2}agonists on exerciseinduced FEV_{1} decline (Table 1). In all they included 187 participants. Six trials with 89 participants used salbutamol (albuterol) [17–22], 3 trials with 36 participants used salmeterol [23–25], 2 trials with 24 participants terbutaline [26, 27], 1 trial with 16 participants reproterol [28], 1 trial with 12 participants bitolterol [29] and 1 trial with 10 participants used metaproterenol (orciprenaline) [30].
The included trials administered placebo and β_{2}agonist to the same participants in a crossover design. The exercise challenge was carried out thereafter and the postexercise decline in FEV_{1} was measured. Figure 1 demonstrates the calculation of the absolute effect of β_{2}agonist for participant X, who is also indicated in Fig. 2.
The level of exerciseinduced FEV_{1} decline in the placebo test is shown on the horizontal axis of Fig. 2 for each individual participant in the 14 IPD studies. After placebo, the FEV_{1} changes caused by exercise ranged from an 82% decrease to a 2% increase, with the median of a 31% decrease. The absolute effect of the β_{2}agonist for each individual was calculated as the percentage point (pp) difference in the FEV_{1} decline after the β_{2}agonist and after the placebo. For example, participant X on the lefthand side of Fig. 2 had an exerciseinduced FEV_{1} decline of 70% after placebo, and a decline of 29% after salbutamol, which indicates a 41 percentage point (pp) improvement (based on 70–29), as the effect of salbutamol on the absolute scale, see also Fig. 1. On the relative scale, the FEV_{1} decline of the same person was reduced by 58%, based on the ratio of 41/70.
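The arithmetic for participant X can be written out as a short sketch. The function names are hypothetical, and declines are entered here as positive sizes in percent, which is a convention chosen for this example only. With the rounded inputs of 70 and 29 the ratio comes out at about 58.6%; the small difference from the 58% quoted above may reflect rounding of the published inputs, which is an assumption.

```python
def absolute_effect_pp(decline_placebo, decline_agonist):
    """Reduction in exercise-induced FEV1 decline, in percentage points."""
    return decline_placebo - decline_agonist


def relative_effect_percent(decline_placebo, decline_agonist):
    """Percent of the placebo-test FEV1 decline prevented by the agonist."""
    if decline_placebo == 0:
        raise ValueError("relative effect is undefined when the placebo-test decline is zero")
    return 100.0 * absolute_effect_pp(decline_placebo, decline_agonist) / decline_placebo


# Participant X: 70% decline after placebo, 29% decline after salbutamol.
abs_x = absolute_effect_pp(70.0, 29.0)
rel_x = relative_effect_percent(70.0, 29.0)

# Hypothetical participant whose FEV1 rose by 2% after exercise in the
# agonist test (decline of -2): the relative effect exceeds 100%.
rel_over_100 = relative_effect_percent(20.0, -2.0)
```

The second call shows how an effect above 100% arises when the post-exercise FEV_{1} in the β_{2}-agonist test is higher than the pre-exercise level.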
On the relative scale, the 0% effect indicates that the β_{2}agonist has no effect, i.e., the FEV_{1} decline after β_{2}agonist is identical to the FEV_{1} decline after placebo. The 100% effect indicates full protection so that exercise after β_{2}agonist causes no decline in FEV_{1}, i.e., the FEV_{1} decline occurring after placebo is fully reversed by the β_{2}agonist. These two limits are shown in Fig. 2 by the dashed lines. Ten participants showed β_{2}agonist effects below 0% which means that exerciseinduced FEV_{1} decline in the β_{2}agonist test was greater than in the placebo test. Probably this is explained by random variation. Sixtyfour participants showed β_{2}agonist effects over 100% which means that FEV_{1} level after exercise in the β_{2}agonist test was greater than the FEV_{1} level before exercise. In addition to random variation, this finding is also explained by our usage of the predrug level as the baseline. For many participants β_{2}agonist increased preexercise FEV_{1} level and if exerciseinduced FEV_{1} decline is simultaneously prevented, this would lead to effects above 100% in the calculation of the relative effects.
The distribution of the data points in Fig. 2 suggests that the absolute effect of β_{2}agonist on FEV_{1} decline appears to be greater in study participants with larger baseline exerciseinduced FEV_{1} declines after placebo on the lefthand side of the graph. This is tested explicitly below by initially fitting the data using only a single intercept which is equivalent to describing the effect of β_{2}agonist on exerciseinduced FEV_{1} decline as a single average percentage point improvement akin to an absolute scale approach used by Bonini et al. [11]. Thereafter we fit the data using a slope to derive an average proportion of exerciseinduced FEV_{1} decline that is reversed by β_{2}agonist treatment which accounts for differing baseline exerciseinduced FEV_{1} declines akin to a relative scale approach.
The usefulness of the absolute scale (intercept) and the relative scale (slope) were compared with linear mixedeffects models shown in Table 2. When only the intercept was included, the β_{2}agonists reduced the FEV_{1} decline by an average of 28 pp. This intercept indicates the mean effect on the absolute scale. When the slope was added, the fit of the model was improved substantially. Furthermore, addition of the slope caused the estimate of the intercept to decrease substantially. When the intercept was removed from the model, the change in the fit of the model was much smaller compared with the addition of the slope. In addition, the lower AIC value also indicates that the model with the slope alone is better than the model with the intercept alone (Table 2). The slope of − 0.90 indicates that β_{2}agonist treatment reduced the exerciseinduced FEV_{1} decline by 90% on the relative scale. The absolute effect (intercept alone) and the relative effect (slope alone) obtained from the mixedeffects models are shown in Fig. 2 as continuous solid lines.
A further measure to compare the relative scale (slope) and the absolute scale (intercept) was the magnitude of the residuals of the models. For the absolute effect of the uniform 28 pp. decrease in FEV_{1} decline, the median residual was 10.8 pp. For the relative effect of the 90% decrease in FEV_{1} decline, the median residual was just 5.8 pp. This also illustrates that the relative effect (slope) captures much better the individuallevel variation in the effects of β_{2}agonists than the absolute effect (intercept).
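The comparison of an intercept-only and a slope-only model by AIC and by median absolute residual can be illustrated with a simplified sketch. It uses ordinary least squares on made-up data rather than the mixed-effects models fitted with lmer, so the random effects for trial and type of β_{2}-agonist are ignored, and the AIC is computed only up to an additive constant. All names and numbers are hypothetical.

```python
import math
from statistics import mean, median


def summarize_fit(x, y, intercept, slope, n_fixed):
    """Return (AIC up to a constant, median absolute residual) for y = intercept + slope * x."""
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    rss = sum(r * r for r in residuals)
    n = len(y)
    # Gaussian AIC without constants; +1 parameter for the residual variance.
    aic = n * math.log(rss / n) + 2 * (n_fixed + 1)
    return aic, median(abs(r) for r in residuals)


# Hypothetical data: X = FEV1 change after placebo (negative = decline),
# Y = agonist-test change minus placebo-test change (pp).
x_demo = [-5.0, -12.0, -20.0, -31.0, -45.0, -60.0, -82.0]
y_demo = [6.0, 10.0, 19.0, 27.0, 42.0, 52.0, 75.0]

# Model (1): intercept only, a uniform absolute effect.
a_only = mean(y_demo)
aic_intercept, mar_intercept = summarize_fit(x_demo, y_demo, a_only, 0.0, 1)

# Model (3): slope only, a uniform relative effect (least squares through the origin).
b_only = sum(xi * yi for xi, yi in zip(x_demo, y_demo)) / sum(xi * xi for xi in x_demo)
aic_slope, mar_slope = summarize_fit(x_demo, y_demo, 0.0, b_only, 1)
```

In this constructed example the slope-only model has both the lower AIC and the smaller median absolute residual, which mirrors the pattern reported for the IPD but says nothing about the size of the real difference.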
Variation in the effects of the β_{2}agonists was also analyzed by stratifying participants to categories by the postexercise FEV_{1} declines after placebo administration. Over the 5 categories shown in Table 3, there is a 3fold variation in the absolute mean effect of the β_{2}agonists between the extremes ranging from 15.2 pp. in the category with the lowest FEV_{1} decline after placebo administration to 44.3 pp. in the category with the highest FEV_{1} decline after placebo administration. The confidence intervals of the first, fourth and fifth groups are inconsistent with the overall mean absolute effect of 28 pp. decrease in FEV_{1} decline (Fig. 2). However, the confidence intervals of the relative effects of all the 5 categories are overlapping and consistent with the confidence interval of the overall 90% mean effect calculated from the slope of the linear regression model (Fig. 2, Table 2).
The distribution of the individuallevel relative effect of β_{2}agonists was skewed with skewness of − 1.05 (Fig. 3). Therefore, the median effect might be a more informative descriptive measure of typical effect than the mean. The median relative effect over the 187 participants was an 88% reduction in FEV_{1} decline, with the interquartile range from 60% to 103%. Although the median is close to the mean estimate (90%), the asymmetry is apparent in the interquartile range.
Estimation of the possible role of the regression to the mean
Regression to the mean is a potential confounder in the analysis of change by baseline values [37]. We used three approaches to evaluate the possible bias in the slope in Fig. 2 caused by regression to the mean.
First, four studies with a total of 45 participants carried out two separate placebo-exercise tests, which can be used to estimate the size of the slope caused by regression to the mean when there is no treatment effect. A slope of −0.153 was observed, which is substantially smaller than the slope of −0.691 for the model with slope and intercept (Table 2).
Second, the within-subject SD for the placebo-test FEV_{1} decline from the four studies was 6.23 pp and the observed between-subject SD for all 187 participants was 18.9 pp. The Blomqvist formula can be used to estimate the true slope from these SD values [37], and the estimated true slope in Fig. 2 is −0.653, which is minimally different from our calculated slope of −0.691 (95% CI: −0.477 to −0.910) for the model with slope and intercept (Table 2). In contrast, applying the Blomqvist formula to the slope of −0.153 of the placebo-placebo comparison of the previous paragraph, the true slope becomes −0.049, which is very close to the null slope, as expected.
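The correction can be checked numerically. The exact expression in [37] is not reproduced in this article, so the form below is an assumption: it adjusts the observed slope of change on baseline for the share of the observed baseline variance that is within-subject variance. It is shown because it reproduces the reported corrected slopes to within rounding; the function name is hypothetical.

```python
def blomqvist_corrected_slope(observed_slope, within_sd, between_sd_observed):
    """Correct a change-on-baseline slope for regression to the mean.

    error_share is the fraction of the observed baseline variance that is
    within-subject (measurement) variance. Assumed form:
        true_slope = (observed_slope + error_share) / (1 - error_share)
    """
    error_share = within_sd ** 2 / between_sd_observed ** 2
    if error_share >= 1.0:
        raise ValueError("within-subject variance must be smaller than the observed variance")
    return (observed_slope + error_share) / (1.0 - error_share)


# Values reported in the text: within-subject SD 6.23 pp, observed between-subject SD 18.9 pp.
treatment_slope = blomqvist_corrected_slope(-0.691, 6.23, 18.9)  # close to the reported -0.653
placebo_slope = blomqvist_corrected_slope(-0.153, 6.23, 18.9)    # close to the reported -0.049
```

With these inputs, within-subject variance makes up roughly 11% of the observed baseline variance, which is why the correction moves the treatment slope only slightly.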
Third, we calculated that to generate a slope of −0.69 by regression to the mean alone, the within-subject SD for the measurement of FEV_{1} decline would need to be up to 28 pp, which is over 4 times the observed within-subject SD (i.e., 6.23 pp).
Thus, on the basis of these three approaches, the size of the regression to the mean phenomenon is so small that it has no practical relevance in our analysis of the IPD in Fig. 2.
Analysis of studylevel data by the mixedeffects models
The studylevel mixedeffects model was focused on the 44 crossover trials [11] which reported the mean exerciseinduced FEV_{1} decline after β_{2}agonist and after placebo (Fig. 4). The study mean FEV_{1} declines after the placebo ranged from − 46% to − 9% with a median decline of − 27%. The range of the data points is much narrower compared with the range of the IPD (Fig. 2), resulting in the studylevel analysis having less statistical power to compare the absolute and relative scales. Unlike the data points in Fig. 2, given the narrower range of FEV_{1} declines after placebo along the xaxis in Fig. 4, it is less clear whether the effect of β_{2}agonist on FEV_{1} decline is greater in studies with larger mean baseline exerciseinduced FEV_{1} declines after placebo.
Testing this formally, the intercept of the studylevel analysis indicated a mean uniform 21 pp. reduction in exerciseinduced FEV_{1} decline by β_{2}agonists (Fig. 4). The slope indicates a 77% reduction in exerciseinduced FEV_{1} decline by β_{2}agonists. Analysis of the studylevel data suggests that the intercept might be more consistent with the data (Table 4).
Standard metaanalysis of studylevel data
Standard metaanalyses comparing the absolute and two relative scales were limited to the 14 trials which reported IPD as these allowed calculation of individual paired differences and their SE values. The calculation of the 95% CIs by the absolute scale and the two relative scales is illustrated in Additional file 1: Table S3.
There is substantial heterogeneity between the trials on the absolute scale with I^{2} = 81% (Fig. 5a). The estimate of a uniform 25 pp. mean reduction is similar to the absolute scale estimate from the mixedeffects model using IPD (Table 2).
The study-level relative effects were first obtained by dividing the absolute effect by the exercise-induced FEV_{1} decline in the placebo test for each study prior to pooling, see Additional file 1: Table S3 for the explanation of this transformation. In this approach the heterogeneity between the studies was also I^{2} = 81% (Fig. 5b). This approach indicates that β_{2}-agonists reduced exercise-induced FEV_{1} decline by 83%. This estimate is similar to the mean effect on the relative scale in the mixed-effects model using IPD (Table 2). There is no substantial difference in the statistical significance between the relative effect calculated in this way (Z = 9.7; Fig. 5b) and the pooled absolute effect (Z = 9.5; Fig. 5a).
The relative effects of the IPD studies were also obtained from the slopes of linear regression models similar to the analysis presented in Fig. 2, however, now for each of the individual trials separately (Fig. 5c). For most studies, this approach led to smaller SE values compared to the Fig. 5b analysis. The smaller SE values increased the heterogeneity between the studies to I^{2} = 92%, yet concurrently increased the precision of the pooled estimate substantially (Z = 16.0; Fig. 5c), compared with Fig. 5a and b. This approach indicates that β_{2}agonists reduced FEV_{1} decline by 90%, identical to the slope estimate calculated in Table 2.
Discussion
The goal of this study was to compare whether the absolute or the relative scale yields more consistent estimates of effect, using the example of β_{2}agonist treatment to prevent FEV_{1} declines associated with EIB, the severity of which can range widely between patients. The absolute scale is routinely used in the analysis of continuous data and therefore the comparison of these two scales is relevant more widely than just for the analysis of FEV_{1} changes.
In people with EIB, Bonini et al. calculated that the β_{2}agonists decreased exerciseinduced FEV_{1} decline by 17.67 pp. (95% CI: 15.84 to 19.51 pp) [11]. If EIB was a homogeneous medical condition, such a uniform effect might be meaningful. Instead, EIB is highly heterogeneous, since it is usually defined by postexercise FEV_{1} decline of 10% or more, though other arbitrary cutoff limits have been used. Thus, in this dichotomization two persons with 11% and 80% FEV_{1} declines after exercise are both classified as having EIB, whereas a person with a 9% FEV_{1} decline is not. However, the person who has the 11% decline probably is biologically much closer to the person who has the 9% decline compared with the person who has the 80% FEV_{1} decline after exercise. It does not seem reasonable to assume that Bonini’s estimate of 17.67 pp. effect would apply for people with a low and a high level of exerciseinduced FEV_{1} decline. Furthermore, dichotomization of continuous variables decreases statistical power [38–41].
One approach to achieve more personalized effects of β_{2}agonists is to categorize people into groups by their untreated exerciseinduced FEV_{1} decline levels (Table 3). In people who had untreated exerciseinduced FEV_{1} declines in the range from 10% to 19%, β_{2}agonists reduced the FEV_{1} decline by 15 pp. (95% CI: 10 to 20 pp), whereas in people who had untreated FEV_{1} declines in the range from 30% to 39%, the reduction of the decline was 33 pp. (95% CI: 25 to 41 pp), and in people who had untreated FEV_{1} declines of 40% and greater the percentage point improvement was even greater (Table 3). The confidence intervals of the three groups with FEV_{1} decrease 30% and greater are all inconsistent with the 17.67 pp. effect calculated by Bonini [11]. These three groups contain 61% (97 of 159) of the participants in Table 3. This illustrates that Bonini’s estimate of effect does not apply to a great proportion of people classified as having EIB.
The relative scale is most informative in the analysis of the β_{2}agonist effects on exerciseinduced FEV_{1} declines since on the relative scale a single estimate of effect, expressed as a percentage improvement of the baseline exerciseinduced FEV_{1} decline (rather than a uniform percentage point improvement), applies over all study participants independent of their initial FEV_{1} decline levels (Fig. 2, Tables 2 and 3). In our analysis, half of the participants with IPD had observed β_{2}agonist effect 5.8 pp. or more distant from the mean 90% effect, which also shows that the relative scale better captured the observed β_{2}agonist effect compared with the use of a single uniform 28 percentage point improvement, which had median residual of 10.8 pp.
In our study, the primary comparison of the absolute and the relative scales was based on IPD, since the wide distribution of FEV_{1} declines in the IPD analysis results in greater statistical power to compare intercepts and slopes. We also compared the absolute and relative scales on the basis of studylevel data of 44 trials, but no superiority of the relative scale was seen in that comparison, indeed absolute scale seemed to be slightly better (Table 4). In addition, no superiority of relative scale over the absolute scale was seen in standard metaanalyses (Fig. 5a and b). These discrepancies between the analyses based on IPD (Fig. 2) and on the studylevel data are examples of the “ecological fallacy”. In order to avoid the potential for the ecological fallacy introduced by studylevel analyses, whenever feasible, examination of IPD has been recommended [42–44]. Thus, analysis of the studylevel data alone (Table 4) or the comparison of standard metaanalyses (Fig. 5a and b) would have led to a false conclusion that the absolute scale is better or at least not worse than the relative scale.
Nevertheless, even though the analyses of the study-level data did not yield a valid comparison of the absolute and relative scales, the study-level estimate of the relative effect calculated from 44 trials was quite similar to the estimate from the IPD analysis of 14 trials: 77% vs. 90% improvement in the exercise-induced FEV_{1} decline, respectively. The difference between these estimates can be partly explained by the different sets of studies that were compared. The standard study-level meta-analyses of the 14 studies which had IPD available reached relative effect estimates of 83% and 90% reduction in FEV_{1} decline, depending on the calculation of the SE (Fig. 5), very similar to the estimate from the overall IPD mixed-effects regression analysis. This latter comparison was based on the same set of studies.
Most popular statistical software, such as RevMan of the Cochrane Collaboration, does not have an option to pool continuous outcomes on the relative scale. However, such an option is available in the metacont function of the R package meta [33, 45, 46]. When the option is not available in a statistical program, a simple way to pool study-level results on the relative scale is to normalize the results of the studies by dividing the absolute mean effects and their SD values by the placebo group mean outcome value (Table S3). Such a transformation can easily be done with a spreadsheet program, and the transformed data can be entered in a standard statistical program for meta-analysis. This approach of calculating the relative effect is illustrated in Fig. 5b. Alternatively, if IPD is available, one can calculate and pool the slopes of the linear regression lines for each study, which usually leads to narrower SE estimates and more accurate pooled estimates, as shown in Fig. 5c. However, IPD is rarely available and therefore calculation of the slope is not often feasible. Furthermore, for many crossover trials that reported the study-level data (Fig. 4), the paired SE was not published and would need to be imputed, but this problem applies to both the absolute and the relative scales.
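The normalization step can be sketched in a few lines. The function name and the numbers are hypothetical, and the sketch treats the placebo mean as a fixed constant, so any uncertainty in the placebo mean itself is not carried into the transformed dispersion value.

```python
def to_relative_scale(mean_effect_pp, dispersion_pp, placebo_mean_decline):
    """Express a study's absolute mean effect and its SD or SE as a percent of the placebo mean.

    All inputs are in percentage points of FEV1 decline; outputs are in percent.
    """
    if placebo_mean_decline == 0:
        raise ValueError("placebo mean decline must be non-zero")
    factor = 100.0 / placebo_mean_decline
    return mean_effect_pp * factor, dispersion_pp * factor


# Hypothetical study: mean absolute effect 24 pp (SE 3 pp), placebo-test mean decline 30%.
relative_effect_demo, relative_se_demo = to_relative_scale(24.0, 3.0, 30.0)
```

The transformed pairs can then be entered into any generic inverse-variance meta-analysis routine.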
In meta-analysis of binary outcomes, relative scale analysis using effect measures such as risk ratios or odds ratios leads to asymmetric confidence intervals, because the studies are pooled on the logarithmic scale with symmetric confidence intervals and then transformed back. Similarly, in meta-analysis of continuous outcomes, the findings can be pooled on the logarithmic scale using ratio effect measures, leading to asymmetric CIs [5]. However, relative scale effects for continuous outcomes can also be derived from slopes (Fig. 2), or by normalizing the results of the studies, that is, by dividing the absolute mean effects and their SD values by the placebo group mean outcome value (Table S3) [10]. Both of these approaches lead to symmetric CIs on the relative scale. Therefore, relative scale CIs for continuous outcomes are not necessarily asymmetric.
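The contrast between the two kinds of CI can be illustrated with a small Python sketch. The estimates and SEs below are hypothetical, and the function names and the normal-approximation multiplier 1.96 are our own choices for illustration.

```python
import math

Z = 1.96  # normal approximation for a 95% CI (assumption of this sketch)

def ci_from_log_scale(log_ratio, se_log):
    """CI that is symmetric on the log scale, back-transformed to the ratio scale."""
    ratio = math.exp(log_ratio)
    lower = math.exp(log_ratio - Z * se_log)
    upper = math.exp(log_ratio + Z * se_log)
    return ratio, lower, upper

def ci_on_relative_scale(rel_effect, se):
    """CI computed directly on the relative scale (slope or normalized effect)."""
    return rel_effect, rel_effect - Z * se, rel_effect + Z * se

# Hypothetical ratio of means of 0.5 with SE 0.3 on the log scale
ratio, r_lo, r_hi = ci_from_log_scale(math.log(0.5), 0.3)
print("log-scale pooling:", ratio - r_lo, "below vs", r_hi - ratio, "above")

# Hypothetical relative effect of 0.9 with SE 0.05
effect, e_lo, e_hi = ci_on_relative_scale(0.9, 0.05)
print("direct relative scale:", effect - e_lo, "below vs", e_hi - effect, "above")
```

After back-transformation the interval is symmetric only in the multiplicative sense (the estimate is the geometric mean of the limits), whereas the interval computed directly on the relative scale has equal distances on both sides.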
The distribution of the relative effects at the individual level is skewed (Fig. 3). Therefore, the median relative effect might appear to be a more useful descriptive estimate than the mean relative effect. Study-level meta-analyses cannot find the median effect, nor can they describe the distribution of the individual-level effects, such as the interquartile range. Thus, the IPD analysis can give important information in addition to that from the study-level analyses. In our case, the difference between the mean effect of 90% and the median effect of 88% prevention of EIB is minor. Nevertheless, the great variation in the individual-level effects indicates that the efficacy of a particular β_{2}-agonist in protecting against EIB needs to be assessed at the individual level (Fig. 3).
This study was motivated by Bonini’s meta-analysis on β_{2}-agonists for exercise-induced FEV_{1} declines and their use of the absolute scale in the analysis of study results [11]. However, the absolute scale, either as percentage point differences or as volume differences (measured in liters), has been used in the analysis of FEV_{1} changes in several other meta-analyses in the Cochrane Library [47–53]. Thus, the superiority of the relative scale is not an issue relevant only to Bonini’s meta-analysis. For example, one of the Cochrane reviews [53] estimated the effect of vitamin C on EIB on the absolute scale and described the effect of vitamin C five minutes after exercise in the Schachter (1982) trial [54] as follows: “No significant difference between vitamin C and placebo: Vitamin C mean: –0.24 (SE ± 0.06) L/s, Placebo mean: –0.44 (SE ± 0.14) L/s, t = 2.13 (P = 0.057)” ([53], Table 2). However, the slope of a linear regression analysis of the Schachter study [54], which had reported the IPD, indicated that the relative decrease in FEV_{1} decline caused by vitamin C was highly significant: 55% (95% CI: 32 to 78%; P = 0.0003) [55]. This difference in P-values also illustrates that the calculation of the absolute effect, which is customary in Cochrane reviews, can lead to false-negative conclusions.
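A slope-based relative effect of this kind can be sketched as follows. We assume here that the slope is obtained by least-squares regression through the origin of each participant's treatment effect (placebo decline minus active-treatment decline) on the placebo decline; this is our reading of a slope-only model, the function name `slope_through_origin` is our own, and the participant values are hypothetical rather than taken from the Schachter trial.

```python
import math

def slope_through_origin(x, y):
    """Least-squares slope of y on x with no intercept, and its standard error."""
    sxx = sum(xi * xi for xi in x)
    slope = sum(xi * yi for xi, yi in zip(x, y)) / sxx
    resid_ss = sum((yi - slope * xi) ** 2 for xi, yi in zip(x, y))
    # One parameter is estimated, so the residual degrees of freedom are n - 1
    se = math.sqrt(resid_ss / (len(x) - 1) / sxx)
    return slope, se

# Hypothetical participants (NOT data from any cited trial):
# FEV1 decline on placebo (pp) and reduction of that decline on treatment (pp)
placebo_decline = [12.0, 25.0, 40.0, 18.0, 33.0]
effect = [10.0, 23.0, 35.0, 17.0, 30.0]

slope, se = slope_through_origin(placebo_decline, effect)
print(f"relative effect (slope): {slope:.3f}, SE {se:.3f}")
```

A slope of 0.9 would correspond to a 90% reduction of the placebo-period decline; a CI would additionally require a t-distribution quantile with n - 1 degrees of freedom, which is omitted here.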
Our study did not intend to reproduce Bonini’s main meta-analysis, which was labeled Analysis 1.1 in their paper [11]. Their data extraction contained several errors and inconsistencies, some of which were severe; see Additional file 1: Table S4. We used Bonini’s review as an example to demonstrate that the calculation of absolute effects can lead to suboptimal effect estimates. Similar to Bonini’s analysis, we combined different β_{2}-agonists to calculate one single estimate of effect. We took this approach because our primary goal was to compare two different methods in the analysis of FEV_{1} changes rather than to estimate the effectiveness of a particular β_{2}-agonist or of a particular experimental protocol for conducting an exercise test. If one β_{2}-agonist or protocol is less effective than another, the lower effectiveness would be analyzed in both ways and, thereby, would contribute equally to both the relative and the absolute scale analysis. We tried to reduce the heterogeneity of comparisons by selecting salbutamol (or, if not tested, salmeterol) when several β_{2}-agonists were investigated in the same report; the shortest delay between β_{2}-agonist administration and exercise test when exercise tests were repeated several times after the administration of a β_{2}-agonist; and pre-drug FEV_{1} as baseline when possible. Furthermore, we took into account the variations among trials in the β_{2}-agonists used and in the conduct of the exercise tests by using the β_{2}-agonist and the trial as clustering variables in the analyses.
Friedrich et al. compared the relative and absolute scales for diverse continuous outcomes and showed that, on average, the relative scale led to lower heterogeneity compared with the absolute scale, indicating that the former is more informative [5–7]. In addition, previous analyses demonstrated that the analysis of effects on the duration of diseases and comparable outcomes is more informative on the relative scale than on the absolute scale [8–10]. However, continuous outcomes arise in many different kinds of contexts and, therefore, the relative scale is not always applicable. Apparently, one requirement for using the relative scale is that there is a relevant 0% to 100% scale for the measurement. This requirement is not always satisfied. For example, there are no reasonable 0% target levels for body weight, body temperature or blood pressure. In such cases, the relative scale may not be ideal.
Since in many contexts the relative scale is more informative in the analysis of continuous outcomes, the option to use the relative scale should be made widely available in meta-analysis software so that researchers can compare the scales and decide for themselves which is most suitable for their particular outcome.
Conclusions
Compared with the absolute scale, the relative scale captures more effectively the variation in the effects of β_{2}-agonists on exercise-induced FEV_{1} declines. The absolute scale has been widely used in the analysis of FEV_{1} changes and it may have led to suboptimal statistical analysis in some cases. The choice between the absolute scale and the relative scale should be determined on the basis of biological reasoning and empirical testing to identify the scale that leads to lower heterogeneity. The relative scale option should be made available in meta-analysis software. Meanwhile, the transformation to the relative scale can be easily calculated with spreadsheet programs and the transformed data can be analyzed with standard meta-analysis software.
Availability of data and materials
Data analyzed in this study are available in Additional file 2.
Abbreviations
 AIC: Akaike Information Criterion
 IPD: Individual patient data
 MD: Mean difference
 RR: Relative risk
 SD: Standard deviation
 SE: Standard error
References
 1.
Rehm J, Taylor B, Mohapatra S, Irving H, Baliunas D, Patra J, Roerecke M. Alcohol as a risk factor for liver cirrhosis: a systematic review and metaanalysis. Drug Alcohol Rev. 2010;29:437–45. https://doi.org/10.1111/j.14653362.2009.00153.x.
 2.
Engels EA, Schmid CH, Terrin N, Olkin I, Lau J. Heterogeneity and statistical significance in metaanalysis: an empirical study of 125 metaanalyses. Stat Med. 2000;19:1707–28. https://doi.org/10.1002/10970258(20000715)19:13%3C1707::AIDSIM491%3E3.0.CO;2P.
 3.
Review Manager (RevMan) [Computer program]. Version 5.3. Copenhagen: The Nordic Cochrane Centre, The Cochrane Collaboration. http://ims.cochrane.org/revman. Accessed 1 Aug 2019.
 4.
Johnston BC, AlonsoCoello P, Friedrich JO, Mustafa RA, Tikkinen KA, Neumann I, Vandvik PO, Akl EA, da Costa BR, Adhikari NK, Dalmau GM, Kosunen E, Mustonen J, Crawford MW, Thabane L, Guyatt GH. Do clinicians understand the size of treatment effects? A randomized survey across 8 countries. CMAJ. 2016;188:25–32. https://doi.org/10.1503/cmaj.150430.
 5.
Friedrich JO, Adhikari NK, Beyene J. Ratio of means for analyzing continuous outcomes in metaanalysis performed as well as mean difference methods. J Clin Epidemiol. 2011;64:556–64. https://doi.org/10.1016/j.jclinepi.2010.09.016.
 6.
Friedrich JO, Adhikari NK, Beyene J. The ratio of means method as an alternative to mean differences for analyzing continuous outcome variables in metaanalysis: a simulation study. BMC Med Res Methodol. 2008;8:32. https://doi.org/10.1186/14712288832.
 7.
Friedrich JO, Adhikari NK, Beyene J. Ratio of geometric means to analyze continuous outcomes in metaanalysis: comparison to mean differences and ratio of arithmetic means using empiric data and simulation. Stat Med. 2012;31:1857–86. https://doi.org/10.1002/sim.4501.
 8.
Hemilä H, Herman ZS. Vitamin C and the common cold: a retrospective analysis of Chalmers' review. J Am Coll Nutr. 1995;14:116–23. https://doi.org/10.1080/07315724.1995.10718483.
 9.
Hemilä H. Many continuous variables such as the duration of the common cold should be analyzed using the relative scale. J Clin Epidemiol. 2016;78:128–9. https://doi.org/10.1016/j.jclinepi.2016.03.020.
 10.
Hemilä H. Duration of the common cold and similar continuous outcomes should be analyzed on the relative scale: a case study of two zinc lozenge trials. BMC Med Res Methodol. 2017;17:82. https://doi.org/10.1186/s128740170356y.
 11.
Bonini M, Di Mambro C, Calderon MA, Compalati E, Schünemann H, Durham S, Canonica GW. Beta2agonists for exerciseinduced asthma. Cochrane Database Syst Rev. 2013;(10):CD003564. https://doi.org/10.1002/14651858.CD003564.pub3.
 12.
Parsons JP, Hallstrand TS, Mastronarde JG, Kaminsky DA, Rundell KW, Hull JH, Storms WW, Weiler JM, Cheek FM, Wilson KC, Anderson SD. American Thoracic Society Subcommittee on exerciseinduced bronchoconstriction. An official American Thoracic Society clinical practice guideline: exerciseinduced bronchoconstriction. Am J Respir Crit Care Med. 2013;187:1016–27. https://doi.org/10.1164/rccm.2013030437ST.
 13.
Cullum VA, Farmer JB, Jack D, Levy GP. Salbutamol: a new, selective betaadrenoceptive receptor stimulant. Br J Pharmacol. 1969;35:141–51. https://doi.org/10.1111/j.14765381.1969.tb07975.x.
 14.
Anderson JD, Seale JP, Rozea P, Bandler L, Theobald G, Lindsay DA. Inhaled and oral salbutamol in exerciseinduced asthma. Am Rev Respir Dis. 1976;114:493–500. https://doi.org/10.1164/arrd.1976.114.3.493.
 15.
Wahlbeck B. BetaAdrenoceptor agonists and asthma  100 years of development. Eur J Pharmacol. 2002;445:1–12. https://doi.org/10.1016/S00142999(02)017284.
 16.
Sears MR, Lötvall J. Past, present and future  beta2adrenoceptor agonists in asthma management. Respir Med. 2005;99:152–70. https://doi.org/10.1016/j.rmed.2004.07.003.
 17.
Anderson SD, Lambert S, Brannan JD, Wood RJ, Koskela H, Morton AR, Fitch KD. Laboratory protocol for exercise asthma to evaluate salbutamol given by two devices. Med Sci Sports Exerc. 2001;33:893–900. https://doi.org/10.1097/0000576820010600000007.
 18.
Boner AL, Spezia E, Piovesan P, Chiocca E, Maiocchi G. Inhaled formoterol in the prevention of exerciseinduced bronchoconstriction in asthmatic children. Am J Respir Crit Care Med. 1994;149:935–9. https://doi.org/10.1164/ajrccm.149.4.7908246.
 19.
de Benedictis FM, Tuteri G, Pazzelli P, Solinas LF, Niccoli A, Parente C. Combination drug therapy for the prevention of exerciseinduced bronchoconstriction in children. Ann All Asthma Immunol. 1998;80:352–6. https://doi.org/10.1016/S10811206(10)629821.
 20.
Henriksen JM, Agertoft L, Pedersen S. Protective effect and duration of action of inhaled formoterol and salbutamol on exerciseinduced asthma in children. J All Clin Immunol. 1992;89:1176–82. https://doi.org/10.1016/00916749(92)90302I.
 21.
Pearlman DS, Rees W, Schaefer K, Huang H, Andrews WT. An evaluation of levalbuterol HFA in the prevention of exerciseinduced bronchospasm. J Asthma. 2007;44:729–33. https://doi.org/10.1080/02770900701595667.
 22.
Robertson W, Simkins J, O’Hickey SP, Freeman S, Cayton RM. Does single dose salmeterol affect exercise capacity in asthmatic men? Eur Respir J. 1994;7:1978–84 http://erj.ersjournals.com/content/erj/7/11/1978.full.pdf.
 23.
de Benedictis FM, Tuteri G, Pazzelli P, Niccoli A, Mezzetti D, Vaccaro R. Salmeterol in exerciseinduced bronchoconstriction in asthmatic children: comparison of two doses. Eur Respir J. 1996;9:2099–103. https://doi.org/10.1183/09031936.96.09102099.
 24.
Green CP, Price JF. Prevention of exercise induced asthma by inhaled salmeterol xinafoate. Arch Dis Childhood. 1992;67:1014–7 https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1793593.
 25.
Simons FE, Gerstner TV, Cheang MS. Tolerance to the bronchoprotective effect of salmeterol in adolescents with exerciseinduced asthma using concurrent inhaled glucocorticoid treatment. Pediatrics. 1997;99:655–9. https://doi.org/10.1542/peds.99.5.655.
 26.
Dinh Xuan AT, Lebeau C, Roche R, Ferriere A, Chaussain M. Inhaled terbutaline administered via a spacer fully prevents exerciseinduced asthma in young asthmatic subjects: a doubleblind, randomized, placebocontrolled study. J Internat Med Res. 1989;17:506–13. https://doi.org/10.1177/030006058901700602.
 27.
Henriksen JM, Dahl R. Effects of inhaled budesonide alone and in combination with lowdose terbutaline in children with exerciseinduced asthma. Am Rev Respir Dis. 1983;128:993–7. https://doi.org/10.1164/arrd.1983.128.6.993.
 28.
Debelic M, Hertel G, Konig J. Doubleblind crossover study comparing sodium cromoglycate, reproterol, reproterol plus sodium cromoglycate, and placebo in exerciseinduced asthma. Ann Allergy. 1988;61:25–9 https://www.ncbi.nlm.nih.gov/pubmed/3133964.
 29.
Walker SB, Bierman CW, Pierson WE, Shapiro GG, Furukawa CT, Mingo TS. Bitolterol mesylate in exercise induced asthma. J All Clin Immunol. 1986;77:32–6. https://doi.org/10.1016/00916749(86)903180.
 30.
Schoeffel RE, Anderson SD, Seale JP. The protective effect and duration of action of metaproterenol aerosol on exerciseinduced asthma. Ann Allergy. 1981;46:273–5 https://www.ncbi.nlm.nih.gov/pubmed/7235321.
 31.
Johnson JD. Statistical considerations in studies of exerciseinduced bronchospasm. J Allergy Clin Immunol. 1979;64:634–41. https://doi.org/10.1016/00916749(79)900277.
 32.
Senn S. The use of baselines in clinical trials of bronchodilators. Stat Med. 1989;8:1339–50. https://doi.org/10.1002/sim.4780081106.
 33.
R Core Team. R: a language and environment for statistical computing. Vienna: R Foundation for Statistical Computing; 2019. https://www.R-project.org. Accessed 1 Aug 2019.
 34.
Bates D, Maechler M, Bolker B, Walker S. Fitting linear mixedeffects models using lme4. J Stat Softw. 2015;67:1–48. https://doi.org/10.18637/jss.v067.i01.
 35.
Koenker R. quantreg: Quantile regression. R package version 5.35. 2018. https://CRAN.R-project.org/package=quantreg.
 36.
Higgins JP, Thompson SG. Quantifying heterogeneity in a metaanalysis. Stat Med. 2002;21:1539–58. https://doi.org/10.1002/sim.1186.
 37.
Hayes RJ. Methods for assessing whether change depends on initial value. Stat Med. 1988;7:915–27. https://doi.org/10.1002/sim.4780070903.
 38.
Ragland DR. Dichotomizing continuous outcome variables: dependence of the magnitude of association and statistical power on the cutpoint. Epidemiology. 1992;3:434–40.
 39.
Senn S. Disappointing dichotomies. Pharm Stat. 2003;2:239–40. https://doi.org/10.1002/pst.090.
 40.
Royston P, Altman DG, Sauerbrei W. Dichotomizing continuous predictors in multiple regression: a bad idea. Stat Med. 2006;25:127–41. https://doi.org/10.1002/sim.2331.
 41.
Fedorov V, Mannino F, Zhang R. Consequences of dichotomization. Pharm Stat. 2009;8:50–61. https://doi.org/10.1002/pst.331.
 42.
Stewart LA, Parmar MKB. Metaanalysis of the literature or of individual patient data: is there a difference. Lancet. 1993;341:418–22. https://doi.org/10.1016/01406736(93)93004K.
 43.
Berlin JA, Santanna J, Schmid CH, Szczech LA, Feldman HI. Individual patient versus grouplevel data metaregressions for the investigation of treatment effect modifiers: ecological bias rears its ugly head. Stat Med. 2002;21:371–87. https://doi.org/10.1002/sim.1023.
 44.
Lambert PC, Sutton AJ, Abrams KR, Jones DR. A comparison of summary patientlevel covariates in metaregression with individual patient data metaanalysis. J Clin Epidemiol. 2002;55:86–94. https://doi.org/10.1016/S08954356(01)004140.
 45.
Schwarzer G. Meta: an R package for metaanalysis. R News. 2009;7:40–5 https://cran.rproject.org/doc/Rnews/Rnews_20073.pdf.
 46.
Schwarzer G, Carpenter JR, Rucker G. Metaanalysis with R. London: Springer. https://doi.org/10.1007/9783319214160.
 47.
Spooner CH, Saunders LD, Rowe BH. Nedocromil sodium for preventing exerciseinduced bronchoconstriction. Cochrane Database Syst Rev. 2002;1:CD001183. https://doi.org/10.1002/14651858.CD001183.
 48.
Adams N, Lasserson TJ, Cates CJ, Jones PW. Fluticasone versus beclomethasone or budesonide for chronic asthma in adults and children. Cochrane Database Syst Rev. 2007;4:CD002310. https://doi.org/10.1002/14651858.CD002310.pub4.
 49.
Kew KM, Undela K, Kotortsi I, Ferrara G. Macrolides for chronic asthma. Cochrane Database Syst Rev. 2015;9:CD002997. https://doi.org/10.1002/14651858.CD002997.pub4.
 50.
Ni Chroinin M, Greenstone I, Lasserson TJ, Ducharme FM. Addition of inhaled longacting beta2agonists to inhaled steroids as first line therapy for persistent asthma in steroidnaive adults and children. Cochrane Database Syst Rev. 2009;4:CD005307. https://doi.org/10.1002/14651858.CD005307.pub2.
 51.
Chauhan BF, Ducharme FM. Antileukotriene agents compared to inhaled corticosteroids in the management of recurrent and/or chronic asthma in adults and children. Cochrane Database Syst Rev. 2012;5:CD002314. https://doi.org/10.1002/14651858.CD002314.pub3.
 52.
Farne HA, Cates CJ. Longacting beta2agonist in addition to tiotropium versus either tiotropium or longacting beta2agonist alone for chronic obstructive pulmonary disease. Cochrane Database Syst Rev. 2015;10:CD008989. https://doi.org/10.1002/14651858.CD008989.pub3.
 53.
Milan SJ, Hart A, Wilkinson M. Vitamin C for asthma and exerciseinduced bronchoconstriction. Cochrane Database Syst Rev. 2013;10:CD010391. https://doi.org/10.1002/14651858.CD010391.pub2.
 54.
Schachter EN, Schlesinger A. The attenuation of exerciseinduced bronchospasm by ascorbic acid. Ann Allergy. 1982;49:146–51 https://www.ncbi.nlm.nih.gov/pubmed/7114587.
 55.
Hemilä H. Vitamin C may alleviate exerciseinduced bronchoconstriction: a metaanalysis. BMJ Open. 2013;3:e002416. https://doi.org/10.1136/bmjopen2012002416.
Funding
No external funding. This research received no grant from any funding agency in the public, commercial or not-for-profit sectors.
Author information
Contributions
HH planned the study, extracted the data, carried out the statistical analyses, and drafted the manuscript. JF confirmed the data extraction and participated in the revisions of the manuscript. Both authors read and approved the final manuscript.
Corresponding author
Correspondence to Harri Hemilä.
Ethics declarations
Ethics approval and consent to participate
Not applicable. This is a secondary analysis.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Additional file 1. Details of the extraction of the IPD and study-level data. Some errors and inaccuracies in the Bonini et al. [11] data extraction are described. Description of the calculations.
Additional file 2. The data sets used in this analysis, calculation of the 95% CIs for the ratios in Table 3, measurements from published figures to yield numerical extracted FEV_{1} values.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Cite this article
Hemilä, H., Friedrich, J.O. Many continuous variables should be analyzed using the relative scale: a case study of β_{2}-agonists for preventing exercise-induced bronchoconstriction. Syst Rev 8, 282 (2019). https://doi.org/10.1186/s13643-019-1183-5
Keywords
 Adrenergic beta-2 receptor agonists
 Albuterol
 Ecological fallacy
 Exercise-induced asthma
 Forced expiratory volume
 Meta-analysis
 Outcome assessment
 Randomized controlled trial
 Spirometry
 Statistics