Many continuous variables should be analyzed using the relative scale: a case study of β2-agonists for preventing exercise-induced bronchoconstriction

Background: The relative scale adjusts for baseline variability and therefore may lead to findings that can be generalized more widely. It is routinely used for the analysis of binary outcomes but only rarely for continuous outcomes. Our objective was to compare relative vs absolute scale pooled outcomes using data from a recently published Cochrane systematic review that reported only absolute effects of inhaled β2-agonists on exercise-induced decline in forced expiratory volume in 1 s (FEV1).

Methods: From the Cochrane review, we selected placebo-controlled cross-over studies that reported individual participant data (IPD). Reversal in FEV1 decline after exercise was modeled as a mean uniform percentage point (pp) change (absolute effect) or as an average percent change (relative effect), using intercept-only or slope-only linear mixed-effects models, respectively. We also calculated the pooled relative effect estimates from study-level mean effects using standard random-effects, inverse-variance-weighted meta-analysis.

Results: Fourteen studies with 187 participants were identified for the IPD analysis. On the absolute scale, β2-agonists decreased the exercise-induced FEV1 decline by 28 pp, and on the relative scale, they decreased the FEV1 decline by 90%. The fit of the statistical model was significantly better with the relative 90% estimate than with the absolute 28 pp estimate. Furthermore, the median residuals were substantially smaller in the relative effect model than in the absolute effect model (5.8 vs. 10.8 pp). Using standard study-level meta-analysis of the same 14 studies, β2-agonists reduced exercise-induced FEV1 decline on the relative scale by a similar amount: 83% or 90%, depending on the method of calculating the relative effect.

Conclusions: Compared with the absolute scale, the relative scale captures the variation in the effects of β2-agonists on exercise-induced FEV1 declines more effectively. The absolute scale has been used in the analysis of FEV1 changes and may have led to sub-optimal statistical analysis in some cases. The choice between the absolute and relative scale should be based on biological reasoning and on empirical testing to identify the scale that leads to lower heterogeneity.
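The distinction between the intercept-only (absolute) and slope-only (relative) models can be illustrated with a minimal sketch. It is a simplification under stated assumptions: it uses ordinary least squares on a single group of participants rather than the linear mixed-effects models of the actual analysis, and the FEV1 declines are invented illustrative numbers, not study data. The function names `fit_absolute` and `fit_relative` are hypothetical.

```python
from statistics import mean, median

# Hypothetical FEV1 declines (percentage points) for six participants.
# These values are invented for illustration and are NOT study data.
placebo_decline = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
drug_decline = [1.0, 3.0, 2.0, 5.0, 4.0, 7.0]

# Treatment effect per participant: reduction in the FEV1 decline.
effect = [p - d for p, d in zip(placebo_decline, drug_decline)]


def fit_absolute(effect):
    """Intercept-only model: effect_i = a + error (uniform pp change)."""
    return mean(effect)


def fit_relative(placebo, effect):
    """Slope-only model through the origin: effect_i = b * placebo_i + error."""
    numerator = sum(p * e for p, e in zip(placebo, effect))
    denominator = sum(p * p for p in placebo)
    return numerator / denominator


a = fit_absolute(effect)
b = fit_relative(placebo_decline, effect)

# Absolute residuals of each model, in percentage points.
resid_abs = [abs(e - a) for e in effect]
resid_rel = [abs(e - b * p) for p, e in zip(placebo_decline, effect)]

print(f"absolute effect: {a:.1f} pp, median residual {median(resid_abs):.1f} pp")
print(f"relative effect: {100 * b:.0f}%, median residual {median(resid_rel):.1f} pp")
```

In this invented data set the treatment effect grows with the placebo-day decline, so the slope-only model leaves smaller residuals. With data in which the effect is the same number of percentage points for every participant, the intercept-only model would fit better.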

Extraction of IPD data of the 14 studies
Measurements of IPD findings of two studies from figures
Table S2: Extraction of the study means data
Printouts of statistical calculations

Explanations and Abbreviations:
Albuterol: a synonym in the USA for salbutamol
FEV1: forced expiratory volume in 1 second (the volume a person is able to exhale in 1 s)
IPD: individual participant data
MDI: metered dose inhaler
"1 hour test" indicates an exercise test carried out 1 hour after the drug administration
"Pre-drug as baseline" indicates that exercise-induced FEV1 decline is calculated from the FEV1 level before drug administration
"Post-drug as baseline" indicates that exercise-induced FEV1 decline is calculated from the FEV1 level after drug administration

Extraction of IPD data of the 14 studies
The methods of 12 IPD studies were described by Bonini et al. (2013). The methods of the two studies listed below were not described by Bonini et al.

Robertson (1994): 8 nonsmoking asthmatic men. They were all taking β2-agonists and regular inhaled corticosteroids. Inhaled corticosteroids were continued during the study. Double-blind, cross-over study.

Schoeffel (1981): 10 participants (3 male, 7 female) with asthma. They were all taking β2-agonists and some used inhaled corticosteroids. Single-blind randomized study.

Measurement of Simons (1997) results from Fig 2A
Simons reported the FEV1 levels (as % predicted) before treatment, and after treatment and exercise. Data for the same 14 participants are reported for both the placebo (left) and salmeterol (right) tests; see the figure below. However, the lines overlap to such an extent that only 11 participants could be clearly identified for both the placebo and salmeterol tests. The 11 participants are indicated by the red lines and numbered from 1 to 11. See Supplementary file 2 for the measurement and calculation of the FEV1 changes in these 11 participants. Comparison of the mean and SD values we measured from the published figure with those reported by Simons indicates close similarity in the means; see below. Thus, we were able to capture most of the findings.

Extraction of the study means data
The following Table S2 describes the specific time points and the comparisons from which we extracted the FEV1 changes in the placebo and β2-agonist tests. The studies with IPD are listed to make this list consistent with Bonini's Analysis 1.1, but the IPD estimates are not added to this table; see Table S1. Two parallel-group studies (Kemp 1994 and Vazquez 1984) are not included in our analysis. The number of participants in the cross-over studies is indicated by N.

Egglestone (1981) Terbutaline 500 μg

In each of the three scales, the 95% CI was calculated as the effect ± 1.96×SE. Therefore, each confidence interval is symmetric on the scale shown in Fig. 5.
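The confidence interval rule stated above (effect ± 1.96×SE) can be written as a short sketch. The function name `ci95` and the input values are hypothetical and serve only to show the arithmetic; they are not results of the analysis.

```python
def ci95(effect, se):
    """Return the 95% CI as (lower, upper) = effect -/+ 1.96 * SE."""
    half_width = 1.96 * se
    return effect - half_width, effect + half_width


# Hypothetical effect estimate and standard error, for illustration only.
lower, upper = ci95(50.0, 5.0)
print(f"95% CI: {lower:.1f} to {upper:.1f}")
```

Because the same half-width is subtracted and added, the interval is symmetric around the estimate on whichever scale the effect and SE are expressed.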

Data extraction inconsistencies and errors in Bonini et al. (2013)
We did not intend to reproduce Bonini's main meta-analysis, which was labeled Analysis 1.1 in their paper [11]. There are some errors and inaccuracies in the data extraction by Bonini, and therefore exact reproduction of their Analysis 1.1 is neither possible nor relevant. Table S4 below describes the differences between Bonini's data extraction and ours.
Some of the errors are particularly large. In the Bronsky (1995) and the Del Col (1993) trials, Bonini added 10 and 20 percentage points to the published FEV1 declines in the β2-agonist tests, see below.
Given that the effect of β2-agonists decreases over time, for included studies that reported exercise tests at various times after the administration of the β2-agonist, we chose the shortest reported time after β2-agonist administration. Of the 44 studies we included in our analysis, 39 (87%) reported data from an exercise test that was carried out within 1 hour after drug administration. In the other studies the test was carried out within 3 hours, except for Carlsen (1995), which reported only the 10-12 hour exercise test.
As an example of misleading data extraction by Bonini [11], Kemp (1994) compared salbutamol and salmeterol in three exercise tests that were carried out 0.5, 5.5, and 11.5 hours after the administration of the β2-agonist. At each time point, the FEV1 decline was smaller after salmeterol than after salbutamol: 5% vs. 7% in the 0.5 hour test, 8% vs. 25% in the 5.5 hour test, and 13% vs. 27% in the 11.5 hour test, respectively. This means that at each time point salmeterol had a greater effect than salbutamol. However, in their Appendix 3, Bonini extracted the salbutamol FEV1 decline from the 0.5 hour test (i.e. 7% FEV1 decline) but the salmeterol FEV1 decline from the 11.5 hour test (i.e. 13% FEV1 decline), thereby giving the biased impression that salbutamol was better than salmeterol because a smaller FEV1 decline occurred after salbutamol. Different time points were also selected for many other β2-agonist comparisons; see below. Such arbitrary selection of exercise test times biases the presentation and analysis in the Bonini review.
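The arithmetic of the Kemp (1994) example can be checked with a short sketch. The FEV1 declines are the values quoted in the paragraph above; the variable names are our own and purely illustrative.

```python
# FEV1 declines (%) from Kemp (1994) as quoted in the text,
# keyed by hours after drug administration.
salbutamol = {0.5: 7, 5.5: 25, 11.5: 27}
salmeterol = {0.5: 5, 5.5: 8, 11.5: 13}

# Matched comparison: same time point for both drugs.
# A positive difference means a smaller decline after salmeterol.
matched = {t: salbutamol[t] - salmeterol[t] for t in salbutamol}

# Mismatched comparison as described for Bonini's Appendix 3:
# salbutamol at 0.5 h against salmeterol at 11.5 h.
mismatched = salbutamol[0.5] - salmeterol[11.5]

for t, diff in matched.items():
    print(f"{t} h: salbutamol minus salmeterol = {diff} pp")
print(f"0.5 h salbutamol vs 11.5 h salmeterol = {mismatched} pp")
```

Every matched difference is positive (2, 17, and 14 pp), whereas the mismatched comparison gives a negative difference (-6 pp), which reverses the apparent direction of the comparison.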