Open Access
Open Peer Review

This article has Open Peer Review reports available.


Integrating multiple data sources (MUDS) for meta-analysis to improve patient-centered outcomes research: a protocol for a systematic review

  • Evan Mayo-Wilson1 (corresponding author),
  • Susan Hutfless1, 2,
  • Tianjing Li1,
  • Gillian Gresham1,
  • Nicole Fusco1,
  • Jeffrey Ehmsen3,
  • James Heyward1,
  • Swaroop Vedula4,
  • Diana Lock5,
  • Jennifer Haythornthwaite6,
  • Jennifer L. Payne7,
  • Theresa Cowley8,
  • Elizabeth Tolbert9,
  • Lori Rosman10,
  • Claire Twose10,
  • Elizabeth A. Stuart11,
  • Hwanhee Hong11,
  • Peter Doshi12,
  • Catalina Suarez-Cuervo5,
  • Sonal Singh13 and
  • Kay Dickersin1
Systematic Reviews 2015, 4:143

https://doi.org/10.1186/s13643-015-0134-z

Received: 21 May 2015

Accepted: 15 October 2015

Published: 2 November 2015

Abstract

Background

Systematic reviews should provide trustworthy guidance to decision-makers, but their credibility is challenged by the selective reporting of trial results and outcomes. Some trials are not published, and even among clinical trials that are published partially (e.g., as conference abstracts), many are never published in full. Although there are many potential sources of published and unpublished data for systematic reviews, there are no established methods for choosing among multiple reports or data sources about the same trial.

Methods

We will conduct systematic reviews of the effectiveness and safety of two interventions following the Institute of Medicine (IOM) guidelines: (1) gabapentin for neuropathic pain and (2) quetiapine for bipolar depression. For the review of gabapentin, we will include adult participants with neuropathic pain who do not require ventilator support. For the review of quetiapine, we will include adult participants with acute bipolar depression (excluding mixed or rapid cycling episodes). We will compare these drugs (used alone or in combination with other interventions) with placebo or with the same intervention alone; direct comparisons with other medications will be excluded. For each review, we will conduct highly sensitive electronic searches, and the results of the searches will be assessed by two independent reviewers. Outcomes, study characteristics, and risk of bias ratings will be extracted from multiple reports by two individuals working independently, stored in a publicly available database (Systematic Review Data Repository) and analyzed using commonly available statistical software. In each review, we will conduct a series of meta-analyses using data from different sources to determine how the results are affected by the inclusion of data from multiple published sources (e.g., journal articles and conference abstracts) as well as unpublished aggregate data (e.g., “clinical study reports”) and individual participant data (IPD). We will identify patient-centered outcomes in each report and identify differences in the reporting of these outcomes across sources.

Systematic review registration

CRD42015014037, CRD42015014038

Keywords

Systematic reviews, Meta-analysis, Reporting bias, Publication bias, Guidance, Gabapentin, Pain, Quetiapine, Depression, Bipolar disorder

Background

Multiple sources of data

Systematic reviews and meta-analyses are comparative effectiveness research methods that involve summarizing existing research to establish how well treatments work. Systematic reviews should provide trustworthy guidance to decision-makers, but their credibility is challenged by the selective reporting of trial results and outcomes. Some trials are not published, and even among clinical trials that are published partially (e.g., as conference abstracts), many are never published in full [1].

Failure to publish is not random. Studies favoring the comparator or reporting null findings are less likely to be published than studies favoring the test treatment, a phenomenon known as publication bias [2]. Even when studies are published, authors may selectively report the statistically significant outcomes favoring the test treatment; they may not publish outcomes favoring the comparator or outcomes for which no statistical differences between treatments were observed. This is known as selective outcome reporting bias [3]. Additionally, analyses of reported trials are often not done on an “intention to treat” basis, in which all randomized patients are analyzed as part of the group to which they were assigned, leaving such results vulnerable to selection bias despite randomization [4]. Readers may estimate results for dichotomous outcomes with missing cases by assuming that the missing participants did or did not improve, and systematic reviewers can conduct sensitivity analyses to explore how continuous outcomes are affected by post-randomization exclusions, but the validity of the assumptions underlying these methods may be difficult to test.
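The best-case/worst-case approach for dichotomous outcomes mentioned above can be sketched as follows. This is a minimal illustration with made-up counts, not data from any included trial:

```python
# Illustrative sketch: best-case/worst-case bounds for a dichotomous
# outcome ("improved") when some randomized participants are missing.
# All counts below are hypothetical.

def risk_ratio(events_t, total_t, events_c, total_c):
    """Risk ratio comparing treatment with control."""
    return (events_t / total_t) / (events_c / total_c)

# Observed data: events among analyzed participants, plus missing counts.
treat = {"events": 30, "analyzed": 80, "missing": 20}  # 100 randomized
ctrl = {"events": 20, "analyzed": 85, "missing": 15}   # 100 randomized

# Best case for treatment: all missing treated participants improved,
# no missing control participants improved.
best = risk_ratio(treat["events"] + treat["missing"],
                  treat["analyzed"] + treat["missing"],
                  ctrl["events"],
                  ctrl["analyzed"] + ctrl["missing"])

# Worst case for treatment: the reverse assumption.
worst = risk_ratio(treat["events"],
                   treat["analyzed"] + treat["missing"],
                   ctrl["events"] + ctrl["missing"],
                   ctrl["analyzed"] + ctrl["missing"])

print(f"best-case RR:  {best:.2f}")   # treatment looks most favorable
print(f"worst-case RR: {worst:.2f}")  # treatment looks least favorable
```

If the clinical conclusion is the same under both extreme assumptions, missing data are unlikely to explain the observed effect; if the bounds straddle no effect, the assumptions cannot be tested from the report alone.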

Failure to report research is a tremendous waste [5]. Arguably more important, treatment decisions based on biased reporting may harm people, who may be prescribed treatments that are less effective and more harmful than systematic reviews suggest. Thus, to reduce the threat of reporting biases, current best practices for the conduct of systematic reviews include searching for unpublished trial results as well as the “grey” literature, such as conference abstracts [6].

There are many potential sources of published and unpublished data, and there are no established methods for choosing among multiple reports or data sources about the same trial. Moreover, most reports such as journal publications and trial registries include only data summaries at the group level (i.e., aggregate data), and other reports (e.g., conference abstracts, posters, and regulatory packets submitted for approval) may include only fragmentary and incomplete study details. Some sources, such as conference abstracts, may be so selectively reported that their inclusion in a meta-analysis could increase rather than reduce bias [7]; however, supplementing short reports with additional information from trial registries could provide a more complete account of trials when longer reports are unavailable [8]. Individual participant data (IPD), by contrast, may include more details than reports of aggregate results, and IPD can be re-analyzed when patient data have been omitted from published analyses. However, IPD are rarely available for all included studies in systematic reviews.

A few studies have examined possible methods for supplementing information from published reports. For example, studies have compared the reliability of meta-analytic estimates in systematic reviews using data submitted to the Food and Drug Administration (FDA) and published data [9]. There are numerous cases in which unpublished trials have come to light and summary data from those studies have been used in systematic reviews about clinical efficacy, but these cases typically pick a perceived “best” source for each meta-analysis and do not show how meta-analyses would be affected by the inclusion of data from different sources [10–12]. These studies show that when information is available from multiple sources, the sources do not always agree, and there are no guidelines for choosing which data to include in a systematic review under these circumstances.

IPD, rather than aggregate (summary) data, are generally considered the best data for traditional meta-analysis [13]. While authors have noted that “individual participant data are not needed if all the required aggregate data can be obtained in full,” the reality is that reviewers rarely know when this is the case [14]. In addition, analyzing IPD without detailed attention to other elements of study design (such as details about data collection) can lead to superficial and erroneous interpretations of results. One study compared differences between meta-analyses using published data and meta-analyses using IPD [12, 15], and a second meta-analysis using the same data noted that internal correspondence and other documents about the trials would bring further insight [11], suggesting that even more data sources might be useful in systematic reviews that already include individual participant data.

Given the potential for meta-bias [3], it seems obvious that reviewers should search for and include all relevant and reliable data in systematic reviews. On the other hand, comprehensive searching adds to the time and resources required to complete systematic reviews. Despite the potential value of individual-level data, at this time, it is unclear if the results of systematic reviews using IPD necessarily differ substantively from reviews based on reports of aggregate data. Thus, it is not known if the additional resources required for identifying, obtaining, and analyzing each type of unpublished data are worthwhile. Empirically grounded guidance is needed to guide reviewer choices about the use of data from multiple sources.

Patient-centered outcomes

Patient-centered outcomes research helps people “communicate and make informed healthcare decisions” (http://www.pcori.org/assets/March-5-Definition-of-PCOR11.pdf), yet many clinical trials and systematic reviews do not fulfill these goals. A primary reason for this deficiency is a historical lack of attention to the selection of patient-centered outcomes for analysis. This project seeks to identify patient-centered outcomes for two systematic reviews using a combination of methods that other reviewers could replicate to improve the patient-centeredness of their research.

Across systematic reviews of a topic, there is a tendency to focus on questions that are answerable and to report outcomes that are available in published reports. This happens even when systematic review authors would prefer to focus on outcomes they consider important but that do not appear in publications. When this kind of availability bias occurs, the results published in reports of clinical trials, and thus in systematic reviews, may not be the most meaningful outcomes for people with health problems.

Randomized trials are often expensive (largely because of the effort involved in ensuring high-quality data collection and follow-up), and trialists typically collect more data than they can report in journal publications. If searching for unpublished reports results in the identification of patient-centered outcomes that were recorded but not included in trial publications, then these efforts might improve the quality and utility of systematic reviews by aiding the inclusion of patient-centered outcomes. As far as we are aware, this possibility has never been addressed. Examining such data sources for patient-centered outcomes could reduce the need for additional studies and improve the efficiency of patient-centered outcomes research.

Similarly, reports of clinical trials and systematic reviews typically focus on one time point (e.g., the end of treatment or longest follow-up). From a patient’s perspective, these time points may or may not relate to the natural course or treatment of their problem.

Objectives

Our objective is to explore the reliability, validity, and utility of incorporating data from multiple data sources. We will assess the impact of using various data sources on effect estimates for efficacy and harms, and on clinical inference, in two high-impact case studies.

To examine the sensitivity of conclusions to the data sources used, we will conduct sequential meta-analyses in which we systematically replace data from less complete sources (e.g., conference abstracts) with data from more complete sources (e.g., journal articles and internal company documents) and with IPD for two systematic reviews. We will evaluate the validity of meta-analyses using these sources by comparing (1) the risk of bias for each analysis and (2) the average effects of meta-analyses based on these sources. We will describe the utility of using additional data sources in meta-analyses, including the information gained by including short reports (e.g., conferences abstracts) and unpublished data (e.g., internal company reports and IPD) in addition to journal articles. We will examine the reliability of sources by comparing outcomes and effects across multiple reports of trials.
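The sequential substitution of data sources described above can be illustrated with a simple inverse-variance fixed-effect meta-analysis, re-run as each trial's estimate is replaced by the estimate from a more complete source. This is a sketch only; the effect sizes, standard errors, and trial names are invented, and the protocol does not prescribe this particular pooling model:

```python
import math

# Illustrative sketch of sequential meta-analysis across data sources.
# All numbers are hypothetical.

def pooled(estimates):
    """Inverse-variance fixed-effect pooled estimate and standard error.

    `estimates` is a list of (effect, standard_error) pairs.
    """
    weights = [1 / se ** 2 for _, se in estimates]
    est = sum(w * y for w, (y, _) in zip(weights, estimates)) / sum(weights)
    return est, math.sqrt(1 / sum(weights))

# Per-trial (effect, SE) by data source.
abstracts = {"trial_1": (-0.50, 0.25), "trial_2": (-0.40, 0.30)}
articles = {"trial_1": (-0.45, 0.20), "trial_2": (-0.35, 0.22),
            "trial_3": (-0.10, 0.25)}  # trial_3 reported only in full

# Step 1: conference abstracts only; step 2: substitute journal articles,
# which here add a trial and shift the pooled estimate.
for label, source in [("abstracts", abstracts), ("articles", articles)]:
    est, se = pooled(list(source.values()))
    print(f"{label}: pooled effect = {est:.3f} (SE {se:.3f})")
```

Comparing the pooled estimates and their precision at each step shows how much the conclusions depend on which sources contribute data.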

Methods

Selection of case studies

To evaluate the effect of using data from multiple sources in systematic reviews, we will conduct reviews of the effectiveness and safety of (1) gabapentin (Neurontin®) for neuropathic pain in adults and (2) quetiapine (Seroquel®) for the treatment of depression in adults with bipolar disorder. We will use similar methods for both reviews, as described in this section. The specific inclusion and exclusion criteria will reflect the important clinical issues in each area and are described in the sections that follow. Both reviews will be conducted according to IOM standards [16]. These reviews will be used as case studies to explore the use of multiple data sources.

These cases were selected for several reasons. Firstly, gabapentin and quetiapine are used commonly for these respective conditions, so the included studies will be clinically important. Secondly, pain and depression are associated with several patient-reported outcomes. Self-reported outcomes related to patient perception share common features and complexities in their measurement, so methodological guidance from this project will be relevant to other areas of patient-centered research. Trials in such areas also commonly record outcomes that are not patient-centered [17], and we aim to identify the extent to which the outcomes in multiple data sources are patient-centered. Methodologically, these cases will also allow us to consider different situations that systematic reviewers might face with respect to data from multiple sources.

Gabapentin was initially approved for the treatment of epilepsy. At the time the medication was developed, it was not common to register clinical trials prospectively. Furthermore, much of the use of gabapentin for neuropathic pain has been off-label for indications not approved by FDA, and to our knowledge, data about the use of gabapentin for these indications were not submitted to regulators. Although we do not expect to find that many trials of gabapentin for neuropathic pain were publicly registered, there is evidence of selective outcome reporting and publication bias in trials of gabapentin for neuropathic pain [18]. Multiple data sources are available for several trials as a consequence of litigation for which one of the authors (KD) was an expert witness. A list of trials conducted by the developer, internal company documents (Inferential Analysis Plans, Research Reports, and memos), and databases containing individual patient data were provided by Pfizer to the plaintiffs’ lawyers without codebooks, and these were then given to KD to assist with her testimony.

By comparison, quetiapine was initially approved for the treatment of psychotic disorders and later approved for the treatment of bipolar disorder. Several trials of quetiapine for bipolar disorder were registered prospectively. Although the drug is used off-label, the prescription of quetiapine for bipolar disorder has been largely on-label (i.e., this indication was approved by FDA). There is also evidence of publication bias among trials of antipsychotics including quetiapine [19]. These cases thus represent two different and important situations that reviewers might encounter with different types of medications and access to multiple data sources.

Types of studies

We will include randomized controlled trials. We will exclude N-of-1 trials, observational studies, quasi-randomized controlled trials (e.g., alternating allocation), and non-randomized studies. We will exclude studies in which providers or participants were aware of group assignment (i.e., open-label studies). Studies will be considered for inclusion regardless of publication status or language of publication.

Current guidelines suggest that parallel group and crossover studies can be combined for analysis if the crossover design is appropriate for the condition and intervention under investigation [11], though poor reporting often limits their inclusion in meta-analysis [20, 21]. For this review, we will identify crossover studies, but we will not include them in the meta-analysis. Neuropathic pain is relatively stable and likely to return in the absence of effective therapy, but short-term crossover studies may have limited clinical relevance for a chronic condition. Bipolar depression is an unstable condition and antipsychotics are not well tolerated, so withdrawals from the first period of a study could make a second period uninterpretable. Crossover studies that are otherwise eligible will be described in the excluded studies.

We will analyze studies enrolling people who are not taking the study drug prior to the start of the trial (i.e., we will include only studies of people initiating treatment with gabapentin or quetiapine). Discontinuation studies will be described but not analyzed. The rationale for this decision is that the efficacy and safety of medications may differ for studies randomizing (1) people who are treatment naïve and (2) people who have responded to a study drug. For example, discontinuation studies may enroll people already taking the study drug and randomly assign them to continue taking it or switch to placebo, thus excluding people who do not respond to the treatment and people who experience adverse events and discontinue treatment.

Comparison interventions

Studies will be included if gabapentin or quetiapine is the only intervention that varies between treatment groups. That is, we will include studies of each medication in combination with other therapies compared with the other therapies alone. In factorial studies making more than one eligible comparison (e.g., A versus B and AC versus BC), we will treat these as separate comparisons (rather than combine intervention and control groups within trials).

Comparisons of different doses or formulations of the same drug, comparisons with other treatments, and discontinuation studies will not be analyzed for this report.

Identifying patient-centered outcomes

In the funding application for this study, we described plans to select outcomes and time points in collaboration with patient and stakeholder partners. As planned, we created a list of symptoms and outcomes that matter most to patients taking gabapentin for neuropathic pain and to patients taking quetiapine for bipolar depression. We identified patient-centered outcomes to be examined using the following methods:
  1. We examined the website “PatientsLikeMe” (http://www.patientslikeme.com/), which people use to record outcomes they have experienced. PatientsLikeMe allows people to enter information about their medical history and interventions they have used, and to describe their experience with interventions, including effectiveness and adverse effects. We did not use the information on effectiveness because it is reported as the percentage of people who rated subjective effectiveness in different categories (e.g., very effective, not effective at all), without clearly defined outcomes that would allow comparison with the results from the trials. Information about adverse effects is reported as the six most commonly rated effects for each medication, which we will identify and include in our reviews.
  2. We examined the compendium “DRUGDEX” (http://micromedex.com/), a commercial website used by health care providers to make treatment decisions. The Centers for Medicare and Medicaid Services (CMS) may use DRUGDEX ratings to make reimbursement decisions related to medically accepted, FDA-unapproved anti-cancer treatment regimens. DRUGDEX contains a list of each drug’s adverse effects by organ system.
  3. For pain outcomes, we reviewed the Initiative on Methods, Measurement, and Pain Assessment in Clinical Trials (IMMPACT) consensus recommendations [22]. The mission of IMMPACT is to develop consensus reviews and recommendations for improving the design, execution, and interpretation of clinical trials of treatments for pain (http://www.immpact.org/). The IMMPACT group includes researchers, manufacturers, and people with chronic pain.
  4. We used PubMed, patient websites, PCORI-funded projects, and the James Lind Alliance to identify additional outcomes.
  5. Patient and stakeholder partners identified outcomes they thought should be included in each review.

We discussed the outcomes identified through all sources and selected those that patient and stakeholder partners thought were most patient-centered. The outcomes for each review are included in the sections that follow.

Identifying patient-centered time points

Working with patients and clinicians, we also identified time points that are important in these conditions. Neuropathic pain and bipolar disorder are both chronic conditions, and patients with these conditions have indicated that long-term outcomes are more meaningful than short-term outcomes. However, acute treatment is related to long-term management in both cases, so investigators might reasonably focus on either short- or long-term results. For people with bipolar disorder, interventions that are effective for an acute episode may be used prophylactically over longer periods of time. Most studies about bipolar disorder are designed to measure recovery from an episode rather than the long-term prevention of relapses. Because a drug that is ineffective during an acute episode would not typically be continued, long-term studies often randomize people to continue taking a drug or to discontinue a drug to which they responded during an episode of mania or depression. For people with chronic pain, it may be impossible to predict who will respond to a given treatment, so people often try a drug for a short period and continue treatment if it is associated with symptom relief and if it is well tolerated. Indeed, a recent Cochrane review recommends that gabapentin be used this way for the treatment of chronic pain [23]. As above, we discussed possible time points with patient and stakeholder partners. The times they thought were most patient-centered for each review are included in the sections that follow.

Search methods for identification of studies

We will conduct electronic and additional searches to identify studies, the results of which will be reported following the PRISMA guidelines [24]. We will search for the following types of reports (listed by their approximate level of detail):
  1. Study registrations in publicly available databases (e.g., www.clinicaltrials.gov)
  2. Study protocols and statistical analysis plans
  3. Short reports (e.g., conference abstracts and posters)
  4. Summary data posted on trial registries
  5. Peer-reviewed journal articles
  6. Dissertations (e.g., masters or doctoral theses)
  7. Unpublished manuscripts and reports (e.g., reports to funders and clinical study reports)
  8. Information sent to regulators (e.g., data sent to FDA)
  9. Individual participant data
     

Electronic searches

We will search electronic databases, including the Cochrane Central Register of Controlled Trials (CENTRAL), CINAHL, Embase, LILACS, and PubMed. For the review of quetiapine for bipolar depression, we will also search MEDLINE and PsycINFO. In addition, we will search the International Clinical Trials Registry Platform (ICTRP) Search Portal and ClinicalTrials.gov to identify study protocols and results [25], using generic drug names and the trade names identified through Micromedex. Because ICTRP indexes ClinicalTrials.gov, we will remove ClinicalTrials.gov records that are duplicated in the ICTRP results.
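Deduplication across the two registries can be done by matching on the registry identifier, since ICTRP carries the NCT number for records it pulls from ClinicalTrials.gov. A minimal sketch, with invented record structures and identifiers:

```python
# Illustrative sketch: merge ClinicalTrials.gov and ICTRP search results,
# dropping ICTRP records whose registry ID already appears in the
# ClinicalTrials.gov set. IDs and titles are hypothetical.

ctgov = [
    {"id": "NCT00000001", "title": "Gabapentin vs placebo"},
    {"id": "NCT00000002", "title": "Quetiapine vs placebo"},
]
ictrp = [
    {"id": "NCT00000001", "title": "Gabapentin vs placebo"},     # duplicate
    {"id": "EUCTR2005-000001-01", "title": "Gabapentin add-on"},  # unique
]

seen = {rec["id"] for rec in ctgov}
deduplicated = ctgov + [rec for rec in ictrp if rec["id"] not in seen]

print(len(deduplicated))  # 3 unique registrations
```

In practice, records lacking a shared identifier would still need manual screening, since the same trial can appear in several registries under different IDs.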

Searching other resources

Reference lists of systematic reviews and included studies will be checked for additional reports. We will contact authors of included studies to request additional study reports. We will also contact manufacturers and search their websites to identify reports.

Regulatory data

We will search for summary data for studies meeting our eligibility criteria from the FDA website (Drugs@FDA). We will also search the websites of foreign regulators, including the European Medicines Agency (EMA), Medicines and Healthcare Products Regulatory Agency (MHRA, UK), Therapeutic Goods Administration (TGA, Australia), and the Pharmaceuticals and Medical Devices Agency (PMDA, Japan). From each organization’s website, we will download the approval letter and related documents for the relevant indications [26].

On the Drugs@FDA website, we will enter generic drug names as search terms to identify all related products. We will then screen records related to each product to identify potentially relevant documents. Medical and statistical reviews that might include information about the methods or results of eligible trials will be extracted, and we will review the most recent label for information about eligible trials.

We will write to the FDA and EMA to request any information they have about the trials we have identified through our searches (e.g., clinical study reports and individual participant data), and we will request details of any other known studies meeting the inclusion criteria.

Individual participant data

For all trials identified, we will request de-identified individual participant data and associated documentation from the study authors and/or sponsor, unless we already have these files through an alternative source. For example, individual participant data for several gabapentin trials have been provided as Microsoft Access files to Kay Dickersin, who provided expert testimony in litigation against Pfizer, but no codebooks were provided with them. We will request further details about these studies (including codebooks to verify definitions of variables) as appropriate given ongoing litigation and settlement agreements.

We will search for data that have been made publicly available by manufacturers (e.g., through their websites). We will also search for data that have become available through other means (e.g., litigation) using the Drug Industry Document Archive (DIDA), Yale University Open Data Access (YODA), and PsychRights.org.

Selection of studies

Two reviewers will independently screen titles and abstracts identified by the electronic searches to determine which are eligible for inclusion in the review. We will then obtain and independently screen the full text of all potentially relevant studies to determine whether they meet the inclusion criteria. If investigators disagree about the eligibility of a report, they will discuss the disagreement with a third investigator to reach consensus about the study’s eligibility.

If a study cannot be included or excluded based on the information available in all reports associated with the study, we will contact the study authors and/or sponsor for more information to determine eligibility for our review.

During the study selection process, we will not be masked to study authors, institutions, journal of publication, or results.

Results of the search will be documented using modified PRISMA flowcharts [24].

Outcomes

Reports may include both systematically recorded outcomes (i.e., those that have been recorded using standardized measures given to all participants) and spontaneously recorded outcomes (e.g., unexpected outcomes reported by participants or providers).

Systematically recorded outcomes

From each report, we will record the outcomes, and we will record whether they are identified as “primary,” “secondary,” or “safety,” defined in another way, or not classified.

We will record five elements for each outcome as they are described in each report [27, 28]:
  1. Domain (outcome title);
  2. Specific measure (specific scale or instrument);
  3. Specific metric (format of the outcome data, such as value-at-a-time or mean change from baseline);
  4. Method of aggregation (how data from each group will be summarized, such as mean or percent); and
  5. Time point (e.g., weeks or months since randomization).

In addition to these elements, we will record details about methods of analysis (e.g., handling of missing data) and the definition of the population for analysis (e.g., study completers or people starting treatment).
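The five elements above, plus the analysis details, define the record we extract per outcome per report. A minimal sketch of such a record; the field names and example values are illustrative, not the actual extraction form:

```python
from dataclasses import dataclass

# Illustrative sketch of one extracted outcome description.
# Field names and defaults are hypothetical.

@dataclass
class OutcomeRecord:
    domain: str             # outcome title
    measure: str            # specific scale or instrument
    metric: str             # e.g., "value-at-a-time", "mean change from baseline"
    aggregation: str        # e.g., "mean", "percent"
    time_point: str         # e.g., "8 weeks since randomization"
    analysis_population: str = "unspecified"  # e.g., "all randomized", "completers"
    missing_data_method: str = "unspecified"  # e.g., "last observation carried forward"

pain = OutcomeRecord(
    domain="Pain intensity",
    measure="11-point numeric rating scale",
    metric="mean change from baseline",
    aggregation="mean",
    time_point="8 weeks since randomization",
    analysis_population="all randomized",
    missing_data_method="last observation carried forward",
)
print(pain.domain, "at", pain.time_point)
```

Recording each report's outcomes in this uniform shape is what makes it possible to compare how the same outcome is described across multiple sources for the same trial.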

We will extract and analyze results for key domains. These domains will be selected because they are (1) commonly measured, (2) likely to be reported selectively based on previous research, or (3) important to patients. The reporting of other outcomes will be described, but meta-analyses will not be conducted.

Spontaneously reported adverse events

In choosing a treatment for neuropathic pain or bipolar disorder, differences among available drugs in risk of adverse events may be more important than differences in average efficacy.

Adverse events can be recorded systematically using tests or questionnaires; however, trials often record only adverse events that patients report to doctors or investigators (e.g., on a case report form), which produces data that are difficult to analyze within and across trials. A related problem is that adverse events may be reported in an ad hoc and selective fashion in clinical trials; thus, systematic reviews may not be able to pre-specify adverse outcomes or collect adverse event data systematically. Even when systematic review authors make every effort to assess effectiveness outcomes using scientific methods, they may not be able to apply the same standards to the synthesis of adverse events [29].

In these reviews, we will extract detailed information about adverse events from all reports to identify similarities and differences across reports of clinical trials. In addition to other relevant information (such as the number of people assigned and included in each analysis), we will extract the following information about adverse events:
  1. Number (proportion) of participants experiencing one or more adverse events
  2. Number (proportion) of participants who discontinued treatment because of adverse events
  3. Number (proportion) of participants who discontinued treatment for any reason
  4. Number of serious adverse events (i.e., those that could be classified as serious by the FDA, including death, life-threatening events, hospitalization, disability or permanent damage, and important medical events)
  5. Specific adverse events. Where possible, these will be recorded following the classification systems used by developers at the time the trials were conducted. Most clinical trials would have been analyzed using either the Coding Symbols for Thesaurus of Adverse Reaction Terms (COSTART) or the Medical Dictionary for Regulatory Activities (MedDRA), depending on when they were conducted.
     1. For COSTART version IV [30], these are the following:
        (a) Body system: “Essentially anatomic, this body system classification is sometimes the basis of search strategy. The classification is hierarchical in nature.”
        (b) Body system subcategories
        (c) Mid-level system: “a mid-level pathophysiologic classification of COSTART for purposes of categorizing and retrieving information based on disease associations.” “This section is hierarchical in arrangement, allowing one to be very general or more specific and is a convenient strategy for searching for drug-induced disease.”
        (d) Mid-level system subcategories
        (e) Preferred term: a 20-character code used to identify events
     2. For MedDRA [31], these are the following:
        (a) System organ class (SOC): “the highest level of the hierarchy that provides the broadest concept for data retrieval.” Data are grouped by etiology, manifestation site, and purpose. To avoid double-counting preferred terms (PTs) assigned to more than one SOC, we will use only the “primary” SOC.
        (b) High-level group term (HLGT): “a superordinate descriptor for one or more HLTs related by anatomy, pathology, physiology, etiology, or function”
        (c) High-level term (HLT): a superordinate category that “links PTs related to it by anatomy, pathology, physiology, etiology, or function.”
        (d) Preferred term (PT): “a distinct descriptor (single medical concept) for a symptom, sign, disease, diagnosis, therapeutic indication, investigation, surgical, or medical procedure, and medical, social, or family history characteristic.”
        (e) Lowest level term (LLT): synonyms for a preferred term.

Results coded using COSTART will be analyzed at the “Preferred term,” “Mid-level system,” and “Body system” levels. Results coded using MedDRA will be analyzed at the “Preferred term” level and above. Additionally, we will use searches designed to identify clusters of related symptoms that may not be identifiable using the standard hierarchy [32].
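As an illustration of analyzing coded events at more than one level of a hierarchy, the sketch below tallies hypothetical preferred terms and rolls them up to a single “primary” system organ class so that no event is double-counted. The term-to-SOC mapping and term names are invented for illustration, not taken from MedDRA or any trial.

```python
from collections import Counter

# Hypothetical MedDRA-style mapping: preferred term -> primary system organ
# class. Counting each PT only under its primary SOC avoids double-counting
# terms that MedDRA assigns to more than one SOC.
PRIMARY_SOC = {
    "dizziness": "Nervous system disorders",
    "somnolence": "Nervous system disorders",
    "nausea": "Gastrointestinal disorders",
}

def tally(events):
    """Count adverse events at the preferred-term and primary-SOC levels."""
    pt_counts = Counter(events)
    soc_counts = Counter(PRIMARY_SOC[e] for e in events)
    return pt_counts, soc_counts

pts, socs = tally(["dizziness", "nausea", "dizziness", "somnolence"])
```

The same per-event tallying would be repeated at each hierarchy level of interest (e.g., COSTART body system and mid-level system).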

Adverse events could have been recorded using another hierarchical system, such as WHOART, and results will be analyzed using those systems where appropriate. For reports that do not describe adverse events using a structured classification system, we will record events as they are described in the reports. Where possible, we will also compare the terms given by the original reporter (e.g., as written on a case report form) with the preferred terms to which they were mapped.

Time points for analysis

We will extract the time at which each outcome was assessed as described in each report, and we will describe the planned duration of the included trials.

Effectiveness outcomes (i.e., data about benefits) will be organized into results at 8 weeks post-randomization (time window 4–13 weeks), 18 weeks (time window 14–22 weeks), 27 weeks (time window 23–31 weeks), and longer times where possible. For each review, we plan to meta-analyze data for the 8-week time window because patients may decide if they wish to continue using a medication during this interval. Additionally, we expect to have the most data for analysis in the short-term time window. If sufficient data are available, we will also analyze outcomes at other times. For each time window, we will describe how all elements for these outcomes were reported in each source.

For each of these time windows, key outcomes will be analyzed in sequential meta-analyses comparing combined effects using different data sources. If results are reported for a study at multiple times within a window, we will use the time closest to 8, 18, or 27 weeks for meta-analysis.
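The windowing rule above (assign each reported assessment time to a window, then take the time closest to the window's target for meta-analysis) can be sketched as follows; the function name and window encoding are illustrative only.

```python
# Analysis windows from the protocol: target week -> eligible weeks.
# 8 weeks (window 4-13), 18 weeks (window 14-22), 27 weeks (window 23-31).
WINDOWS = [(8, range(4, 14)), (18, range(14, 23)), (27, range(23, 32))]

def pick_time(reported_weeks, target=8):
    """Return the reported time within the target's window that is closest
    to the target, or None if no reported time falls in the window."""
    window = next(w for t, w in WINDOWS if t == target)
    candidates = [wk for wk in reported_weeks if wk in window]
    if not candidates:
        return None
    return min(candidates, key=lambda wk: abs(wk - target))
```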

Adverse events will be analyzed for the same times as the effectiveness data where possible. Additionally, we will analyze adverse events occurring closer to randomization (e.g., 3–4 weeks), which may be important for understanding early discontinuation and compliance.

Because we will analyze only comparators that do not include the test intervention (e.g., placebo), we do not expect to find many long-term studies for both ethical and practical reasons. For example, people who do not respond to an intervention or placebo after several months may be unlikely to continue participating in a trial; even if a trial were to continue beyond acute treatment, missing data would make long-term results difficult to interpret.

Data collection

For aggregate data

Data collection forms will be developed and entered in the Systematic Review Data Repository (SRDR). These will be made available through SRDR at the end of the study.

Data collection forms will be pilot tested. After the initial versions of the forms have been finalized, data will be extracted from each report into SRDR. For studies associated with multiple reports (e.g., journal article and internal company documents), data will be extracted from each report for comparison. From each source, we will record details about the report, study design, start and end dates, inclusion and exclusion criteria, characteristics of each intervention, participant demographic and clinical characteristics, outcomes, risk of bias, and sponsor.

Data will be extracted independently by two reviewers. Discrepancies will be resolved by consensus and through discussion with a third reviewer if necessary. The final dataset will be made publicly available through SRDR at the end of the study.

For individual participant data

Individual participant data will be accepted in any format, and these will be analyzed where possible. When data could be obtained in more than one format, we will state our preference for receiving data based on our familiarity with relevant software and the desirability of concordance across studies.

Because we do not have codebooks for some gabapentin databases, it is not clear that we will be able to include all IPD from available sources in our analysis. Where possible, we will compare the available databases with case report forms (which often show how and when data were recorded) and statistical analysis plans (which often show how data were coded and analyzed) to identify the variables contained in the databases. From our initial review, we can see that the databases include participant numbers, diagnostic information, information about study site, dates of medical visits, and details about the outcomes and the types and times of adverse events.

We have also identified some individual participant data in clinical study reports (CSRs) about quetiapine for bipolar depression. These reports mainly include aggregate results, but some reports also contain data for individual patients, which are organized in tables by participant number. We are currently working to extract information from these reports in an analyzable format. As with gabapentin, the reports include diagnostic information, information about study site, dates of medical visits, and details about the types and times of adverse events. Where possible, we will use ABBYY FineReader software to extract data from PDF files into spreadsheets for analysis [33].

Whether codebooks are available through the authors or we have had to reconstruct them, we will check the data for accuracy. We will first attempt to recreate tables found in the available reports to examine whether we have correctly identified study variables, including those associated with baseline characteristics and results. Where data have been reformatted for analysis, we will also check a sample of the cells against the original source for accuracy.

Trials of these drugs have been conducted over two decades, and it may not be possible to include in our analyses all IPD from individual investigators because of the number of datasets received or because some data formats cannot be merged for analysis (e.g., it may not be possible to combine older and newer formats). If we cannot include all of the data in the meta-analysis because they cannot be synthesized in the time available, we will describe the datasets not included in the analysis.

Confidentiality

De-identified data will be collated in a common database for each review using fields that are consistent across trials where possible. Until completion of our study, data will be kept on a local network to which only people working on this project have access. Unless we find that participants could be identified using these data, we will make databases and codebooks publicly available following the completion of our study.

In the event that any personally identifiable information is found, identifying information will be removed from the data following Health Insurance Portability and Accountability Act (HIPAA) guidelines before sharing the files with others.

We obtained IPD for trials of gabapentin as a result of litigation for which Kay Dickersin was an expert witness. Databases (Microsoft Access files) and documents (PDF files) containing IPD were provided by Pfizer to the plaintiffs’ lawyers without codebooks, and these were then given to Kay Dickersin to assist with her testimony. The data have been unsealed, and Pfizer has waived claims of confidentiality. The quetiapine clinical study reports include dates of medical tests and narratives about adverse events that describe the age, sex, and race of participants; we have also located patient initials in some of the reports. The clinical study reports that include individual participant data for quetiapine were made publicly available by the plaintiffs’ attorneys following a product liability lawsuit, and these are already available on the Internet.

Assessment of risk of bias in included studies

To compare the completeness of reports with respect to the methodological information they contain, each report will be assessed using the Cochrane Collaboration Risk of Bias Tool [34]. Two reviewers will independently rate each report for risk of bias related to sequence generation, allocation concealment, masking of participants, masking of outcome assessors, masking of providers, and incomplete data. We will not rate the risk of selective outcome reporting in each report, but we will rate each trial for risk of reporting bias. For each domain, risk of bias will be described as high, low, or unclear. Discrepancies will be resolved by consensus and through discussion with a third reviewer if necessary.

Dealing with missing data

Missing data are unrecorded values that could be meaningful for the analysis and interpretation of a study [35]. In clinical trials, data may be missing for outcomes or for covariates. For outcomes, a participant who skips questions on a measure at a given time point may complete related questions and outcome measures; in such cases, missing items (i.e., questions) may be imputed to calculate the overall scores for measures (i.e., questionnaires). In other cases, outcome measures may be missing in their entirety; missing outcome measures may be related to missed assessments (e.g., missed visits) or to discontinuation (e.g., study dropout). For all outcomes, we will report the amount of missing data and the reasons for missingness. The following sections describe how we will handle missing outcomes in reports of aggregate data and how we will handle IPD with missing outcomes, including missing items and missing outcome measures.

In reports of aggregate data

When aggregate analyses of continuous outcomes are reported both for people providing outcome data (complete cases) and with adjustment for missing data (e.g., using multiple imputation), we will analyze the latter. For dichotomous measures of treatment efficacy, we will conduct an analysis in which we assume that participants with missing outcomes did not respond to treatment. For dichotomous measures of adverse events, we will conduct an analysis in which we assume that participants who took at least one dose of the assigned medication were at risk of those events. For dichotomous outcomes, we will conduct sensitivity analyses to evaluate how the results are affected by different assumptions about missing data.
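For a dichotomous efficacy outcome, the contrast between the missing-as-failure assumption and a complete-case analysis can be sketched with invented counts (function name and numbers are illustrative):

```python
# Two analyses of the same dichotomous efficacy outcome:
#   missing_as_failure: participants with missing outcomes count as
#                       non-responders (denominator = all randomized).
#   complete_case:      participants with missing outcomes are excluded
#                       (denominator = completers only).
def response_proportions(responders, completers, randomized):
    missing_as_failure = responders / randomized
    complete_case = responders / completers
    return missing_as_failure, complete_case

worst, cc = response_proportions(responders=30, completers=80, randomized=100)
```

The gap between the two proportions grows with the amount of missing data, which is why the protocol plans sensitivity analyses under different assumptions.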

For secondary (sensitivity) analysis, we will use the pattern-mixture approach, accounting for the uncertainty due to missing data [36–39]. We will calculate adjusted treatment effects for each outcome for which results are reported without imputation, then synthesize adjusted and unadjusted treatment effects via standard meta-analysis across all studies. Adjusted treatment effects incorporate informative missingness, defined as a ratio (or difference) between the missing and observed treatment effects. We will implement this approach under various scenarios by considering a wide range of informative missingness (that is, we will assume the missing and observed treatment effects are similar or different).
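One simple form of the pattern-mixture idea, using a ratio of means, can be sketched as follows. The function and the parameter `lam` are illustrative; setting the ratio to 1 corresponds to assuming the missing and observed outcomes are similar (missing at random).

```python
# Pattern-mixture sketch for a continuous outcome in one arm: the mean among
# participants with missing outcomes is assumed to equal lam times the
# observed mean, where lam is the assumed informative missingness ratio.
def adjusted_mean(obs_mean, n_obs, n_miss, lam):
    miss_mean = lam * obs_mean
    return (n_obs * obs_mean + n_miss * miss_mean) / (n_obs + n_miss)
```

Repeating the calculation over a range of `lam` values (e.g., 0.5 to 2) shows how sensitive the adjusted effect is to assumptions about the missing data.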

For individual participant data

For all trials for which we have IPD, we will attempt to replicate the analyses performed by the study authors. Since most studies of gabapentin for neuropathic pain and quetiapine for bipolar depression were conducted, researchers have developed new techniques for handling missing outcome measures. We will therefore reanalyze the individual participant data using current best practices for handling missing data to determine whether doing so might lead to conclusions that differ from those of the original analyses.

Missing items in individual participant data

For outcome measures (i.e., questionnaires) with multiple items (i.e., questions), we will attempt to determine how sensitive the results might be to missing items. For each treatment group in each trial, we will describe the mean, maximum, and minimum number of missing items for each outcome measure where possible. We will impute missing items using methods described by the authors where possible. Additionally, we will impute missing items for standardized scales using standard coding techniques in the event that the authors did not impute missing items using standard methods.

Missing outcome measures in individual participant data

We anticipate that two analyses will allow us to replicate the handling of missing outcome measures (e.g., missed visits) in most of the analyses conducted by the authors: complete case analysis and last observation carried forward. In the presence of missing data, comparing imputed results (i.e., using best current methods, as described below) with a complete case analysis is important for evaluating the consequences of missing data and imputation.
  1. Complete case analysis: for each outcome measure, we will include participants who completed that measure at the specific time point of interest (e.g., when looking at the Short Form-36 (SF-36) at 6 weeks, we will include all individuals who completed the SF-36 at that time point). We will exclude participants who did not complete enough items to derive a summary score for the outcome measure at that time point.
  2. Last observation carried forward (LOCF): for each outcome, we will conduct an analysis that includes all participants who completed that outcome measure at the baseline assessment. If a participant did not complete an outcome measure at a given time point, or did not complete enough items to derive a summary score for that measure, we will impute the outcome measure by carrying forward the last observation for that measure. Although single imputation is no longer recommended for handling missing data, this analysis will allow us to compare our results with the results calculated by the trialists [40, 41].

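The LOCF rule can be sketched for a single participant's series of outcome scores, with `None` marking a missed assessment. This is an illustration of the technique, not the trialists' code.

```python
# Last observation carried forward: replace each missing score with the most
# recent observed score; scores before the first observation stay missing.
def locf(scores):
    filled, last = [], None
    for s in scores:
        if s is not None:
            last = s
        filled.append(last)
    return filled
```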
Best current methods

For the meta-analysis, we will estimate treatment effects using multiple imputation to impute missing outcome measures and to account appropriately for the uncertainty due to missing data [40, 42]. We will conduct imputations separately for each of the trials. For each trial, we will impute individual-level covariates measured at baseline (including participant characteristics and baseline measures of outcomes) and outcomes measured at every follow-up time point together using a “multiple imputation by chained equations” (MICE) approach, as implemented in the mi impute chained command in Stata [43]. MICE cycles through the variables, imputing each variable one at a time by fitting a model of that variable as a function of the other variables in the imputation procedure. This process is repeated until the algorithm converges. The MICE approach allows each variable to be modeled according to its own distribution, and it can easily handle complications such as variables with restricted ranges (e.g., age will be at least 18 years and cannot exceed 120 years). In general, we will use logistic regression for binary variables, multinomial logistic regression for categorical variables, Poisson regression for count variables, and linear regression for continuous variables. For continuous variables that are not normally distributed, we will transform the data to approximate normality to improve the fit of the linear imputation models. If we cannot transform a continuous variable to be normally distributed, we will use predictive mean matching [44].
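A deliberately simplified, deterministic sketch of one step of the chained-equations idea is shown below: a single incomplete variable is imputed from a complete one by ordinary least squares. A real MICE implementation (e.g., Stata's mi impute chained or statsmodels) cycles over all incomplete variables, adds a random draw to each imputation, and repeats the cycle to convergence; none of that is shown here.

```python
# Toy chained-equations step: regress y on x among observed cases (ordinary
# least squares), then fill each missing y from the fitted line. Missing
# values are marked with None.
def impute_y(x, y):
    obs = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    n = len(obs)
    mx = sum(xi for xi, _ in obs) / n
    my = sum(yi for _, yi in obs) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in obs)
    sxx = sum((xi - mx) ** 2 for xi, _ in obs)
    b = sxy / sxx           # slope from observed cases
    a = my - b * mx         # intercept
    return [yi if yi is not None else a + b * xi for xi, yi in zip(x, y)]
```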

The imputation model for each trial will include treatment group, demographic characteristics (e.g., sex and age), outcome measures at baseline and at each recorded follow-up (e.g., daily pain score, present pain intensity, SF-36, SF-MPQ), as well as other clinical measures that are available across studies (e.g., heart rate, blood pressure). We will not include race in the model. In the case of convergence problems with MICE resulting from the number of variables or collinearity, we will use stepwise selection models to assist with the imputations. Unless computationally prohibitive, we will create 100 imputations for each trial to achieve high efficiency [45]. We will estimate the treatment effect within each imputed dataset separately, and then we will combine the estimates for each trial using the standard multiple imputation combining rules [42]. Then, we will synthesize the combined treatment effect estimates across all studies as described below (“Data synthesis” section).
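The standard combining rules [42] (Rubin's rules) can be sketched as follows: the pooled estimate is the mean over imputations, and the total variance adds the between-imputation variance, inflated by (1 + 1/m), to the average within-imputation variance.

```python
from statistics import mean

# Rubin's rules for a treatment effect estimated in m imputed datasets.
def rubin(estimates, variances):
    m = len(estimates)
    qbar = mean(estimates)                                   # pooled estimate
    within = mean(variances)                                 # average within-imputation variance
    between = sum((q - qbar) ** 2 for q in estimates) / (m - 1)
    total_var = within + (1 + 1 / m) * between
    return qbar, total_var
```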

Imputing all measures together as we will do (i.e., both baseline and outcome variables) uses all available data for each participant. For example, for a participant missing just one follow-up time point, we will impute that missing value using that participant's observed values at other time points. In addition, including outcome measures along with all covariates in the imputation process, as we plan to do here (and as described above), is considered best practice for imputation in clinical trials [46, 47].

Although White et al. recommend restricting analysis to individuals reporting outcome measurement after imputing covariates and outcomes together, we will estimate treatment effects using all individuals, including those with imputed outcomes [48]. In our IPD meta-analysis, restricting to individuals with observed outcomes would result in the same treatment effect estimates as obtained under the complete case analysis.

Sensitivity analyses

  1. Pooled imputation: we will conduct a sensitivity analysis modified from our MICE approach, using a pooled imputation model in which the multiple imputation procedure is implemented across all trials simultaneously. This imputation model will include outcome measures and other variables available for individual participants in at least three (50 %) of the studies for which we have IPD for gabapentin, or in both of the studies for which we have IPD for quetiapine. For each review, the imputation will be carried out using a merged dataset that combines multiple studies. The imputation models will also include study indicators to account for heterogeneity across studies. One limitation of this approach is that it will not be helpful if reported covariates differ substantially across trials.
  2. Imputation with non-ignorable missingness: multiple imputation, as implemented following the procedures described above (“Best current methods” section), assumes that missingness depends only on the observed data (i.e., missing at random) and is not related to variables that were not measured. As a secondary sensitivity analysis, we will consider non-ignorable missingness (missing not at random (MNAR)), which assumes that missingness depends on both observed and unobserved data. To do this, we will model the outcomes and missingness pattern jointly using selection or pattern-mixture model techniques [42]. Using this model, we will estimate the treatment effect in each study, and we will conduct a meta-analysis to combine the estimated treatment effects across all studies as described below (“Data synthesis” section). Furthermore, we will conduct simulation studies to investigate the impact of different missingness mechanisms with various missingness rates on treatment effect estimation in meta-analyses.

Assessment of heterogeneity

To assess clinical and methodological heterogeneity, we will present the characteristics of studies in tables and describe the similarity of participants, interventions, outcomes, and methods across studies.

To assess statistical heterogeneity, we will:
  1. Visually inspect forest plots to see if the confidence intervals of individual studies have poor overlap, a rough indication of statistical heterogeneity;
  2. Calculate the I2 statistic, which describes the percentage of variability in effect estimates that is due to heterogeneity rather than chance [49]. By convention, we will describe an analysis as having substantial statistical heterogeneity if its I2 statistic is greater than 50 %; and
  3. Calculate tau, the estimated standard deviation of the underlying treatment effects across studies, which expresses heterogeneity in the same units as the outcome measure.
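For illustration, Cochran's Q, the I2 statistic, and the DerSimonian-Laird estimate of tau-squared can all be computed from per-study effect estimates and their variances using inverse-variance weights; the function below is a sketch with names of our own choosing.

```python
# Heterogeneity statistics from per-study effects and variances.
#   Q    = Cochran's Q (weighted squared deviations from the pooled effect)
#   I2   = max(0, (Q - df) / Q) * 100, the % of variation beyond chance
#   tau2 = DerSimonian-Laird moment estimate of between-study variance
def heterogeneity(effects, variances):
    w = [1 / v for v in variances]
    pooled = sum(wi * e for wi, e in zip(w, effects)) / sum(w)
    q = sum(wi * (e - pooled) ** 2 for wi, e in zip(w, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)
    return q, i2, tau2
```

Tau (as described in item 3) is simply the square root of `tau2`.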

Data synthesis

For dichotomous outcomes (other than spontaneously reported adverse events), we will calculate risk ratios (RR) within studies and the summary risk ratio across studies [34]. For continuous outcomes measured on the same scale in all trials, we will calculate the weighted mean difference (WMD). For continuous outcomes measured on more than one scale, we will calculate the standardized mean difference (Hedges' g). In studies reporting more than one measure of a domain, the most common measure across studies will be selected for analysis to minimize methodological heterogeneity. If some studies do not include the most common measure, we will select the most similar measure (rather than averaging treatment effects within studies).
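A sketch of the standardized mean difference with the small-sample (Hedges') correction, assuming the usual pooled standard deviation; the function name and arguments are ours.

```python
import math

# Hedges' g: Cohen's d (difference in means over the pooled SD) multiplied by
# the small-sample correction factor J = 1 - 3 / (4*(n1 + n2) - 9).
def hedges_g(m1, sd1, n1, m2, sd2, n2):
    sp = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))
    d = (m1 - m2) / sp
    j = 1 - 3 / (4 * (n1 + n2) - 9)
    return j * d
```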

In meta-analyses that include both aggregate and individual participant data, we will conduct two-stage meta-analyses [50, 51]. Specifically, individual participant data will be analyzed for each trial and then the results will be combined with aggregate data across all studies, assuming that aggregate and individual participant data estimate the same treatment effects [14, 52]. This enables us to borrow information from both levels of data and is the best use of all existing data.

All meta-analyses will be calculated using random-effects models and reported with 95 % confidence intervals (CI) [53–55].
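A minimal sketch of inverse-variance random-effects pooling with a normal-approximation 95 % CI, taking the between-study variance tau-squared as given (e.g., from a DerSimonian-Laird estimate):

```python
import math

# Random-effects pooling: each study is weighted by 1 / (v_i + tau2), and the
# 95 % CI uses the normal approximation pooled +/- 1.96 * SE.
def random_effects(effects, variances, tau2):
    w = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * e for wi, e in zip(w, effects)) / sum(w)
    se = math.sqrt(1 / sum(w))
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se)
```

With `tau2 = 0` this reduces to a fixed-effect (inverse-variance) analysis; larger `tau2` widens the interval and evens out the study weights.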

Spontaneously reported adverse events will be described in tables, but these will not be analyzed statistically. We will record the number of events reported in the test (i.e., gabapentin or quetiapine) and comparator (e.g., placebo) groups of every trial report, and we will report the total number of events for all test and comparator groups for each data source. Where possible, we will sum individual participant data at the “Preferred term” level and above using the COSTART and MedDRA classification systems (see “Spontaneously reported adverse events”), and we will record data from other sources as they were reported.

Study design, participant characteristics, and treatment characteristics may affect results, so we will conduct a priori subgroup analyses to examine moderators using aggregate data or individual participant data where possible [56]. We will investigate differences between subgroups using a test for interaction (p < 0.1 considered relevant). Residual heterogeneity will be quantified using I2, which we will describe as substantial if I2 is greater than 50 %.

Comparing multiple sources

For pre-specified outcomes, we will produce a table showing the results from each study, which will be combined in an overall analysis that includes results from each of the following data sources:
  1. Short reports (e.g., conference abstracts and posters);
  2. Peer-reviewed journal articles (about one or more trials);
  3. Summary data posted on trial registries;
  4. Dissertations (e.g., masters or doctoral theses);
  5. Unpublished manuscripts and reports (e.g., reports to funders, clinical study reports);
  6. Information sent to regulators (e.g., data sent to FDA); and
  7. Individual participant data.
We will examine whether the results show evidence of reporting bias by comparing multiple data sources for studies associated with more than one report. For studies with both sources of data, we will compare the following:
  1. Results from individual participant data compared with CSRs; and
  2. Results from CSRs compared with published results.
We will then conduct a series of meta-analyses to explore the impact of multiple data sources on the overall results. We will analyze these results by sequentially adding or replacing data as follows:
  1. Including only results from short reports (e.g., conference abstracts and posters);
  2. Replacing data from short reports (step 1) with data from publications in peer-reviewed journals, and adding data from studies reported in peer-reviewed publications but not in short reports;
  3. Replacing data from publications (step 2) with summary data obtained from the authors, manufacturers, regulators, or trial registries for the trials included above;
  4. Adding data (to step 3) from unpublished trials using data obtained from regulators or trial registries;
  5. Adding or replacing data (from step 4) for unpublished trials with aggregate data obtained from the authors or manufacturers (e.g., clinical study reports); and
  6. Replacing the best available aggregate data for all studies (step 5) with individual participant data where available.

If more than one report of a particular type is available for a trial (e.g., several peer-reviewed publications), we will include data from all of them in the meta-analyses. We will note discrepancies where they exist, and we will analyze results from the main report of a trial if the main report and other reports are discrepant. If reports include data for more than one trial, we will include data reported separately for each trial; we will not include combined results for these analyses unless no other estimates are available for the included studies.

To compare studies that have only been reported in short reports (e.g., a conference abstract or poster) with studies that have been reported in greater detail, we will conduct one further step in this sequence:
  7. Removing short reports (e.g., conference abstracts and posters).

We will investigate whether the results for published and unpublished studies differ using the best available data for each.

By analyzing results in this sequence, we hope to identify and to quantify differences that are attributable to (1) information about additional outcomes examined (selective outcome reporting), (2) information from more than one source for a data item (competing information), and (3) information about previously hidden trials (publication bias).

For the main outcome in each review, we will explore the distribution of possible effects by calculating all combinations of reports for each outcome and reporting the range of observed means. We will explore the extent to which these estimates are influenced by the inclusion or exclusion of particular reports, paying particular attention to deviations from the mean that represent clinically important differences in the outcomes under investigation.
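The enumeration of report combinations could be sketched as follows: take one candidate estimate per trial, pool across trials, and record the range of pooled values over all combinations. An unweighted mean is used here purely for illustration; the actual analyses will be weighted meta-analyses.

```python
from itertools import product
from statistics import mean

# Each inner list holds the candidate effect estimates for one trial (one per
# available report); every combination takes exactly one estimate per trial.
def effect_range(per_trial_estimates):
    pooled = [mean(combo) for combo in product(*per_trial_estimates)]
    return min(pooled), max(pooled)
```

A wide range indicates that the choice of report materially changes the pooled result, which is the kind of clinically important deviation the protocol aims to flag.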

Criteria for selecting studies of gabapentin for neuropathic pain in adults (CRD42015014037)

Background

Peripheral neuropathy occurs when there is damage to the peripheral nervous system, the array of nerves that transmit information from the central nervous system (brain and spinal cord) to other parts of the body. There are hundreds of types of peripheral neuropathy, and the ways that individuals are affected (impaired function and symptoms) depend on the type of damage. Painful, chronic neuropathies can be caused by trauma, systemic diseases (e.g., diabetes, kidney disorders, cancer, hormone imbalances, and vitamin deficiencies), infections, immune disorders, chemotherapy, and other conditions. Between 3 and 10 % of the population may be living with painful neuropathy, and painful neuropathy affects 30 to 50 % of people with diabetes in particular [57]. Painful neuropathy and its sequelae, such as loss of sleep, result in reduced quality of life and high health care costs. For example, the cost of diabetic peripheral neuropathy has been estimated to be US$5 to US$14 billion, which accounts for up to 27 % of the direct medical costs associated with diabetes [58].

Gabapentin (Neurontin®) was approved by the FDA for the treatment of epilepsy in 1993, and it was approved in 2002 for the treatment of post-herpetic neuralgia (i.e., residual pain in people who have had shingles). It is used “off-label” for a variety of symptoms, including neuropathic and other types of pain.

Types of participants

Studies of adults (18 years and older) with neuropathic pain will be included without restriction by setting (e.g., hospital or outpatient) or comorbidity, except that participants requiring ventilator support will be excluded because treatment effects in this population may not be comparable to those in ambulatory populations. Studies including people with neuropathic pain as well as other conditions will be included if disaggregated data (either individual participant data or aggregate data) are available in a report or from the authors, such that outcomes can be extracted separately for people with neuropathic pain.

We will include participants considered to have neuropathic pain secondary to one or more of the following underlying conditions:
  • Cancer (malignancy or chemotherapy induced);
  • Central stroke;
  • Complex regional pain syndrome;
  • Diabetes mellitus;
  • Guillain-Barré syndrome;
  • Herpes zoster infection;
  • HIV infection;
  • Multiple sclerosis;
  • Nerve compression or entrapment, including carpal tunnel syndrome and vertebral disc prolapse;
  • Phantom limb pain;
  • Radicular pain, including radiculopathy associated with spinal stenosis;
  • Spinal cord injury;
  • Trauma; and
  • Trigeminal neuralgia.

We will exclude participants considered to have pain resulting from the following conditions:
  • Chronic low back pain other than radicular pain;
  • Chronic pelvic pain (which is multifactorial in etiology and not solely of neuropathic origin);
  • Fibromyalgia (there is no consensus on whether this is neuropathic pain);
  • Lyme borreliosis;
  • Migraine;
  • Osteoarthritis (pain is considered to be nociceptive in nature);
  • Pre- or post-operative acute pain (e.g., following thoracotomy or spinal fusion) or pain following vaginal delivery;
  • Restless leg syndrome; and
  • Spinal stenosis without radiculopathy.

Types of interventions

We will include trials of oral gabapentin alone, or oral gabapentin in combination with another medication, with a daily dose of 300 mg gabapentin or above. Studies in which the dose of gabapentin was escalated until pain relief was achieved will be eligible. Studies of gabapentin enacarbil, a prodrug, will be excluded.

Outcomes for sequential meta-analysis

Following consultation with patients and clinicians, a review of existing trials, and recommendations from the Initiative on Methods, Measurement, and Pain Assessment in Clinical Trials (IMMPACT) group [22], we selected the following key outcomes, which will be extracted from each report and analyzed in a sequential meta-analysis to compare combined effects across meta-analyses using different data sources. Other outcomes will be extracted but not meta-analyzed, as described below.
  1. Pain intensity: severity of daily pain as measured by any pain scale or instrument. This is often assessed using a scale with values from 0 (“no pain”) to 10 (“worst possible pain”), completed daily. Depending on how pain was measured and reported in the source documents, we will analyze it as a continuous outcome or a categorical outcome.
     (a) Improvement in pain
         i. 50 % improvement: proportion of participants in each group with ≥50 % reduction in mean daily pain intensity, comparing a period of time (e.g., 1 week) prior to treatment with the most recent period of time, or a functionally similar definition.
         ii. 30 % improvement: proportion of participants in each group with ≥30 % reduction in mean daily pain intensity, comparing a period of time (e.g., 1 week) prior to treatment with the most recent period of time, or a functionally similar definition.
     (b) Change in pain intensity: mean change in daily rating for a period of time (e.g., 1 week) prior to treatment compared with the mean daily rating for the most recent week, or a functionally similar definition.
     (c) Patient global impression of change: proportion of participants in each group reporting “very much improved” or “much improved”.
     (d) Clinician global impression of change (CGIC or CGI): proportion of participants “very much improved” or “much improved”.
  2. Pain interference: the extent to which pain prevents normal functioning as measured by any pain scale or instrument (e.g., the Multidimensional Pain Inventory Interference Scale). We will analyze pain interference as a continuous outcome, such as mean change in interference for a period of time (e.g., 1 week) prior to treatment compared with the mean daily rating for the most recent period of time, or a functionally similar definition.
  3. Sleep disturbance: difficulty sleeping as measured by any scale or instrument. This is often assessed using a numeric scale, completed daily, describing how pain interfered with sleep during the last 24 h; scores may range from 0 (“does not interfere with sleep”) to 10 (“completely interferes with sleep”). We will analyze sleep disturbance as a continuous outcome, such as mean change in daily rating for a period of time (e.g., 1 week) prior to treatment compared with the mean daily rating for the most recent period of time, or a functionally similar definition.
  4. Quality of life (QOL): health-related quality of life as measured by any scale or instrument will be analyzed as a continuous outcome (e.g., change in QOL assessed using the mean difference from baseline measured on the Short Form-36). For measures with multiple subscales, we will include the total score in the main analysis if possible.

In addition to those outcomes included in a sequential meta-analysis, spontaneously reported adverse events will be recorded and aggregated using the methods described above.
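The 50 % and 30 % improvement outcomes above are responder proportions. A minimal sketch of the computation (function name and example data are hypothetical, not from the protocol) might look like:

```python
# Illustrative sketch, not part of the protocol: computing the proportion of
# participants whose mean daily pain intensity fell by at least a given
# fraction from the pre-treatment period to the most recent period.

def responder_proportion(baseline, endpoint, threshold):
    """Fraction of participants with a relative reduction >= `threshold`
    (e.g., 0.50 for the 50 % improvement outcome)."""
    responders = sum(
        1 for pre, post in zip(baseline, endpoint)
        if pre > 0 and (pre - post) / pre >= threshold
    )
    return responders / len(baseline)

# Hypothetical mean daily pain (0-10 scale) for one group: the week before
# treatment and the most recent week of treatment.
pre_treatment = [8.0, 6.0, 7.0, 9.0]
post_treatment = [3.5, 4.0, 2.0, 8.5]

print(responder_proportion(pre_treatment, post_treatment, 0.50))  # 0.5
print(responder_proportion(pre_treatment, post_treatment, 0.30))  # 0.75
```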

Outcomes for descriptive analysis

Definitions of the following outcomes, including the elements described above [27, 28], will be extracted from each report. We will use these data to evaluate the completeness of reporting and to identify differences in outcomes among reports.
  1. Mood, including but not limited to:
     (a) Measures of depression (e.g., Beck Depression Inventory)
     (b) Measures of anxiety (e.g., State-Trait Anxiety Inventory)
     (c) Measures of overall mood state (e.g., Profile of Mood States) or psychiatric functioning (e.g., CORE-OM)
  2. Lab tests (e.g., hemoglobin or glucose levels)
  3. Evoked pain (e.g., allodynia or hyperalgesia)
  4. Consumption of concurrent medication for pain (sometimes described as “escape” or “rescue” medication)
  5. Time-to-event data related to any of the domains above

Baseline data to record

To describe the characteristics of participants in each study, we will extract demographic and clinical characteristics from each report. For the total sample, and for each group where possible, we will record the following:
  1. Age
  2. Sex
  3. Weight
  4. Self-reported race/ethnicity (percentage of non-white participants)
  5. Drug and alcohol use
  6. Study location (country and state or county)
  7. Previous response or non-response to medication (e.g., gabapentin, other pain medications)
  8. Pain condition
  9. Duration of pain (i.e., time since onset of the condition)
  10. History of anxiety or depressive disorder

Subgroup analysis and investigation of heterogeneity

We will summarize the characteristics of included studies and describe potential sources of clinical and methodological heterogeneity among them, and their potential influence on results.

We will also conduct the following subgroup analyses:
  1. Sex (men versus women)
  2. Pain condition (post-herpetic pain; diabetic neuropathy; other types of pain)
  3. Daily dose (variable dose studies; fixed doses of 300–900, 901–1800, 1801–2700, and 2701 mg or more)

Searching for published literature

Databases and trial registries will be searched using the terms in Appendix 1.

Criteria for selecting studies of quetiapine for bipolar depression in adults (CRD42015014038)

Background

Bipolar disorder has a lifetime prevalence of 1–4 % [59, 60]. It is characterized by episodes of depression and at least one episode of mania or hypomania [61]. Work, family life, and social life are impaired significantly by depressive episodes [62, 63]. Because of their severity and frequency, depressive episodes account for three times more time spent with disability than manic episodes [61, 64, 65]. People with bipolar disorder are at increased risk of suicide compared with the general population and compared with people who have other mental health problems [66, 67].

Antipsychotics were developed for the treatment of acute psychotic episodes, including bipolar mania, for which there is evidence of short-term efficacy [68]. Quetiapine (Seroquel®) is an antipsychotic derived from dibenzothiazepine. It acts as an antagonist at serotonergic 5-HT2 receptors and dopaminergic D2 receptors in the central nervous system, but the mechanism by which it might act as an antidepressant remains unclear [69, 70]. Existing guidelines currently recommend it as a first-line treatment for acute bipolar depression [71, 72], although it is associated with outcomes that patients find undesirable, including daytime sleepiness, cognitive impairment, loss of libido, and rapid weight gain. Quetiapine and other antipsychotics are also associated with serious adverse events, including cardiac and metabolic effects and extrapyramidal symptoms [73, 74].

Types of participants

Studies that include adults (18 years and older) with a current episode of depression will be included without restriction by setting (e.g., hospital or outpatient). Participants must have been diagnosed with bipolar disorder (type I or II) using DSM-III, DSM-IV, DSM-5, ICD-9, or ICD-10 criteria or an equivalent structured diagnostic interview. Studies that included only participants described as having the “rapid cycling” subtype will be excluded.

Studies including participants with other disorders (e.g., major depressive disorder and other serious mental illnesses such as schizophrenia spectrum disorders or substance abuse) will be included if disaggregated data are available (in a report or from the authors) such that outcomes can be extracted separately for people with bipolar disorder (i.e., either individual participant data or aggregate data).

Studies including only participants with both bipolar disorder and comorbid disorders (e.g., anxiety or substance misuse) will be included.

Types of interventions

We will include studies of oral quetiapine (including extended-release formulations) with a daily dose of 100 mg or above. Studies of norquetiapine, a metabolite of quetiapine, will be excluded.

Outcomes for sequential meta-analysis

Following consultation with patients and clinicians and a review of existing trials, the following key outcomes were selected and will be extracted from each report and analyzed in a sequential meta-analysis to compare combined effects for meta-analyses using different data sources. Other outcomes will be extracted but not meta-analyzed as described below.
Following consultation with patients and clinicians and a review of existing trials, we selected the following key outcomes, which will be extracted from each report and analyzed in a sequential meta-analysis to compare combined effects across meta-analyses using different data sources. Other outcomes will be extracted but not meta-analyzed, as described below.
  1. Depression: symptoms of depression as measured by any scale or instrument. Depending on how depression was measured and reported in the source documents, we will analyze it as a continuous outcome or a categorical outcome.
     (a) Improvement in symptoms (e.g., proportion of participants reporting ≥50 % reduction in depression as measured using the Hamilton Rating Scale for Depression (HAM-D) or Montgomery-Åsberg Depression Rating Scale (MADRS))
     (b) Change in symptoms of depression (e.g., mean difference from baseline on the HAM-D, MADRS, or another depression rating scale)
  2. Functioning: ability to participate in social, occupational, and family life as measured by any scale or instrument will be analyzed as a continuous outcome (e.g., change from baseline on the Global Assessment of Functioning scale).
  3. Quality of life: health-related quality of life as measured by any scale or instrument will be analyzed as a continuous outcome (e.g., change in QOL assessed using the mean difference from baseline measured on the Short Form-36). For measures with multiple subscales, we will include the total score in the main analysis if possible.
  4. Anxiety: symptoms of anxiety as measured by any scale or instrument. Depending on how anxiety was measured and reported in the source documents, we will analyze it as a continuous outcome or a categorical outcome.
     (a) Improvement in symptoms (e.g., proportion of participants reporting ≥50 % reduction in anxiety as measured using the Hamilton Rating Scale for Anxiety (HAM-A))
     (b) Change in symptoms of anxiety (e.g., mean difference from baseline on the HAM-A or another anxiety rating scale)
  5. Sleep
     (a) Proportion of participants using sleep medication
     (b) Change in sleep (e.g., mean difference from baseline on the Pittsburgh Sleep Quality Index, HAM-D insomnia items, or another measure of sleep)
  6. Weight gain (we will combine measures of weight and body mass index because the height of adults is not expected to change during clinical trials)
     (a) Measured on a continuous scale, e.g., mean change from baseline or a value-at-a-time
     (b) Measured categorically, such as the proportion of participants gaining 2 or 5 % of their baseline weight
  7. Psychiatric hospitalization (e.g., proportion of participants admitted to hospital)
  8. Suicide
     (a) Proportion of participants completing suicide
     (b) Proportion of participants attempting suicide
     (c) Proportion of participants with suicidal intent or suicidal ideation

In addition to those outcomes included in a sequential meta-analysis, spontaneously reported adverse events will be recorded and aggregated using the methods described above.
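The categorical weight-gain outcome above (proportion of participants gaining 2 or 5 % of their baseline weight) can be sketched as follows; the function name and the example weights are hypothetical, not from the protocol:

```python
# Illustrative sketch, not part of the protocol: proportion of participants
# whose weight increased by at least a given fraction of baseline weight.

def weight_gain_proportion(baseline_kg, endpoint_kg, fraction):
    """Fraction of participants gaining >= `fraction` of baseline weight
    (e.g., 0.02 for the 2 % threshold, 0.05 for the 5 % threshold)."""
    gained = sum(
        1 for pre, post in zip(baseline_kg, endpoint_kg)
        if (post - pre) / pre >= fraction
    )
    return gained / len(baseline_kg)

# Hypothetical baseline and end-of-trial weights for one group.
baseline = [70.0, 80.0, 90.0, 60.0]
endpoint = [73.8, 80.5, 94.0, 61.0]

print(weight_gain_proportion(baseline, endpoint, 0.02))  # 0.5
print(weight_gain_proportion(baseline, endpoint, 0.05))  # 0.25
```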

Outcomes for descriptive analysis

This review focuses on the use of quetiapine for acute episodes, so adverse events associated with long-term use might not be observed. We do not expect to find evidence about outcomes, such as cataracts, that develop over periods longer than the duration of an acute depressive episode. In acute treatment trials, blood tests and vital signs may be monitored to identify increased risk of adverse events, including serious adverse events.

Definitions of the following outcomes, including the elements described above [27, 28], will be extracted from each report. We will use these data to evaluate the completeness of reporting and to identify differences in outcomes among reports.
  1. Metabolic effects
     (a) Proportion of participants with a new diagnosis of diabetes mellitus type II
     (b) Waist circumference
         i. Measured on a continuous scale, e.g., mean change from baseline or a value-at-a-time
         ii. Measured categorically, such as the proportion of participants with an increase ≥5 cm
     (c) Fasting glucose
         i. Measured on a continuous scale, e.g., mean change from baseline or a value-at-a-time
         ii. Measured categorically, e.g., proportion of participants with fasting glucose ≥100 mg/dL (hyperglycemia) and proportion of participants with fasting glucose ≥126 mg/dL
  2. Endocrine effects
     (a) Proportion of participants reporting loss of interest in sex or sexual dysfunction (e.g., impotence, anorgasmia)
     (b) Proportion of women reporting irregular or missed menstrual cycles
     (c) Proportion of men reporting gynecomastia (swelling of the breast tissue)
     (d) Serum prolactin levels measured on a continuous scale, e.g., mean change from baseline or a value-at-a-time
  3. Extrapyramidal symptoms (tardive dyskinesia, dystonia, akathisia)
     (a) Incident cases: proportion of participants experiencing extrapyramidal symptoms (e.g., spasms, tremor, or involuntary movement) who did not have symptoms at baseline, for example, the proportion of participants exceeding a total score of 3 on the Modified Simpson-Angus Scale (MSAS) [75]
     (b) Measured on a continuous scale, e.g., mean change from baseline or a value-at-a-time on the Involuntary Movement Scale, Condensed User’s Scale, Simpson-Angus Scale, or Barnes Akathisia Rating Scale
  4. Cardiovascular effects
     (a) Mortality due to a cardiac event
     (b) Incidence of myocardial infarction
     (c) QTc interval prolongation
         i. Measured on a continuous scale, e.g., mean change from baseline or a value-at-a-time
         ii. Measured categorically, e.g., proportion of participants with a prolonged QTc interval; where possible, we will define this as ≥450 ms in men and ≥460 ms in women [76]
     (d) Blood pressure
         i. Mean change from baseline or a value-at-a-time
         ii. Incidence of orthostatic hypotension (e.g., decrease in systolic BP ≥20 mm Hg or decrease in diastolic BP ≥10 mm Hg within 3 min after standing from a sitting/supine position)
  5. Lipids (cholesterol and triglycerides)
     (a) Triglycerides measured continuously, e.g., mean change from baseline or a value-at-a-time
     (b) Triglycerides measured categorically, e.g., proportion of participants with ≥150 mg/dL and proportion of participants with ≥200 mg/dL
     (c) Total cholesterol measured continuously, e.g., mean change from baseline or a value-at-a-time
     (d) Total cholesterol measured categorically, e.g., proportion of participants with ≥200 mg/dL
  6. Immune effects
     (a) Incidence of infection
     (b) Incidence of low neutrophil count, e.g., absolute neutrophil count (ANC) <100/ml
  7. Mania: symptoms of mania and hypomania as measured by any scale or instrument. Depending on how mania was measured and reported in the source documents, we will analyze it as a continuous outcome or a categorical outcome.
     (a) Worsening of symptoms, e.g., proportion of participants scoring ≥16 or ≥20 on the Young Mania Rating Scale (YMRS) [77, 78]
     (b) Change in symptoms of mania, e.g., mean difference from baseline on the YMRS or another mania rating scale
  8. Neuroleptic malignant syndrome (an adverse reaction characterized by fever, muscle rigidity, and cognitive and autonomic abnormalities)
  9. Time-to-event data related to any of the domains above
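Two of the categorical definitions above reduce to simple threshold rules; a minimal sketch (function names are ours, the thresholds come from the definitions above) could look like:

```python
# Illustrative sketch, not part of the protocol: sex-specific QTc prolongation
# (>=450 ms in men, >=460 ms in women [76]) and orthostatic hypotension
# (systolic drop >=20 mm Hg or diastolic drop >=10 mm Hg after standing).

def qtc_prolonged(qtc_ms, sex):
    """Return True if the QTc interval meets the sex-specific threshold."""
    threshold_ms = 450 if sex == "male" else 460
    return qtc_ms >= threshold_ms

def orthostatic_hypotension(systolic_drop_mm_hg, diastolic_drop_mm_hg):
    """Return True if the postural blood-pressure drop meets either criterion."""
    return systolic_drop_mm_hg >= 20 or diastolic_drop_mm_hg >= 10

print(qtc_prolonged(455, "male"))      # True: meets the 450 ms threshold for men
print(qtc_prolonged(455, "female"))    # False: below the 460 ms threshold for women
print(orthostatic_hypotension(25, 5))  # True: systolic criterion met
```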

Baseline data to record

To describe the characteristics of participants in each study, we will extract demographic and clinical characteristics from each report. For the total sample, and for each group where possible, we will record the following:
  1. Age
  2. Sex
  3. Weight
  4. Self-reported race/ethnicity (percentage of non-white participants)
  5. Drug and alcohol use
  6. Study location (country and state or county)
  7. Previous response or non-response to medication (e.g., quetiapine, other antipsychotics)
  8. Concurrent psychiatric medication use (which will be described individually and in classes, such as antipsychotics, anticonvulsants, selective serotonin reuptake inhibitors, tricyclics, MAOIs, and lithium)
  9. Duration of current episode (i.e., time since onset)
  10. Number of mood episodes in the last year
  11. Comorbid psychiatric conditions (e.g., proportion of people with an anxiety disorder, substance use disorder, personality disorder, or other mood disorder)

Subgroup analysis and investigation of heterogeneity

We will summarize the characteristics of included studies and describe potential sources of clinical and methodological heterogeneity between them and their potential influence on results.

We will conduct the following subgroup analyses:
  1. Sex (men versus women)
  2. Type of bipolar disorder (type I compared with type II)
  3. Use of other concurrent medication (e.g., quetiapine as adjunct compared with quetiapine monotherapy)
  4. Daily dose (variable dose studies; fixed doses of 100–300, 301–600, 601–900, and over 900 mg)

Searching for published literature

Databases and trial registries will be searched using the terms in Appendix 2.

Discussion

This study is now underway. We have identified outcomes for both reviews, conducted literature searches, and begun data extraction.

Protocol amendments

If this protocol is amended, we will record a description and the date of each amendment, and we will describe these changes in the final report.

Abbreviations

ANC: absolute neutrophil count
CENTRAL: Cochrane Central Register of Controlled Trials
CGIC or CGI: clinician global impression of change
COSTART: Coding Symbols for Thesaurus of Adverse Reaction Terms
CSRs: clinical study reports
DIDA: Drug Industry Document Archive
EMA: European Medicines Agency
FDA: Food and Drug Administration
HAM-D: Hamilton Rating Scale for Depression
HIPAA: Health Insurance Portability and Accountability Act
HLGT: high-level group term
HLT: high-level term
ICTRP: International Clinical Trials Registry Platform Search Portal
IMMPACT: Initiative on Methods, Measurement, and Pain Assessment in Clinical Trials
IOM: Institute of Medicine
IPD: individual participant data
LLT: lowest level term
MADRS: Montgomery-Åsberg Depression Rating Scale
MedDRA: Medical Dictionary for Regulatory Activities
MHRA: Medicines and Healthcare products Regulatory Agency
MICE: multiple imputation by chained equations
MSAS: Modified Simpson-Angus Scale
PCORI: Patient-Centered Outcomes Research Institute
PMDA: Pharmaceuticals and Medical Devices Agency
PT: preferred term
QOL: quality of life
SOC: system organ class
SRDR: Systematic Review Data Repository
TGA: Therapeutic Goods Administration
WMD: weighted mean difference
YMRS: Young Mania Rating Scale
YODA: Yale University Open Data Access

Declarations

Sources of support

This study has been funded since February 2014 by the Patient-Centered Outcomes Research Institute (PCORI; ME-1303-5785) (Kay Dickersin, Principal Investigator). PCORI was not involved in the design of the protocol and will not contribute to the interpretation of results.

Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Authors’ Affiliations

(1) Center for Clinical Trials and Evidence Synthesis, Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health
(2) Department of Gastroenterology and Hepatology, Johns Hopkins School of Medicine
(3) Department of Neurology, Johns Hopkins School of Medicine
(4) Laboratory for Computational Sensing and Robotics, Johns Hopkins Whiting School of Engineering
(5) Department of Health Policy and Management, Johns Hopkins Bloomberg School of Public Health
(6) Behavioral Biology, Center for Mind-Body Research, Johns Hopkins School of Medicine
(7) Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine
(8) The TMJ Association, Ltd.
(9) Peabody Institute, Johns Hopkins University
(10) Welch Medical Library, Johns Hopkins School of Medicine
(11) Department of Mental Health, Johns Hopkins Bloomberg School of Public Health
(12) University of Maryland School of Pharmacy
(13) Center for Public Health and Human Rights, Johns Hopkins School of Medicine

References

  1. Scherer RW, Langenberg P, von Elm E. Full publication of results initially presented in abstracts. Cochrane Database Syst Rev. 2007(2):MR000005. doi:10.1002/14651858.MR000005.pub3.
  2. Song F, Parekh S, Hooper L, Loke YK, Ryder J, Sutton AJ, et al. Dissemination and publication of research findings: an updated review of related biases. Health Technol Assess. 2010;14(8):1–193. doi:10.3310/hta14080. iii, ix-xi.View ArticleGoogle Scholar
  3. Goodman S, Dickersin K. Metabias: a challenge for comparative effectiveness research. Ann Intern Med. 2011;155(1):61–2. doi:10.7326/0003-4819-155-1-201107050-00010.PubMedView ArticleGoogle Scholar
  4. Vedula SS, Li T, Dickersin K. Differences in reporting of analyses in internal company documents versus published trial reports: comparisons in industry-sponsored trials in off-label uses of gabapentin. PLoS Med. 2013;10(1):e1001378. doi:10.1371/journal.pmed.1001378.PubMedPubMed CentralView ArticleGoogle Scholar
  5. Chan AW, Song F, Vickers A, Jefferson T, Dickersin K, Gotzsche PC, et al. Increasing value and reducing waste: addressing inaccessible research. Lancet. 2014;383(9913):257–66. doi:10.1016/S0140-6736(13)62296-5.PubMedPubMed CentralView ArticleGoogle Scholar
  6. Lefebvre C, Glanville J, Wieland LS, Coles B, Weightman AL. Methodological developments in searching for studies for systematic reviews: past, present and future? Syst Rev. 2013;2:78. doi:10.1186/2046-4053-2-78.PubMedPubMed CentralView ArticleGoogle Scholar
  7. Egger M, Juni P, Bartlett C, Holenstein F, Sterne J. How important are comprehensive literature searches and the assessment of trial quality in systematic reviews? Empirical study. Health Technol Assess. 2003;7(1):1–76.PubMedGoogle Scholar
  8. Scherer RW, Huynh L, Ervin AM, Taylor J, Dickersin K. ClinicalTrials.gov registration can supplement information in abstracts for systematic reviews: a comparison study. BMC Med Res Methodol. 2013;13:79. doi:10.1186/1471-2288-13-79.PubMedPubMed CentralView ArticleGoogle Scholar
  9. Turner EH, Matthews AM, Linardatos E, Tell RA, Rosenthal R. Selective publication of antidepressant trials and its influence on apparent efficacy. N Engl J Med. 2008;358(3):252–60. doi:10.1056/NEJMsa065779.PubMedView ArticleGoogle Scholar
  10. Jefferson T, Jones MA, Doshi P, Del Mar CB, Hama R, Thompson MJ, et al. Neuraminidase inhibitors for preventing and treating influenza in healthy adults and children. Cochrane Database Syst Rev. 2014;4:CD008965. doi:10.1002/14651858.CD008965.pub4.PubMedGoogle Scholar
  11. Fu R, Selph S, McDonagh M, Peterson K, Tiwari A, Chou R, et al. Effectiveness and harms of recombinant human bone morphogenetic protein-2 in spine fusion: a systematic review and meta-analysis. Ann Intern Med. 2013;158(12):890–902. doi:10.7326/0003-4819-158-12-201306180-00006.PubMedView ArticleGoogle Scholar
  12. Simmonds MC, Brown JV, Heirs MK, Higgins JP, Mannion RJ, Rodgers MA, et al. Safety and effectiveness of recombinant human bone morphogenetic protein-2 for spinal fusion: a meta-analysis of individual-participant data. Ann Intern Med. 2013;158(12):877–89. doi:10.7326/0003-4819-158-12-201306180-00005.PubMedView ArticleGoogle Scholar
  13. Stewart LA, Parmar MK. Meta-analysis of the literature or of individual patient data: is there a difference? Lancet. 1993;341(8842):418–22.PubMedView ArticleGoogle Scholar
  14. Riley RD, Lambert PC, Abo-Zaid G. Meta-analysis of individual participant data: rationale, conduct, and reporting. BMJ. 2010;340:c221. doi:10.1136/bmj.c221.PubMedView ArticleGoogle Scholar
  15. Rodgers MA, Brown JV, Heirs MK, Higgins JP, Mannion RJ, Simmonds MC, et al. Reporting of industry funded study outcome data: comparison of confidential and published data on the safety and effectiveness of rhBMP-2 for spinal fusion. BMJ. 2013;346:f3981. doi:10.1136/bmj.f3981.PubMedPubMed CentralView ArticleGoogle Scholar
  16. Institute of Medicine. Finding what works in health care: standards for systematic reviews. Washington, DC: The National Academies Press; 2011.Google Scholar
  17. Gabriel SE, Normand SL. Getting the methods right--the foundation of patient-centered outcomes research. N Engl J Med. 2012;367(9):787–90. doi:10.1056/NEJMp1207437.PubMedView ArticleGoogle Scholar
  18. Vedula SS, Bero L, Scherer RW, Dickersin K. Outcome reporting in industry-sponsored trials of gabapentin for off-label use. N Engl J Med. 2009;361(20):1963–71. doi:10.1056/NEJMsa0906126.PubMedView ArticleGoogle Scholar
  19. Turner EH, Knoepflmacher D, Shapley L. Publication bias in antipsychotic trials: an analysis of efficacy comparing the published literature to the US Food and Drug Administration database. PLoS Med. 2012;9(3):e1001189. doi:10.1371/journal.pmed.1001189.PubMedPubMed CentralView ArticleGoogle Scholar
  20. Elbourne DR, Altman DG, Higgins JP, Curtin F, Worthington HV, Vail A. Meta-analyses involving cross-over trials: methodological issues. Int J Epidemiol. 2002;31(1):140–9.PubMedView ArticleGoogle Scholar
  21. Li T, Yu T, Hawkins BS, Dickersin K. Design, analysis, and reporting of crossover trials for inclusion in a meta-analysis. PLoS One. 2015;10(8):e0133023. doi:10.1371/journal.pone.0133023.PubMedPubMed CentralView ArticleGoogle Scholar
  22. Turk DC, Dworkin RH, Allen RR, Bellamy N, Brandenburg N, Carr DB, et al. Core outcome domains for chronic pain clinical trials: IMMPACT recommendations. Pain. 2003;106(3):337–45.PubMedView ArticleGoogle Scholar
  23. Moore RA, Wiffen PJ, Derry S, Toelle T, Rice AS. Gabapentin for chronic neuropathic pain and fibromyalgia in adults. Cochrane Database Syst Rev. 2014;4:CD007938. doi:10.1002/14651858.CD007938.pub3.PubMedGoogle Scholar
  24. Moher D, Liberati A, Tetzlaff J, Altman DG. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. PLoS Med. 2009;6(7):e1000097. doi:10.1371/journal.pmed.1000097.PubMedPubMed CentralView ArticleGoogle Scholar
  25. Glanville J, Duffy S, McCool R, Varley D. Searching ClinicalTrials.gov and the International Clinical Trials Registry Platform to inform systematic reviews: what are the optimal search approaches? J Med Lib Assoc. 2014;102(3):177–83.View ArticleGoogle Scholar
  26. Turner EH. How to access and process FDA drug approval packages for use in research. BMJ. 2013;347:f5992. doi:10.1136/bmj.f5992.PubMedView ArticleGoogle Scholar
  27. Saldanha IJ, Dickersin K, Wang X, Li T. Outcomes in Cochrane systematic reviews addressing four common eye conditions: an evaluation of completeness and comparability. 2014.Google Scholar
  28. Zarin DA, Tse T, Williams RJ, Califf RM, Ide NC. The ClinicalTrials.gov results database—update and key issues. N Engl J Med. 2011;364(9):852–60. doi:10.1056/NEJMsa1012065.PubMedPubMed CentralView ArticleGoogle Scholar
  29. Loke YK, Price D, Herxheimer A. Systematic reviews of adverse effects: framework for a structured approach. BMC Med Res Methodol. 2007;7:32. doi:10.1186/1471-2288-7-32.PubMedPubMed CentralView ArticleGoogle Scholar
  30. FDA. COSTART: coding symbols for thesaurus of adverse reaction terms. Fourth Edition: Office of Management and Operations 1993.Google Scholar
  31. ICH. Introductory Guide MedDRA Version 17.0: International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use (ICH) 2014.Google Scholar
  32. Schroll JB, Maund E, Gotzsche PC. Challenges in coding adverse events in clinical trials: a systematic review. PLoS One. 2012;7(7):e41174. doi:10.1371/journal.pone.0041174.PubMedPubMed CentralView ArticleGoogle Scholar
  33. Software OCR. ABBYY FineReader version 12. 2013.Google Scholar
  34. Higgins JP, Green S. Cochrane Handbook for Systematic Reviews of Interventions. Version 5.1.0 [updated March 2011]. The Cochrane Collaboration; 2011.
  35. Little RJ, D'Agostino R, Cohen ML, Dickersin K, Emerson SS, Farrar JT, et al. The prevention and treatment of missing data in clinical trials. N Engl J Med. 2012;367(14):1355–60. doi:10.1056/NEJMsr1203730.
  36. Mavridis D, Chaimani A, Efthimiou O, Leucht S, Salanti G. Addressing missing outcome data in meta-analysis. Evid Based Ment Health. 2014;17(3):85–9. doi:10.1136/eb-2014-101900.
  37. Mavridis D, White IR, Higgins JP, Cipriani A, Salanti G. Allowing for uncertainty due to missing continuous outcome data in pairwise and network meta-analysis. Stat Med. 2014;34(5):721–41. doi:10.1002/sim.6365.
  38. Little RJA. Pattern-mixture models for multivariate incomplete data. J Am Stat Assoc. 1993;88:125–34.
  39. White IR, Welton NJ, Wood AM, Ades AE, Higgins JP. Allowing for uncertainty due to missing data in meta-analysis—part 2: hierarchical models. Stat Med. 2008;27(5):728–45. doi:10.1002/sim.3007.
  40. Li T, Hutfless S, Scharfstein D, Daniels M, Hogan J, Little R, et al. Standards should be applied in the prevention and handling of missing data for patient-centered outcomes research: a systematic review and expert consensus. J Clin Epidemiol. 2014;67(1):15–32.
  41. PCORI. The PCORI Methodology Report. Appendix A: methodology standards. 2013.
  42. Little RJA, Rubin DB. Statistical analysis with missing data. 2nd ed. Hoboken: Wiley; 2002.
  43. StataCorp. Stata: Release 13. College Station, TX: StataCorp LP; 2013.
  44. Horton NJ, Lipsitz SR. Multiple imputation in practice: comparison of software packages for regression models with missing variables. Am Stat. 2001;55:244–54.
  45. Graham JW, Olchowski AE, Gilreath TD. How many imputations are really needed? Some practical clarifications of multiple imputation theory. Prev Sci. 2007;8:206–13.
  46. Moons KG, Donders RA, Stijnen T, Harrell Jr FE. Using the outcome for imputation of missing predictor values was preferred. J Clin Epidemiol. 2006;59(10):1092–101. doi:10.1016/j.jclinepi.2006.01.009.
  47. Sterne JA, White IR, Carlin JB, Spratt M, Royston P, Kenward MG, et al. Multiple imputation for missing data in epidemiological and clinical research: potential and pitfalls. BMJ. 2009;338:b2393. doi:10.1136/bmj.b2393.
  48. White IR, Royston P, Wood AM. Multiple imputation using chained equations: issues and guidance for practice. Stat Med. 2011;30(4):377–99. doi:10.1002/sim.4067.
  49. Higgins JP, Thompson SG, Deeks JJ, Altman DG. Measuring inconsistency in meta-analyses. BMJ. 2003;327(7414):557–60. doi:10.1136/bmj.327.7414.557.
  50. Riley RD, Simmonds MC, Look MP. Evidence synthesis combining individual patient data and aggregate data: a systematic review identified current practice and possible methods. J Clin Epidemiol. 2007;60(5):431–9. doi:10.1016/j.jclinepi.2006.09.009.
  51. Riley RD, Lambert PC, Staessen JA, Wang J, Gueyffier F, Thijs L, et al. Meta-analysis of continuous outcomes combining individual patient data and aggregate data. Stat Med. 2008;27(11):1870–93.
  52. Stewart LA, Clarke MJ. Practical methodology of meta-analyses (overviews) using updated individual patient data. Cochrane Working Group. Stat Med. 1995;14(19):2057–79.
  53. Higgins JP, Whitehead A, Turner RM, Omar RZ, Thompson SG. Meta-analysis of continuous outcome data from individual patients. Stat Med. 2001;20(15):2219–41. doi:10.1002/sim.918.
  54. Turner RM, Omar RZ, Yang M, Goldstein H, Thompson SG. A multilevel model framework for meta-analysis of clinical trials with binary outcomes. Stat Med. 2000;19(24):3417–32.
  55. DerSimonian R, Laird N. Meta-analysis in clinical trials. Control Clin Trials. 1986;7(3):177–88.
  56. Thompson SG, Higgins JP. Treating individuals 4: can meta-analysis help target interventions at individuals most likely to benefit? Lancet. 2005;365(9456):341–6. doi:10.1016/S0140-6736(05)17790-3.
  57. Yawn BP, Wollan PC, Weingarten TN, Watson JC, Hooten WM, Melton 3rd LJ. The prevalence of neuropathic pain: clinical evaluation compared with screening tools in a community population. Pain Med. 2009;10(3):586–93. doi:10.1111/j.1526-4637.2009.00588.x.
  58. Gordois A, Scuffham P, Shearer A, Oglesby A, Tobian JA. The health care costs of diabetic peripheral neuropathy in the US. Diabetes Care. 2003;26(6):1790–5.
  59. Kessler RC, Chiu WT, Demler O, Merikangas KR, Walters EE. Prevalence, severity, and comorbidity of 12-month DSM-IV disorders in the National Comorbidity Survey Replication. Arch Gen Psychiatry. 2005;62(6):617–27. doi:10.1001/archpsyc.62.6.617.
  60. Merikangas KR, Jin R, He JP, Kessler RC, Lee S, Sampson NA, et al. Prevalence and correlates of bipolar spectrum disorder in the world mental health survey initiative. Arch Gen Psychiatry. 2011;68(3):241–51. doi:10.1001/archgenpsychiatry.2011.12.
  61. Harris RP, Helfand M, Woolf SH, Lohr KN, Mulrow CD, Teutsch SM, et al. Current methods of the US Preventive Services Task Force: a review of the process. Am J Prev Med. 2001;20(3 Suppl):21–35.
  62. Calabrese JR, Hirschfeld RM, Frye MA, Reed ML. Impact of depressive symptoms compared with manic symptoms in bipolar disorder: results of a U.S. community-based sample. J Clin Psychiatry. 2004;65(11):1499–504.
  63. Hirschfeld RM, Vornik LA. Recognition and diagnosis of bipolar disorder. J Clin Psychiatry. 2004;65 Suppl 15:5–9.
  64. Judd LL, Akiskal HS, Schettler PJ, Coryell W, Endicott J, Maser JD, et al. A prospective investigation of the natural history of the long-term weekly symptomatic status of bipolar II disorder. Arch Gen Psychiatry. 2003;60(3):261–9.
  65. Judd LL, Akiskal HS, Schettler PJ, Endicott J, Maser J, Solomon DA, et al. The long-term natural history of the weekly symptomatic status of bipolar I disorder. Arch Gen Psychiatry. 2002;59(6):530–7.
  66. Ilgen MA, Bohnert AS, Ignacio RV, McCarthy JF, Valenstein MM, Kim HM, et al. Psychiatric diagnoses and risk of suicide in veterans. Arch Gen Psychiatry. 2010;67(11):1152–8. doi:10.1001/archgenpsychiatry.2010.129.
  67. Holma KM, Haukka J, Suominen K, Valtonen HM, Mantere O, Melartin TK, et al. Differences in incidence of suicide attempts between bipolar I and II disorders and major depressive disorder. Bipolar Disord. 2014. doi:10.1111/bdi.12195.
  68. Cipriani A, Barbui C, Salanti G, Rendell J, Brown R, Stockton S, et al. Comparative efficacy and acceptability of antimanic drugs in acute mania: a multiple-treatments meta-analysis. Lancet. 2011;378(9799):1306–15. doi:10.1016/S0140-6736(11)60873-8.
  69. Jensen NH, Rodriguiz RM, Caron MG, Wetsel WC, Rothman RB, Roth BL. N-desalkylquetiapine, a potent norepinephrine reuptake inhibitor and partial 5-HT1A agonist, as a putative mediator of quetiapine's antidepressant activity. Neuropsychopharmacology. 2008;33(10):2303–12. doi:10.1038/sj.npp.1301646.
  70. Sanford M, Keating G. Quetiapine: a review of its use in the management of bipolar depression. CNS Drugs. 2012;26(5):435–60.
  71. Nivoli AM, Colom F, Murru A, Pacchiarotti I, Castro-Loli P, Gonzalez-Pinto A, et al. New treatment guidelines for acute bipolar depression: a systematic review. J Affect Disord. 2011;129(1–3):14–26. doi:10.1016/j.jad.2010.05.018.
  72. NICE. Bipolar disorder: the management of bipolar disorder in adults, children and adolescents, in primary and secondary care. London: National Institute for Health and Care Excellence; 2014.
  73. Newcomer JW. Second-generation (atypical) antipsychotics and metabolic effects: a comprehensive literature review. CNS Drugs. 2005;19 Suppl 1:1–93.
  74. Ray WA, Chung CP, Murray KT, Hall K, Stein CM. Atypical antipsychotic drugs and the risk of sudden cardiac death. N Engl J Med. 2009;360(3):225–35. doi:10.1056/NEJMoa0806994.
  75. Hawley CJ, Fineberg N, Roberts AG, Baldwin D, Sahadevan A, Sharman V. The use of the Simpson Angus Scale for the assessment of movement disorder: a training guide. Int J Psychiatry Clin Pract. 2003;7:249–57.
  76. Rautaharju PM, Surawicz B. AHA/ACCF/HRS recommendations for the standardization and interpretation of the electrocardiogram: Part IV. Circulation. 2009;119:e241–e50.
  77. Young RC, Biggs JT, Ziegler VE, Meyer DA. A rating scale for mania: reliability, validity and sensitivity. Br J Psychiatry. 1978;133:429–35.
  78. Lukasiewicz M, Gerard S, Besnard A, Falissard B, Perrin E, Sapin H, et al. Young Mania Rating Scale: how to interpret the numbers? Determination of a severity threshold and of the minimal clinically significant difference in the EMBLEM cohort. Int J Methods Psychiatr Res. 2013;22(1):46–58. doi:10.1002/mpr.1379.

Copyright

© Mayo-Wilson et al. 2015