Open Access

A surveillance system to assess the need for updating systematic reviews

  • Nadera Ahmadzai1Email author,
  • Sydne J Newberry2,
  • Margaret A Maglione2,
  • Alexander Tsertsvadze1,
  • Mohammed T Ansari1,
  • Susanne Hempel2,
  • Aneesa Motala2,
  • Sophia Tsouros1,
  • Jennifer J Schneider Chafen3,
  • Roberta Shanman2,
  • David Moher1 and
  • Paul G Shekelle2, 4
Systematic Reviews20132:104

DOI: 10.1186/2046-4053-2-104

Received: 24 June 2013

Accepted: 28 October 2013

Published: 14 November 2013

Abstract

Background

Systematic reviews (SRs) can become outdated as new evidence emerges over time. Organizations that produce SRs need a surveillance method to determine when reviews are likely to require updating. This report describes the development and initial results of a surveillance system to assess SRs produced by the Agency for Healthcare Research and Quality (AHRQ) Evidence-based Practice Center (EPC) Program.

Methods

Twenty-four SRs were assessed using existing methods that incorporate limited literature searches, expert opinion, and quantitative methods for the presence of signals triggering the need for updating. The system was designed to begin surveillance six months after the release of the original review, and thenceforth every six months for any review not classified as being a high priority for updating. The outcome of each round of surveillance was a classification of the SR as being low, medium or high priority for updating.

Results

Twenty-four SRs underwent surveillance at least once, and ten underwent surveillance a second time during the 18 months of the program. Two SRs were classified as high, five as medium, and 17 as low priority for updating. The time lapse between the searches conducted for the original reports and the updated searches (search time lapse - STL) ranged from 11 months to 62 months: The STL for the high priority reports were 29 months and 54 months; those for medium priority reports ranged from 19 to 62 months; and those for low priority reports ranged from 11 to 33 months. Neither the STL nor the number of new relevant articles was perfectly associated with a signal for updating. Challenges of implementing the surveillance system included determining what constituted the actual conclusions of an SR that required assessing; and sometimes poor response rates of experts.

Conclusion

In this system of regular surveillance of 24 systematic reviews on a variety of clinical interventions produced by a leading organization, about 70% of reviews were determined to have a low priority for updating. Evidence suggests that the time period for surveillance is yearly rather than the six months used in this project.

Keywords

Systematic review Updating Surveillance

Background

Systematic reviews (SRs) on the effectiveness and safety of various health interventions are the basis for clinical practice guidelines, public and corporate policy, and clinical and consumer decision-making. These SRs provide systematically searched, collected, evaluated, and synthesized scientific evidence to objectively compare the effectiveness, benefits, and safety of different health interventions. The production of SRs is based on standardized, structured, and explicit methodological guidance. The SRs endeavor to focus on patient-relevant outcomes (for example, mortality, pain, quality of life, functional status, myocardial infarction) in addition to relevant intermediate surrogate outcome measures (for example, cholesterol levels, serum glucose levels, red blood cell count) [1].

Systematic reviews may be conducted by independent groups of researchers or by researchers associated with large organizations such as the Cochrane Collaboration; the United States Agency for Healthcare Research and Quality (AHRQ), which administers a group of Evidence-based Practice Centers (EPC) throughout North America; and the National Institute for Health and Clinical Excellence (NICE) in the UK [2]. A primary responsibility of these organizations is the conduct of systematic reviews, the results of which are often posted on their websites.

The inevitable - and rapid - accumulation of new research findings has raised concern among these organizations about how best to identify which reviews may be out of date and whether to sponsor an update or simply remove the outdated review from their websites. To date, organizations and initiatives (for example, Cochrane Collaboration, Drug Effectiveness Review Project (DERP)) have relied on time-based (for example, annual, biennial) periodic updating policies that have proven to be problematic in terms of feasibility and efficiency [35]. However several lines of evidence demonstrate that reviews become obsolete at different rates, suggesting that a system of regular surveillance might be a more effective way of identifying potentially out-of-date reviews. In 2006, the DERP implemented a strategy for assessing the need for updating systematic reviews of comparative effectiveness and safety of drug interventions evaluated in controlled clinical trials [6]. The DERP’s stakeholders need to make coverage decisions for new drugs, and therefore the appearance of a new drug is a strong signal for an update. However, not all SR users (for example guideline developers) might consider a new drug within an established class (such as a new statin or angiotensin receptor antagonist) as an indication of the need for an update. Furthermore, SRs may deal with non-pharmacologic interventions (for example, diagnostic screening) and include observational studies.

AHRQ supported a pilot study comparing different methods to assess signals for the need to update SRs and another study to assess an initial set of SRs that were considered Comparative Effectiveness Reviews (CERs) for the need to update. CERs are systematic reviews that aim to compare the benefit and harms of a range of options rather than only answering a narrow question on safety and effectiveness of a single therapy [2]. Based on these pilot studies, AHRQ supported the development of a surveillance system for regularly monitoring AHRQ’s portfolio of SRs. This article presents the results of the surveillance system covering June 2011 to November 2012.

Methods

The surveillance system - summary overview

Two EPCs (RAND, University of Ottawa) participated in the development of the surveillance system; a third EPC (ECRI) assisted in obtaining safety alerts). The RAND and Ottawa EPCs had independently developed methods to assess SRs for the need to update [7, 8]; a formal comparison of the two showed they produced similar results [9]. In developing and implementing the surveillance system, we operationalized a proposal made in our earlier CER surveillance report for what such a system would look like (see Figure 1). This article describes the surveillance assessment of 24 consecutive SRs conducted for the AHRQ Effective Health Care’s Comparative Effectiveness Review program [1033].
https://static-content.springer.com/image/art%3A10.1186%2F2046-4053-2-104/MediaObjects/13643_2013_Article_166_Fig1_HTML.jpg
Figure 1

The process of surveillance assessment for a systematic review (SR). Figure 1 portrays the overall process of surveillance assessment for an SR that mainly includes: 1) literature search, 2) contacting experts, and 3) obtaining safety alerts from various sources sent by ECRI (one of the AHRQ evidence-based centers). The number of hits identified by literature search would be transferred to Reference Manager database and then will be screened by: 1) title and abstract, and 2) full text. The data was extracted from the number of studies that were deemed eligible for inclusion. Next, the extracted data was assessed for identifying qualitative and quantitative signals. Then, the findings from literature, expert opinion, and safety alerts were collated and assessed for updating priority status (high, medium or low). If an SR was deemed as ‘high’ priority for assessment, it was referred to AHRQ for updating. If an SR was deemed as ‘medium’ or ‘high’ priority for updating, it was re-assessed six months after the completion of the first assessment.

The surveillance system was designed to conduct an assessment of a SR six months after its release and every six months thereafter until the assessment identified signals sufficient to classify it as ‘high priority’ for an update. Briefly, six months after release of a SR, we conduct abbreviated literature searches, using the strategy employed in the original SR, but limited to five general medicine journals and approximately five specialty journals specific to the topic of the SR and, with a few exceptions. Newly identified evidence relevant to the key questions and the original conclusions was abstracted, and pre-specified criteria were used to detect the presence of qualitative and/or quantitative signals for updating (EPC SRs are organized around a set of key questions, each of which might have multiple parts, resulting in the need for multiple conclusions) [7]. The method also incorporates expert opinion regarding the validity or currency of conclusions reached in the SR and government safety alerts relevant to the SR [8]. Based on a combination of the weight of the evidence, signals, and expert opinion, a determination was made regarding the need to update each conclusion for each key question, with the expectation that a change in conclusion may yield a change in clinical practice. That is, each key question (KQ)-specific conclusion within a SR was categorized as up-to-date, possibly out-of-date, probably out-of-date, or out-of-date [8]. Finally, based on: 1) the proportion of key questions whose conclusions were determined to require updating or the urgency to update a particular set of conclusions, and 2) the extent of outdatedness, a global assessment of priority status was assigned to updating the full report (high, medium, or low), and the results of the process were summarized in a brief report. SRs assigned a low or medium priority for updating were re-assessed six months later. Reports assigned a high priority for updating were not re-assessed. The decision to update or withdraw the report is made by AHRQ, who consider the availability of resources and other factors when making a final decision. Detailed methods of the surveillance process are presented as the following:

Abbreviated search, study selection, and data extraction

The ascertainment of updating signals relied on qualitative and/or quantitative criteria developed originally for the Ottawa method [7] and expert opinion as used in the RAND method [8]. For each SR, we conducted an abbreviated update search as described in previous publications [7, 9, 34]. We employed the strategies used in the original published SRs but limited the sources searched to five general medical journals (Annals of Internal Medicine, BMJ, JAMA, The Lancet, and New England Journal of Medicine) and approximately five topic-specific specialty journals (usually the journals that contributed the most evidence to the original report; (if a particular specialty journal was not catalogued in PubMed, we would search the more relevant database as well). These searches were conducted for a time period starting six months prior to the last date covered by the searches for the original SR (to minimize the number of relevant studies missed due to delayed publication) up to the present. We also assessed the eligibility of studies referenced by content experts (further detail on the experts in the following sections). After removing duplicates from identified records, one reviewer used the inclusion/exclusion criteria specified in the original SR to screen titles and abstracts and then full texts of potentially relevant records. For each included new study, one reviewer extracted relevant data on study characteristics (for example, design, sample size, follow-up duration), demographic factors for study participants (for example, age, sex, condition), treatment (for example, type, frequency, dose), outcome characteristics, and results into an evidence table.

Ascertainment of updating signals

To identify signals/triggers for updating, we applied qualitative and/or quantitative criteria [7] to the abstracted evidence for each conclusion in the original SR. For each conclusion, we first documented the absence of new evidence (that is, no new evidence or new evidence showing the same or similar conclusion as the original SR) or the presence of new evidence meeting the pre-defined criteria of signal(s) indicating a need for updating (Table 1).
Table 1

Criteria for determining that a conclusion is out-of-date

Ottawa’s label

Ottawa method

 

Qualitative criteria for potentially invalidating signals

A1

Opposing findings: a pivotal* trial or systematic review (or guidelines) including at least one new trial that characterized the treatment in terms opposite to those used earlier

A2

Substantial harm: a pivotal trial or systematic review (or guidelines) whose results called into question the use of the treatment based on evidence of harm or that did not proscribe use entirely but did potentially affect clinical decision-making

A3

A superior new treatment: a pivotal trial or systematic review (or guidelines) whose results identified another treatment as significantly superior to the one evaluated in the original review, based on efficacy or harm

 

Qualitative criteria for signals of major changes

A4

Important changes in effectiveness short of ‘opposing findings’

A5

Clinically important expansion of treatment

A6

Clinically important caveat

A7

Opposing findings from discordant meta-analysis or non-pivotal trial

 

Quantitative criteria signals of changes in evidence

B1

A change in statistical significance (from nonsignificant to significant)

B2

A change in relative effect size of at least 50 percent

RAND’s label

RAND method indications for the need for an update

1

Original conclusion is still valid and this portion of the original report does not need updating. This conclusion was reached if we found no new evidence or only confirmatory evidence and all responding experts assessed the CER conclusion as still valid, we classified the CER conclusion as still valid

2

Original conclusion is possibly out-of-date and this portion of the original report may need updating. This conclusion was reached if we found some new evidence that might change the CER conclusion, and/or a minority of responding experts assessed the CER conclusion as having new evidence that might change the conclusion, then we classified the CER conclusion as possibly out-of-date

3

Original conclusion is probably out-of-date and this portion of the original report may need updating. This conclusion was reached if we found substantial new evidence that might change the CER conclusion, and/or a majority of responding experts assessed the CER conclusion as having new evidence that might change the conclusion, then we classified the CER conclusion as probably out-of-date

4

Original conclusion is out-of-date. This conclusion was reached if we found new evidence that rendered the CER conclusion out-of-date or no longer applicable; we classified the CER conclusion as out-of-date. Recognizing that our literature searches were limited, we reserved this category only for situations where a limited search would produce prima facie evidence that a conclusion was out-of-date, such as the withdrawal of a drug or surgical device from the market, a black box warning from FDA, and so on

Abbreviation: CER comparative effectiveness review, FDA Food and Drug Administration.

*a pivotal trial is defined as trial that is published in one of the top five general medical journals or a trial whose sample size is at least triple that of the largest trial in the original systematic review.

Legend: Table 1 presents the criteria used to determine if a conclusion is out of date within an SR (here CER). Criteria A1 to B2 come from the Ottawa method and criteria 1 to 4 are based on the RAND method.

We then assessed whether new evidence provided or contributed to a qualitative or quantitative signal. One example of a qualitative signal might include finding a newly published pivotal trial with results opposite to that of the original SR with respect to an efficacy outcome (for example, effective versus ineffective or vice versa) or a harm (for example, a newly identified risk of harm that outweighs the previously observed benefits). The original definition of a pivotal trial was one published in one of the top five general medical journals or a trial whose sample size was at least triple that of the largest trial in the original SR [7]. For this application we made some adaptations to account for key questions for which observational studies were the study design of choice; namely we did not require new large cohort studies to have at least three times the number of participants as existing large cohorts. Other examples of qualitative signals included a superior new treatment (for example, a new treatment significantly more effective than one assessed in the SR); or a new population subgroup (that is, the treatment assessed in the SR has subsequently been tested on a new population). In contrast, new evidence generates a quantitative signal if its incorporation into a SR’s original meta-analysis changes a statistically non-significant pooled estimate into a statistically significant one or vice versa[7].

Clinical content experts

We identified and contacted two sets of clinical experts: a) those who had worked on the SR in question (for example, the project lead, clinical lead, members of the technical expert panel, and peer reviewers) and b) other clinical experts in the clinical content area who had not worked on the SR in question (for example, local or external subject matter experts). For each SR, we created a matrix that included each of the original key questions and a summary of each conclusion in the original report. Respondents were asked to provide their opinions on whether or not each conclusion was still valid. They were also asked to provide reference citations for any new studies they were aware of that might invalidate or otherwise alter the conclusion(s) as well as studies that were pertinent to the topic but might not address a particular conclusion directly (for example, studies of newer treatments that may have rendered the original treatments out-of-date). The responding experts were offered a small honorarium; reminders were sent to experts who did not initially respond.

Safety alerts

We examined safety and adverse event alerts relevant to each SR. This information was collected from MedWatch, the US Food and Drug Administration’s Safety Information and Adverse Event reporting system; the UK’s Medicines and Health Care Products Regulatory Agency (MHRA); and Health Canada.

Determination of updating status for SRs

The information on updating signals, expert opinion, and safety alerts was collated, summarized, and tabulated. Taking into consideration the totality of evidence, we used a set of decision rules/guidance originally used in the pilot studies [8, 9] to characterize any given KQ-related conclusion(s) as up-to-date, possibly out-of-date, probably out-of-date, or out-of- date. Based on the totality of these characterizations, each SR was assigned to high, medium, or low updating priority groups. The decision to assign a high priority was not based strictly on the proportion of conclusions determined to be probably or definitely out-of-date, but rather, was a global judgment informed by a set of guidelines; for example, one out-of-date conclusion that could result in harm or inferior treatment could give rise to a high priority for updating. The criteria for determining updating status are provided in Additional file 1.

For each of the SRs that underwent surveillance, we summarized our findings in a brief report. These reports are now posted on the AHRQ website along with the original SRs to which they refer.

Assessment of the findings across SRs

To gain a sense of how long it takes for SRs to go out-of-date, we assessed the proportion of the SRs that went through the surveillance process at least once that received a high or medium priority for updating as a function of the length of time since their publication and from the date of their latest searches.

Results

Sampling of SRs for assessment

Between June 2011 and November 2012, we assessed 24 SRs at least once. When we implemented the surveillance system, a backlog of SRs had accumulated and needed to be assessed. In addition, there was a 3- to 17- month lag between the completion of the original or update search and the release of the reports. Thus, there was a time span of 11 to 62 months from the completion of the original or update searches and the surveillance search (Table 2).
Table 2

Characteristics of 24 comparative effectiveness reviews (CERs) and their associated updating surveillance assessments

CER title; first author last name (publication date)

Latest search date for CER (across databases)

Number of included studies in CER (total or per KQ)

Period covered by surveillance assessment search

Number of new studies judged as relevant for inclusion in CER

(Journal publication, if available)

Time between search and report release

 

Time between original searches and update search

 

Comparative Effectiveness of Therapies for Clinically Localized Prostate Cancer; Wilt (February 2008) [22]

September 2007

436

January 2007 to March 2012

21

(Systematic review: association Between Hospital and Surgeon Radical Prostatectomy Volume and Patient Outcomes: A Systematic Review) [35]

5 months

 

54 months

 

Comparative Effectiveness of Medications to Reduce Risk of Primary Breast Cancer in Women; Nelson (September 2009) [11]

January 2009

13 (KQ1,KQ3)

January 2008 to July 2011

3

(Systematic review: comparative effectiveness of medications to reduce risk for primary breast cancer) [36]

8 months

70 (KQ2,KQ3)

31 months

 
  

24 (KQ4)

  
  

16 (KQ5)

  

Comparative Effectiveness of Core Needle Biopsy and Open Surgical Biopsy for Diagnosis of Breast Lesions; Bruening (December 2009) [10]

September 2009

107 (KQ1-2)

January 2008 to September 2011

19

(Systematic review: comparative effectiveness of core-needle and open surgical biopsy to diagnose breast lesions) [37]

3 months

NA (KQ3)

24 months

 

Effectiveness of Recombinant Human Growth Hormone (rhGH) in the Treatment of Patients with Cystic Fibrosis; Phung (October 2010) [12]

April 2010

26(KQ1-2, KQ4, KQ6-7); 50 (KQ3); 3(KQ5)

January 2010 to August 2011

16

(Systematic review: recombinant human growth hormone in the treatment of patients with cystic fibrosis) [38]

6 months

 

16 months

 

Therapies for Children with Autism Spectrum Disorders; Warren (April 2011) [13]

May 2010

159

January 2009 to October 2011

15

(Systematic review: a systematic review of early intensive intervention for autism spectrum disorders) [39]

11 months

 

17 months

 

Comparative Effectiveness of Traumatic Brain Injury and Depression;

June 2010

115

January 2010 to October 2011

29

Guillamondegui (April 2011) [15]

10 months

 

16 months

 

Pain Management Interventions for Hip Fracture; Abou-Setta (May 2011) [14]

December 2010

98

January 2008 to November 2011

1

(Systematic review: comparative effectiveness of pain management interventions for hip fracture: a systematic review) [40]

5 months

 

11 months

 

Diagnosis and Treatment of Obstructive Sleep Apnea in Adults; Balk (August 2011) [27]

September 2010

44 (KQ1); 1 (KQ2); 2 (KQ3); 11 (KQ4); 173 (KQ5); 6 (KQ6); 18 (KQ7)

January 2010 to April 2012

35

(Systematic review: auto-titrating versus fixed continuous positive airway pressure for the treatment of obstructive sleep apnea: a systematic review with meta-analyses) [41]

11 months

 

19 months

 

Nonpharmacologic Interventions for Treatment-Resistant Depression in Adults; Gaynes (September 2011) [30]

18 November 2010

64

January 2010 to March 2012

9

 

10 months

 

16 months

 

Attention Deficit Hyperactivity Disorder (ADHD): Effectiveness of Treatment in At-Risk Preschoolers; Long-term Effectiveness in All Ages; and Variability in Prevalence, Diagnosis, and Treatment; Charach (October 2011) [33]

31 May 2010

53 (KQ1); 76 (KQ2); NR (KQ3)

January 2010 to June 2012

17

 

17 months

 

25 months

 

Effectiveness of Early Diagnosis, Prevention, and Treatment of Clostridium difficile Infection; Butler (December 2011) [29]

June 2010 and for KQ3 May 2011

13 (KQ1); 36 (KQ2); 13 (KQ3); 40 (KQ4)

January 2010 to June 2012

7

(Systematic review: comparative effectiveness of Clostridium difficile treatments: a systematic review) [42]

7 to 18 months

 

24 months for KQ 1,2, 4

 
   

13 months for KQ3

 

Noncyclic Chronic Pelvic Pain Therapies for Women: Comparative Effectiveness; Andrews (January 2012) [28]

May 2011

23 (KQ1); 7 (KQ2); 0 (KQ3); 17 (KQ4); 0 (KQ5)

May 2011 to July 2012

2

(Systematic review: systematic review of therapies for noncyclic chronic pelvic pain in women) [43]

8 months

 

14 months

 

Chronic Kidney Disease Stages 1 to 3: Screening, Monitoring, and Treatment;

January 2011

110

January 2011 to August 2012

20

Fink (January 2012) [31]

12 months

 

19 months

 

(Systematic review: screening for, monitoring, and treatment of chronic kidney disease stages 1 to 3: a systematic review for the US Preventive Services Task Force and for an American College of Physicians Clinical Practice Guideline) [44]

    

First and second generation antipsychotics for children and young adults; Seida (February 2012) [32]

February 2011

81

January 2011 to August 2012

19

(Systematic review: antipsychotics for children and young adults: a comparative effectiveness review) [45]

12 months

 

18 months

 

Comparative Effectiveness of Management Strategies for Renal Artery Stenosis: 2007 Update; Balk (November 2007) [26]

23 April 2007

8

October 2006 to June 2012

7

 

7 months

 

62 months

 

Comparative Effectiveness of Radiofrequency Catheter Ablation for Atrial Fibrillation; IP (July 2009) [18]

December 2008

120

June 2008 to September 2011

33

(Systematic review: comparative effectiveness of radiofrequency catheter ablation for atrial fibrillation) [46]

7 months

 

35 months

 

Comparative Effectiveness of Lipid-Modifying Agents; Sharma (September 2009) [17]

May 2009

101

November 2008 to October 2011

20

(Systematic review: comparative effectiveness and harms of combination therapy and monotherapy for dyslipidemia) [47]

4 months

 

29 months

 

Comparative Effectiveness of Angiotensin Converting Enzyme Inhibitors or Angiotensin II Receptor Blockers Added to Standard Medical Therapy for Treating Stable Ischemic Heart Disease; Coleman (October 2009) [20]

February 2009

60

August 2008 to November 2011

12

(Systematic review: comparative effectiveness of angiotensin-converting enzyme inhibitors or angiotensin II-receptor blockers for ischemic heart disease) [48]

8 months

 

33 months

 

Comparative Effectiveness of In-Hospital Use of Recombinant Factor VIIa for Off-Label Indications versus Usual Care; Yank (May 2010) [16]

August 2009

74

February 2009 to January 2012

15

(Systematic review: benefits and harms of in-hospital use of recombinant factor VIIa for off-label indications) [49]

9 months

 

29 months

 

Comparative effectiveness and safety of radiotherapy treatments for head and neck cancer; Samson (May 2010) [19]

September 2009

108

March 2009 to August 2011

7

 

8 months

 

23 months

15

Comparative Effectiveness of Nonoperative and Operative Treatments for Rotator Cuff Tears; Sedia (July 2010) [21]

September 2009

137

March 2009 to January 2012

 

(Systematic review: nonoperative and operative treatments for rotator cuff tears) [50]

10 months

 

28 months

 

Comparative Effectiveness of Terbutaline Pump for the Prevention of Preterm Birth; Gaudet (September 2011) [23]

April, 2011

14

October 2010 to March 2012

0

(Systematic review: effectiveness of Terbutaline Pump for the Prevention of Preterm Birth. A Systematic Review and Meta-Analysis) [51]

5 months

 

11 months

 

Self-Measured Blood Pressure Monitoring: Comparative Effectiveness; Uhlig (January 2012) [24]

19 July 2011

48 (KQ1-2); 1( KQ5)

January 2011 to August 2012

1

 

6 months

0 (KQ 3–4)

13 months

 

Hematopoietic Stem-Cell Transplantation in the Pediatric Population; Ratko (February 2012) [25]

17 August 2011

251

February 2011 to September 2012

0

 

6 months

 

13 months

 

Abbreviations: CER Comparative Effectiveness Review, KQ key question, NR not reported.

Legend: Columns 1 to 3 present characteristics of CERs:

Column 1: title of the CER, first author’s last name, and title of the journal publication for the same CER.

Column 2: the date that last search was executed for the original CER (for instance, 17 August 2011 refers to the last literature search date carried out for CER), and the time lag between the last search date and the release date of the original CER on the AHRQ website; for example, six months shows the CER was published/released on the AHRQ website six months after the last search date was executed for the CER.

Column 3: total number of studies included in the original CER; for example 251 means a total of 251 studied had met the inclusion criteria of the original CER and were reported in the original CER.

Column 4 presents: 1) time period that the surveillance assessment search has covered (for instance, February 2011 to September 2012 means the search was limited to the period between these dates), and 2) time lag between the original CER search date and the surveillance assessment search date (for example, 13 months means the surveillance literature search was carried out 13 months after the last search date of the original CER).

Column 5 reports the number of studies include in the surveillance assessment; for instance, (15) shows that a total of 15 studies deem eligible and were reported in the surveillance assessment.

Characteristics of SRs

The SRs varied widely in the kinds of interventions they tested and their target populations. Interventions included pharmaceuticals [1114, 16, 17, 2023, 28, 31, 33], surgical procedures [10, 18, 21, 22, 25, 26, 28], radiotherapy [19, 22], non-pharmacological procedures [24, 28, 30], diagnostic and preventive interventions [27, 29], and a complementary and alternative medicine intervention [14]. The populations of interest included patients with cancer, tumors, and anomalies on screening [10, 11, 19, 22, 25], heart disease [18, 20], cystic fibrosis [12], autism [13], trauma [15, 16], cuff tears [21], lipid therapy[17], hypertension [24], renal diseases [26, 31], attention deficit hyperactivity disorder [33], psychiatric and behavioral conditions [32], depression [30], infection [29], pelvic pain [28], sleep apnea [27], and preterm labor [23].

The characteristics of the 24 SRs and the corresponding surveillance assessments are presented in Table 2. Briefly, the number of key questions (the questions that frame AHRQ SRs) across the 24 SRs ranged from three [10, 17, 26, 33] to seven [12, 13, 27], although each key question could comprise any number of subquestions. The total number of conclusions per report that required assessment, a reflection of the number of subquestions, ranged from 7 to 86 (the median number was 23). The median number of included studies in the original SRs was 104 (IQR: 71 to 124). The median number of newly identified studies deemed relevant for inclusion in the SRs was 15 (range: 0 to 35).

The number of experts initially contacted across the 24 SRs ranged from 4 [17] to 17 [30]. The response rates ranged from 20% (2/10) to 100% (6/6) with a median of 35%.

Of the 24 SRs, nine [37%] were considered up-to-date as defined by agreement that all conclusions for all key questions were still up-to-date [12, 15, 21, 2325, 28, 30, 31]. For the remaining 15 (63%) SRs [10, 11, 13, 14, 1620, 22, 26, 27],[29, 32, 33], at least one conclusion was rated as ‘probably/possibly out-of-date’ or ‘out-of-date.’ For four (17%) SRs [10, 19, 20, 22], all conclusions within at least one key question were rated as ‘probably/possibly out-of-date’ or ‘out-of-date’ (see Table 3).
Table 3

Currency of individual conclusions within each key questions of the of 24 comparative effectiveness reviews (CERs) and their priority status for updating (high, medium, and low) based on the updating surveillance assessments

CER title author name (publication date)

Number of conclusions within the key questions in

Updating priority for the CER

 

CER by updating status

(low, medium, and high)

 

KQ#

# Conclusions Up-to-date

KQ#

# Conclusions Possibly out-of-date

KQ#

# Conclusions Probably out-of-date

KQ#

# Conclusions Out-of-date

 

Comparative Effectiveness of Therapies for Clinically Localized Prostate Cancer; Wilt (February 2008) [22]

1

11/15

1

2/15

  

1

2/15

High

       

2

1/1

 
 

3

3/3

       
 

4

1/3

    

4

2/3

 

Comparative Effectiveness of Medications to Reduce Risk of Primary Breast Cancer in Women; Nelson (September 2009) [11]

1

4/6

  

1

2/6

  

Medium

 

2

6/7

  

2

1/7

   
 

3

4/5

  

3

1/5

   
 

4- 5

9/9

       

Comparative Effectiveness of Core Needle Biopsy and Open Surgical Biopsy for Diagnosis of Breast Lesions; Bruening (December, 2009) [10]

1

10/16

1a

4/16

  

1

2/16

Medium

 

2

3/4

2

1/4

     
 

3

1/2

3

1/2

     

Effectiveness of Recombinant Human Growth Hormone (rhGH) in the Treatment of Patients with Cystic Fibrosis; Phung (October 2010) [12]

1-7

40/40

      

Low

Therapies for Children with Autism Spectrum Disorders; Warren (April 2011) [13]

1

10/14

1

4/14

    

Low

 

2

2/3

2

1/3

     
 

3-7

6/6

       

Comparative Effectiveness of Traumatic Brain Injury and Depression; Guillamondegui (April 2011) [15]

1-6

15/15

      

Low

Pain Management Interventions for Hip Fracture; Abou-Setta (May 2011) [14]

1

7/8

1

1/8

    

Low

Diagnosis and Treatment of Obstructive Sleep Apnea in Adults; Balk (August 2011) [27]

1

3/4

1

1/4

    

Medium

 

2

1/1

       
 

3

1/1

       
 

4

1/1

       
 

5

14/15

5

1/15

     
 

6

1/1

       
 

7

1/1

       

Nonpharmacologic Interventions for Treatment-Resistant Depression in Adults; Gaynes (September 2011) [30]

1a

1/1

      

Low

 

1b

1/1

       
 

2

1/1

       
 

3

1/1

       
 

4

1/1

       
 

5

1/1

       
 

6

1/1

       

Attention Deficit Hyperactivity Disorder (ADHD): Effectiveness of Treatment in At-Risk Preschoolers; Long-term Effectiveness in all Ages; and Variability in Prevalence, Diagnosis, and Treatment; Charach (October 2011) [33]

1

2/3

1

1/3

    

Low

 

2

5/6

2

1/6

     
 

3

10/12

  

3

2/12

   

Effectiveness of Early Diagnosis, Prevention, and Treatment of Clostridium difficile Infection; Butler (December 2011) [29]

1

2/3

1

1/3

    

Low

 

2

6/8

2

2/8

     
 

3

6/7

3

1/7

     
 

4

4/5

4

1/5

     

Noncyclic Chronic Pelvic Pain Therapies for Women: Comparative Effectiveness; Andrews (January 2012) [28]

1

5/5

      

Low

 

2

6/6

       
 

3

1/1

       
 

4

6/6

       
 

5

1/1

       
 

all

2/2

       

Chronic Kidney Disease Stages 1 to 3: Screening, Monitoring, and Treatment; Fink (January 2012) [31]

1-6

25/25

      

Low

First and Second Generation Antipsychotics for Children and Young Adults; Seida (February 2012) [32]

1

4/7

1

2/7

  

1

1/7

Low

 

2

3/3

       
 

3

4/4

       
 

4

1/1

       

Comparative Effectiveness of Management Strategies for Renal Artery Stenosis: 2007 Update; Balk (November 2007) [26]

1

7/15

1

8/15

    

Medium

 

2

2/3

2

1/3

     
 

3

4/4

       

Comparative Effectiveness of Radiofrequency Catheter Ablation for Atrial Fibrillation; IP (July 2009) [18]

1

4/4

      

Low

 

2

3/5

2

2/5

     
 

3

3/4

3

1/4

     
 

4

6/6

       

Comparative Effectiveness of Lipid-Modifying Agents; Sharma (September 2009) [17]

1

3/13

    

1

10/13

High

 

2

34/48

2

14/48

     
 

3

9/25

3

16/25

     

Comparative Effectiveness of Angiotensin Converting Enzyme Inhibitors or Angiotensin II Receptor Blockers Added to Standard Medical Therapy for Treating Stable Ischemic Heart Disease; Coleman (October 2009) [20]

1

6/7

1

1/7

    

Low

 

2-6

28/28

       
     

7

4/4

   

Comparative Effectiveness of In-Hospital Use of Recombinant Factor VIIa for Off-Label Indications versus Usual Care; Yank (May 2010) [16]

2

2/3

2

1/3

    

Low

 

3a

7/9

3a

2/9

     
 

3b

3/4

3b

1/4

     
 

4a

1/2

4a

1/2

     
 

4b-c

9/9

       

Comparative Effectiveness and Safety of Radiotherapy Treatments for Head and Neck Cancer; Samson (May 2010) [19]

1

2/3

1

1/3

    

Medium

 

2

1/2

2

1/2

     
   

3

1/1

     
 

4

3/3

       

Comparative Effectiveness of Nonoperative and Operative Treatments for Rotator Cuff Tears; Sedia (July 2010) [21]

1- 6

18/18

      

Low

Comparative Effectiveness of Terbutaline Pump for the Prevention of Preterm Birth; Gaudet (September 2011) [23]

1-6

37/37

      

Low

Self-Measured Blood Pressure Monitoring: Comparative Effectiveness; Uhlig (January 2012) [24]

1

8/8

      

Low

 

2

4/4

       
 

3

4/4

       
 

4

2/2

       
 

5

2/2

       

Hematopoietic Stem-Cell Transplantation in the Pediatric Population; Ratko (February 2012) [25]

1

3/3

      

Low

 

2

3/3

       
 

3

5/5

       
 

4

5/5

       
 

5

5/5

       
 

6

5/5

       

Abbreviations: CER Comparative Effectiveness Review, KQ key question

Legend: Column 1 reports the title and first author of the CERs assessed.

Column 2 reports the currency of each conclusion within each key questions of each CER. For example; CER [22] had a total of four key questions. Key question 1 had a total of 15 conclusions, of which 11 were assessed to be up-to-date, 2 possibly out-of-date and 2 out-of-date based on the surveillance assessment. Column 3 demonstrates the final conclusion (updating priority status) of surveillance assessment for individual CER, for example, if a CER is assessed as high priority for updating, the column shows ‘high’. The symbol # refers to “number” inside the table.

Most SRs were assigned a low priority for updating: two out of 24 SRs (8%) [17, 22] were assigned a ‘high’ priority for updating; five out of 24 (21%) [10, 11, 19, 26, 27] were assigned a medium priority, and the remaining 17 (71%) were assigned a low priority [1216, 18, 20, 21, 2325, 2833] (see Table 3).

Ten SR topics underwent a second surveillance assessment. For those SRs, we contacted only those experts who had responded in the first round. Across these ten SRs, 39 experts were contacted, and 27 responded, with response rates ranging from 40% to 100%. Median response rate was 71%, double the 35% median response rates across all topics on the first round. Across these ten SRs that underwent a second surveillance assessment at about six months from the end of the prior assessment, there were 265 conclusions contained within 53 key questions. Of these, eight conclusions changed between the first and second surveillance: seven conclusions changed from ‘up-to-date’ to ‘possibly out-of-date’, and one conclusion changed from ‘possibly out-of-date’ to ‘probably out-of-date’. One of the ten SRs changed priority for updating from ‘low’ to ‘medium’.

Factors associated with priority decisions

We assessed whether the length of time that had elapsed between the search conducted for the original report and the update surveillance search (search time lapse, STL) was associated with priority status for updating. Seven SRs were released prior to January 2010 [10, 11, 17, 18, 20, 22, 26] (that is, more than 18 months before the start of the Surveillance Program); of these seven, two were the SRs judged as being ‘high’ priority for updating, three were judged as being ‘medium’ priority, and two were judged as being ‘low’ priority for updating. Of the remaining 17 SRs, released after January 2010, only two were judged as being ‘medium’ priority for updating and the rest were low priority. All SRs released within the year prior to the start of the Surveillance Program (between June 2010 and June 2011) were judged as being ‘low’ priority. Figure 2a and b present the updating priority decisions for the 24 SRs by the time elapsed since the search date in the original review (2a) and the number of new relevant articles identified during the surveillance process (2b). While more SRs were classified as medium or high priority for updating as both the STL and the number of new relevant articles increased, there was substantial overlap, and no threshold existed for either time or number of articles that could accurately predict classification of SRs into different categories.
https://static-content.springer.com/image/art%3A10.1186%2F2046-4053-2-104/MediaObjects/13643_2013_Article_166_Fig2_HTML.jpg
Figure 2

The process of surveillance assessment for a Systematic Review. (a) Time elapsed since the search date in the original review. Green color: low priority for updating; Yellow color: medium priority for updating; red color: high priority for updating. (b) Number of new relevant articles identified. Green color: low priority for updating; Yellow color: medium priority for updating; red color: high priority for updating.

The possible role of safety alerts

We identified applicable safety alerts for 9 of the 24 SRs assessed. FDA provided alerts for all nine of those SRs [12, 13, 15, 17, 20, 27, 28, 31],[33]; MHRA and Health Canada were the sources of alerts for only one SR [33]. None of the agents, devices, or procedures evaluated in the 24 SRs for which we performed the surveillance assessments had an FDA black box warning (the strongest FDA warning, indicating a significant risk of serious or even life-threatening adverse effect) issued during our assessment period. In only one case was the updating priority of a SR influenced by a safety alert [27].

Discussion

Our results indicate that a small proportion of AHRQ-supported SRs may need updating within one to two years of the date of their last search. Of the 24 SRs assessed between June 2011 and November 2012, 17 (71%) were classified as having low priority for updating, and five SRs (21%) had medium priority for updating. Only two SRs (8%) were deemed to have high priority. Greater elapsed time from the end date of the original search and a larger number of new relevant studies were both associated with a higher priority for updating, but no thresholds were identified that could perfectly classify SRs into priority categories. This finding suggests that expert opinion will be a necessary component of an efficient system of searching for signals for updating.

Several of the SRs were classified as low priority for updating despite having a large number of newly identified potentially relevant studies. One explanation for this finding is that, in general, many of these new studies had small sample sizes or few primary outcomes and the results were consistent with those of the original SRs, thus not justifying updating those existing SRs. Conversely, the presence of a single new study with many outcome events can be a sufficient signal of the need for a high priority update, such as the publication of the Prostate Cancer Intervention Versus Observation Trial (PIVOT) [52] and the SR on therapies for clinically localized prostate cancer [22].

A recent study that examined factors that predicted 69 decisions on whether to update 41 reviews of drug effectiveness found that the number of relevant new studies was a significant predictor of a decision to update a review (OR 1.06 for each new trial) [6]. This study, conducted for the Drug Effectiveness Review Project (DERP), was designed to examine the surveillance process implemented in 2006 to replace what had been a policy of mandatory annual updates. The DERP process is qualitatively similar to our surveillance method, in that it uses limited literature searches, information from FDA and Health Canada, and expert input. The study also found that identification of a new drug significantly increased the likelihood of an update (OR = 5.71) and that reviews of psychiatric drugs were always recommended for an update. The authors did not report whether there were thresholds of articles or time that perfectly predicted decisions for updating. A major difference between that study and ours, aside from our broader focus on all types of clinical interventions, is that the decision to update a DERP report rests with a panel of participants comprising physicians and representatives of the state Medicaid agencies and the Canadian Agency for Drug Technology and Health, for whom the appearance of a new drug requires them to make policy decision.

Lessons learned

The implementation of the surveillance assessment program to determine the currency of published AHRQ SRs has presented a number of challenges. These challenges included differences across reports in the ways conclusions were presented, the responsiveness of report staff and experts, and delays in the release of the original reports themselves combined with differences in the length of time between release and surveillance.

Inconsistency in presentation of conclusions

Not all SRs presented their KQs and the corresponding conclusions in the executive summary in a similar manner (that is, the degree of detail, format, or level of summarization may have varied). For example, in some SRs, conclusions were, by necessity, stratified by subpopulation, intervention, outcome, or other study characteristics, resulting in multiple conclusions for a single question. In some SRs, the executive summaries failed to present sufficient detail to enable reviewers to extract at least one specific, clearly formulated conclusion for each key question; therefore, the reviewers had to probe the entire text of the SR report. Conversely, some executive summaries simply reproduced the results from the report text without drawing any conclusions, leaving the experts to whom we sent the information to draw their own conclusions. Some conclusions were not readily amenable to updating, for example, conclusions regarding the prevalence of certain risk factors in specific populations.

Responsiveness of report staff and experts

Conducting the surveillance on schedule required that the project leads for the original reports and the experts they recommended we contact respond in a timely manner. However, project leads and experts varied widely in their responsiveness to our requests. In addition, response rates were low in the first surveillance. However, it is unclear what this low response means, since the sample is not intended to be a random sample of some larger population. In the second round of surveillance, the response rate improved considerably, suggesting that over time, the surveillance process will become more efficient.

Delays in release of some reports

In several cases, surveillance was delayed because a report was not released on schedule. The primary impact of such delays was on our staff’s ability to plan their work schedules, as they would have reserved time for these reports and would need to find other surveillance work or work on our own evidence reviews when a report expected for surveillance failed to materialize.

Limitations

One limitation of the surveillance system is that it requires subjective global judgments. The assessment of currency and validity of conclusions for each key question in a SR was based on the totality of information compiled through multiple sources such as the qualitative/quantitative signals, expert opinion, and safety alerts. Although we used operational and standardized definitions throughout the process to promote consistency in the assessments, the overall judgment must necessarily be subjective in characterizing individual conclusions. However, since neither the STL nor the number of new relevant studies can classify SRs perfectly as low, medium, or high priority status for updating, this subjective human assessment is going to be needed in an efficient surveillance system. Future work should seek to make these judgments as reliable as possible across raters. The strength of evidence should be investigated in future work.

A second limitation is that we present data for only 34 surveillance assessments on 24 SRs. However, only two published evaluations have included more assessments than ours. A study by Shojania and colleagues assessed 100 systematic reviews to determine how quickly they go out-of-date, but this study limited its sample to meta-analyses that produced a summary estimate of outcome, and then further limited the analysis to only one outcome per study [7]. The DERP study reported the results of surveillance on 41 of their reports [6], but these reports assessed only drugs, and the decisions about updating were made by stakeholders for whom the approval of a new drug was highly relevant to policy decision-making. Our study, by contrast, assesses a broad array of health care interventions, and considered changes in evidence that might lead to changes in practice as the criterion for a signal for updating.

In sum, we found that only a small proportion of AHRQ-sponsored systematic reviews triggered signals for updating within one or two years of the date of their last search, and that neither the elapsed time since the original search nor the number of new articles could perfectly predict which SRs may be in need of updating. Our experience also provided some evidence into what might be the optimal time for a first assessment and subsequent surveillance assessments. Among the 24 SRs released within the first 18 months of surveillance, only two were classified as high priority, five were classified as medium, and the rest were classified as ‘low’ priority for updating (and a number of these reports had been released up to four years prior to the start of the surveillance). Furthermore, there were few changes in conclusions about updating in a second round of surveillance timed to start six months after the completion of the first round. These results suggest to us that a one-year time period between the release of a report and its first and subsequent surveillance assessments may be more efficient than the six-month time frame chosen for this application.

Conclusion

By undertaking periodic evaluation of 24 topically diverse SRs commissioned by a leading organization, we established the feasibility of a surveillance system to monitor SR currency for a wide range of therapeutic interventions. About 70% of reviews were determined to have a low priority for updating. Evidence suggests that the optimal interval for surveillance is yearly.

For future research, we recommend: 1) modifying and testing the current surveillance methodology to encompass reviews of diagnostic and prognostic methods; 2) validating the surveillance methods against the gold standard of actual review updates in a blinded fashion; and 3) identifying predictors of a review being out-of-date; for example, review quality or the strength of evidence for each individual conclusion; and 4) assessment of the relationship between the quality or strength of evidence and signal detection.

Abbreviations

AHRQ: 

Agency for Healthcare Research and Quality

CER: 

comparative effectiveness reviews

DERP: 

Drug Effectiveness Review Project

ECRI: 

Emergency Care Research Institute

EPC: 

Evidence-based Practice Center

FDA: 

Food and Drug Administration

IQR: 

Interquartile range

KQ: 

Key question

MHRA: 

Medicines and Health Care Products Regulatory Agency

NICE: 

National Institute for Health and Clinical Excellence

OR: 

Odds ratio

SR: 

Systematic review

STL: 

Search time lapse.

Declarations

Acknowledgements

The authors thank ECRI for provision of monthly safety alerts from FDA, MHRA, and Health Canada sources. We also thank Chantelle Garritty for administrative assistance, and Becky Skidmore for acquisition of data for the assessments carried out at the Ottawa EPC.

Grant support

By the Agency for Healthcare Research and Quality, US Department of Health and Human Services, contract number HHSA-290-2007-10062I (RAND Southern California Evidence-based Practice Center) and HHSA-290-2007-10059I (University of Ottawa Evidence-based Practice Center).

Disclaimer

This project was funded under contract number HHSA-290-2007-10062I (RAND Southern California Evidence-based Practice Center) and HHSA-290-2007-10059I (University of Ottawa Evidence-based Practice Center) from the Agency for Healthcare Research and Quality, US Department of Health and Human Services. The authors of this report are responsible for its content. Statements in the report should not be construed as endorsement by the Agency for Healthcare Research and Quality or the US Department of Health and Human Services.

Financial support

Agency for Healthcare Research and Quality of United States (AHRQ).

Authors’ Affiliations

(1)
Knowledge Synthesis Group, Ottawa Hospital Research Institute, Clinical Epidemiology Program, Center for Practice-Changing Research
(2)
Southern California Evidence-based Practice Center (SCEPC), The RAND Corporation
(3)
Stanford University, Stanford University 117 Encina Commons
(4)
Veterans Affairs Greater Los Angeles Healthcare System

References

  1. Helfand M, Balshem H: AHRQ series paper 2: principles for developing guidance: AHRQ and the effective health-care program. J Clin Epidemiol. 2010, 63: 484-490. 10.1016/j.jclinepi.2009.05.005.View ArticlePubMed
  2. Slutsky J, Atkins D, Chang S, Sharp BA: AHRQ series paper 1: comparing medical interventions: AHRQ and the effective health-care program. J Clin Epidemiol. 2010, 63: 481-483. 10.1016/j.jclinepi.2008.06.009.View ArticlePubMed
  3. Tsertsvadze A, Maglione M, Chou R, Garritty C, Coleman C, Lux L, Bass E, Balshem H, Moher D: Updating comparative effectiveness reviews: current efforts in AHRQ’s effective health care program. J Clin Epidemiol. 2011, 64: 1208-1215. 10.1016/j.jclinepi.2011.03.011.View ArticlePubMed
  4. Garritty C, Tsertsvadze A, Tricco AC, Sampson M, Moher D: Updating systematic reviews: an international survey. PLoS One. 2010, 5: e9914-10.1371/journal.pone.0009914.PubMed CentralView ArticlePubMed
  5. Clarke M: TPGS: Response from the Cochrane Collaboration. Lancet. 2008, 371: 384-385.View Article
  6. Peterson K, McDonagh MS, Fu R: Decisions to update comparative drug effectiveness reviews vary based on type of new evidence. J Clin Epidemiol. 2011, 64: 977-984. 10.1016/j.jclinepi.2010.11.019.View ArticlePubMed
  7. Shojania KG, Sampson M, Ansari MT, Ji J, Doucette S, Moher D: How quickly do systematic reviews go out of date? A survival analysis. Ann Intern Med. 2007, 147: 224-233. 10.7326/0003-4819-147-4-200708210-00179.View ArticlePubMed
  8. Shekelle P, Newberry S, Maglione M, Shanman R, Johnsen B, Carter J, Motala A, Hulley B, Wang Z, Bravata D, Chen M, Grossman J: Assessment of the need to update comparative effectiveness reviews: report of an initial rapid program assessment (2005 to 2009). [http://​www.​effectivehealthc​are.​ahrq.​gov/​ehc/​products/​125/​331/​2009_​0923UpdatingRepo​rts.​pdf]
  9. Chung M, Newberry SJ, Ansari MT, Yu WW, Wu H, Lee J, Suttorp M, Gaylor JM, Motala A, Moher D: Two methods provide similar signals for the need to update systematic reviews. J Clin Epidemiol. 2012, 65: 660-668. 10.1016/j.jclinepi.2011.12.004.PubMed CentralView ArticlePubMed
  10. Bruening W, Schoelles K, Treadwell J, Launders J, Fontanarosa J, Tipton K: Comparative effectiveness of core-needle and open surgical biopsy for the diagnosis of breast lesions. [http://​www.​ncbi.​nlm.​nih.​gov/​books/​NBK45220/​pdf/​TOC.​pdf]
  11. Nelson HD, Fu R, Humphrey L, Smith MEB, Griffin JC, Nygren P: Comparative effectiveness of medications to reduce risk of primary breast cancer in women. [http://​www.​ncbi.​nlm.​nih.​gov/​books/​NBK36430/​pdf/​TOC.​pdf]
  12. Phung OJ, Coleman CI, Baker EL, Scholle JM, Girotto JE, Makanji SS, Chen WT, Talati R, Kluger J, Quercia R, Mather J, Giovenale S, White CM: Effectiveness of recombinant human growth hormone (rhGH) in the treatment of patients with cystic fibrosis. [http://​www.​ncbi.​nlm.​nih.​gov/​books/​NBK61941/​pdf/​TOC.​pdf]
  13. Warren Z, Veenstra-Vanderweele J, Stone W, Bruzek JL, Nahmias AS, Foss-Feig JH, Jerome RN, Krishnaswami S, Sathe NA, Glasser AM, Surawicz T, McPheeters ML: Therapies for children with autism spectrum disorders. [http://​www.​ncbi.​nlm.​nih.​gov/​books/​NBK56343/​pdf/​TOC.​pdf]
  14. Abou-Setta AM, Beaupre LA, Jones CA, Rashiq S, Hamm MP, Sadowski CA, Menon MRG, Majumdar SR, Wilson DM, Karkhaneh M, Wong K, Mousavi SS, Tjosvold L, Dryden DM: Pain management interventions for hip fracture. [http://​www.​ncbi.​nlm.​nih.​gov/​books/​NBK56670/​pdf/​TOC.​pdf]
  15. Guillamondegui OD, Montgomery SA, Phibbs FT, McPheeters ML, Alexander PT, Jerome RN, McKoy JN, Seroogy JJ, Eicken JJ, Krishnaswami S, Salomon RM, Hartmann KE: Traumatic brain injury and depression. [http://​www.​ncbi.​nlm.​nih.​gov/​books/​NBK62061/​pdf/​TOC.​pdf]
  16. Yank V, Tuohy CV, Logan AC, Bravata D, Staudenmayer K, Eisenhut R, Sundaram V, McMahon D, Stave CD, Zehnder JL, Olkin I, McDonald KM, Owens DK, Stafford RS: Comparative effectiveness of recombinant factor VIIa for off-label indications versus usual care. [http://​www.​effectivehealthc​are.​ahrq.​gov/​ehc/​products/​20/​450/​Final%20​Report_​CER21_​Factor7.​pdf]
  17. Sharma M, Ansari MT, Soares-Weiser K, Abou-Setta AM, Ooi TC, Sears M, Yazdi F, Tsertsvadze A, Moher D: Comparative effectiveness of lipid-modifying agents. [http://​www.​ncbi.​nlm.​nih.​gov/​books/​NBK43220/​pdf/​TOC.​pdf]
  18. Ip S, Terasawa T, Balk EM, Chung M, Alsheikh-Ali AA, Garlitski AC, Lau J: Comparative effectiveness of radiofrequency catheter ablation for atrial fibrillation. [http://​www.​ncbi.​nlm.​nih.​gov/​books/​NBK43190/​pdf/​TOC.​pdf]
  19. Samson DJ, Ratko TA, Rothenberg BM, Brown HM, Bonnell CJ, Ziegler KM, Aronson N: Comparative effectiveness and safety of radiotherapy treatments for head and neck cancer. [http://​www.​ncbi.​nlm.​nih.​gov/​books/​NBK45242/​pdf/​TOC.​pdf]
  20. Coleman CI, Baker WL, Kluger J, Reinhart K, Talati R, Quercia R, Mather J, Giovenale S, White CM: Comparative effectiveness of angiotensin converting enzyme inhibitors or angiotensin II receptor blockers added to standard medical therapy for treating stable ischemic heart disease. [http://​www.​ncbi.​nlm.​nih.​gov/​books/​NBK36476/​pdf/​TOC.​pdf]
  21. Seida JC, Schouten JR, Mousavi SS, Tjosvold L, Vandermeer B, Milne A, Bond K, Hartling L, Le Blanc C, Sheps DM: Comparative effectiveness of nonoperative and operative treatments for rotator cuff tears. [http://​www.​ncbi.​nlm.​nih.​gov/​books/​NBK47305/​pdf/​TOC.​pdf]
  22. Wilt TJ, Shamliyan T, Taylor B, MacDonald R, Tacklind J, Rutks I, Koeneman K, Cho CS, Kane RL: Comparative effectiveness of therapies for clinically localized prostate cancer. [http://​www.​ncbi.​nlm.​nih.​gov/​books/​NBK43147/​]
  23. Gaudet L, Singh K, Weeks L, Skidmore B, Tsouros S, Tsertsvadze A, Daniel R, Doucette S, Walker M, Ansari M: Terbutaline pump for the prevention of preterm birth. [http://​www.​ncbi.​nlm.​nih.​gov/​books/​NBK82399/​pdf/​TOC.​pdf]
  24. Uhlig K, Balk EM, Patel K, Ip S, Kitsios GD, Obadan NO, Haynes SM, Stefan M, Rao M, Kong W, Chang L, Gaylor J, Iovin RC: Self-measured blood pressure monitoring: comparative effectiveness. [http://​www.​ncbi.​nlm.​nih.​gov/​books/​NBK84604/​]
  25. Ratko TA, Belinson SE, Brown HM, Noorani HZ, Chopra RD, Marbella A, Samson DJ, Bonnell CJ, Ziegler KM, Aronson N: Hematopoietic stem-cell transplantation in the pediatric population. [http://​www.​ncbi.​nlm.​nih.​gov/​books/​NBK84626/​]
  26. Balk E, Raman G: Comparative effectiveness of management strategies for renal artery stenosis: 2007 update. [http://​www.​ncbi.​nlm.​nih.​gov/​books/​NBK43104/​]
  27. Balk EM, Moorthy D, Obadan NO, Patel K, Ip S, Chung M, Bannuru RR, Kitsios GD, Sen S, Iovin RC, Gaylor JM, D'Ambrosio C, Lau J: Diagnosis and treatment of obstructive sleep apnea in adults. [http://​www.​ncbi.​nlm.​nih.​gov/​books/​NBK63560/​]
  28. Andrews J, Yunker A, Reynolds WS, Likis FE, Sathe NA, Jerome RN: Noncyclic chronic pelvic pain therapies for women: comparative effectiveness. [http://​www.​ncbi.​nlm.​nih.​gov/​books/​NBK84586/​]
  29. Butler M, Bliss D, Drekonja D, Filice G, Rector T, MacDonald R, Wilt T: Effectiveness of early diagnosis, prevention, and treatment of clostridium difficile infection. [http://​www.​ncbi.​nlm.​nih.​gov/​books/​NBK83519/​]
  30. Gaynes BN, Lux LJ, Lloyd SW, Hansen RA, Gartlehner G, Keener P, Brode S, Evans TS, Jonas D, Crotty K, Viswanathan M, Lohr KN: Nonpharmacologic interventions for treatment-resistant depression in adults. [http://​www.​ncbi.​nlm.​nih.​gov/​books/​NBK65315/​]
  31. Fink HA, Ishani A, Taylor BC, Greer NL, MacDonald R, Rossini D, Sadiq S, Lankireddy S, Kane RL, Wilt TJ: Chronic kidney disease stages 1–3: screening, monitoring, and treatment. [http://​www.​ncbi.​nlm.​nih.​gov/​books/​NBK84564/​]
  32. Seida JC, Schouten JR, Mousavi SS, Hamm M, Beaith A, Vandermeer B, Dryden DM, Boylan K, Newton AS, Carrey N: First & second generation antipsychotics for children and young adults. [http://​www.​ncbi.​nlm.​nih.​gov/​books/​NBK84643/​]
  33. Charach A, Dashti B, Carson P, Booker L, Lim CG, Lillie E, Yeung E, Ma J, Raina P, Schachar R: Attention Deficit Hyperactivity Disorder (ADHD): effectiveness of treatment in at-risk preschoolers; long-term effectiveness in all ages; and variability in prevalence, diagnosis, and treatment. [http://​www.​ncbi.​nlm.​nih.​gov/​books/​NBK82368/​]
  34. Shekelle PG, Ortiz E, Rhodes S, Morton SC, Eccles MP, Grimshaw JM, Woolf SH: Validity of the agency for healthcare research and quality clinical practice guidelines: how quickly do guidelines become outdated?. JAMA. 2001, 286: 1461-1467. 10.1001/jama.286.12.1461.View ArticlePubMed
  35. Wilt TJ, Shamliyan TA, Taylor BC, MacDonald R, Kane RL: Association between hospital and surgeon radical prostatectomy volume and patient outcomes: a systematic review. J Urol. 2008, 180: 820-828. 10.1016/j.juro.2008.05.010.View ArticlePubMed
  36. Nelson HD, Fu R, Griffin JC, Nygren P, Smith ME, Humphrey L: Systematic review: comparative effectiveness of medications to reduce risk for primary breast cancer. Ann Intern Med. 2009, 151: 703-715. 10.7326/0000605-200911170-00147.View ArticlePubMed
  37. Bruening W, Fontanarosa J, Tipton K, Treadwell JR, Launders J, Schoelles K: Systematic review: comparative effectiveness of core-needle and open surgical biopsy to diagnose breast lesions. Ann Intern Med. 2010, 152: 238-246. 10.7326/0003-4819-152-1-201001050-00190.View ArticlePubMed
  38. Phung OJ, Coleman CI, Baker EL, Scholle JM, Girotto JE, Makanji SS, Chen WT, Talati R, Kluger J, White CM: Recombinant human growth hormone in the treatment of patients with cystic fibrosis. Pediatrics. 2010, 126: e1211-e1226. 10.1542/peds.2010-2007.View ArticlePubMed
  39. Warren Z, McPheeters ML, Sathe N, Foss-Feig JH, Glasser A, Veenstra-Vanderweele J: A systematic review of early intensive intervention for autism spectrum disorders. Pediatrics. 2011, 127: e1303-e1311. 10.1542/peds.2011-0426.View ArticlePubMed
  40. Abou-Setta AM, Beaupre LA, Rashiq S, Dryden DM, Hamm MP, Sadowski CA, Menon MR, Majumdar SR, Wilson DM, Karkhaneh M: Comparative effectiveness of pain management interventions for hip fracture: a systematic review. Ann Intern Med. 2011, 155: 234-245. 10.7326/0003-4819-155-4-201108160-00346.View ArticlePubMed
  41. Ip S, D’Ambrosio C, Patel K, Obadan N, Kitsios GD, Chung M, Balk EM: Auto-titrating versus fixed continuous positive airway pressure for the treatment of obstructive sleep apnea: a systematic review with meta-analyses. Syst Rev. 2012, 1: 20-10.1186/2046-4053-1-20.PubMed CentralView ArticlePubMed
  42. Drekonja DM, Butler M, MacDonald R, Bliss D, Filice GA, Rector TS, Wilt TJ: Comparative effectiveness of Clostridium difficile treatments: a systematic review. Ann Intern Med. 2011, 155: 839-847. 10.7326/0003-4819-155-12-201112200-00007.View ArticlePubMed
  43. Yunker A, Sathe NA, Reynolds WS, Likis FE, Andrews J: Systematic review of therapies for noncyclic chronic pelvic pain in women. Obstet Gynecol Surv. 2012, 67: 417-425. 10.1097/OGX.0b013e31825cecb3.View ArticlePubMed
  44. Fink HA, Ishani A, Taylor BC, Greer NL, MacDonald R, Rossini D, Sadiq S, Lankireddy S, Kane RL, Wilt TJ: Screening for, monitoring, and treatment of chronic kidney disease stages 1 to 3: a systematic review for the US Preventive Services Task Force and for an American College of Physicians clinical practice guideline. Ann Intern Med. 2012, 156: 570-581. 10.7326/0003-4819-156-8-201204170-00008.View ArticlePubMed
  45. Seida JC, Schouten JR, Boylan K, Newton AS, Mousavi SS, Beaith A, Vandermeer B, Dryden DM, Carrey N: Antipsychotics for children and young adults: a comparative effectiveness review. Pediatrics. 2012, 129: e771-e784. 10.1542/peds.2011-2158.View ArticlePubMed
  46. Terasawa T, Balk EM, Chung M, Garlitski AC, Alsheikh-Ali AA, Lau J, Ip S: Systematic review: comparative effectiveness of radiofrequency catheter ablation for atrial fibrillation. Ann Intern Med. 2009, 151: 191-202. 10.7326/0003-4819-151-3-200908040-00131.View ArticlePubMed
  47. Sharma M, Ansari MT, Abou-Setta AM, Soares-Weiser K, Ooi TC, Sears M, Yazdi F, Tsertsvadze A, Moher D: Systematic review: comparative effectiveness and harms of combination therapy and monotherapy for dyslipidemia. Ann Intern Med. 2009, 151: 622-630. 10.7326/0003-4819-151-9-200911030-00144.View ArticlePubMed
  48. Baker WL, Coleman CI, Kluger J, Reinhart KM, Talati R, Quercia R, Phung OJ, White CM: Systematic review: comparative effectiveness of angiotensin-converting enzyme inhibitors or angiotensin II-receptor blockers for ischemic heart disease. Ann Intern Med. 2009, 151: 861-871. 10.7326/0000605-200912150-00162.View ArticlePubMed
  49. Yank V, Tuohy CV, Logan AC, Bravata DM, Staudenmayer K, Eisenhut R, Sundaram V, McMahon D, Olkin I, McDonald KM: Systematic review: benefits and harms of in-hospital use of recombinant factor VIIa for off-label indications. Ann Intern Med. 2011, 154: 529-540. 10.7326/0003-4819-154-8-201104190-00004.PubMed CentralView ArticlePubMed
  50. Seida JC, LeBlanc C, Schouten JR, Mousavi SS, Hartling L, Vandermeer B, Tjosvold L, Sheps DM: Systematic review: nonoperative and operative treatments for rotator cuff tears. Ann Intern Med. 2010, 153: 246-255. 10.7326/0003-4819-153-4-201008170-00263.View ArticlePubMed
  51. Gaudet LM, Singh K, Weeks L, Skidmore B, Tsertsvadze A, Ansari MT: Effectiveness of terbutaline pump for the prevention of preterm birth. A systematic review and meta-analysis. PLoS One. 2012, 7: e31679-10.1371/journal.pone.0031679.PubMed CentralView ArticlePubMed
  52. Wilt TJ, Brawer MK, Jones KM, Barry MJ, Aronson WJ, Fox S, Gingrich JR, Wei JT, Gilhooly P, Grob BM: Radical prostatectomy versus observation for localized prostate cancer. N Engl J Med. 2012, 367: 203-213. 10.1056/NEJMoa1113162.PubMed CentralView ArticlePubMed

Copyright

© Ahmadzai et al.; licensee BioMed Central Ltd. 2013

This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://​creativecommons.​org/​licenses/​by/​2.​0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Advertisement