
A systematic review of validity of US survey measures for assessing substance use and substance use disorders

Abstract

Background

The steep rise in substance use and substance use disorder (SUD) shows an urgency to assess its prevalence using valid measures. This systematic review summarizes the validity of measures to assess the prevalence of substance use and SUD in the US estimated in population and sub-population-based surveys.

Methods

A literature search was performed using nine online databases. Studies were included in the review if they were published in English and tested the validity of substance use and SUD measures among US adults at the general or sub-population level. Independent reviews were conducted by the authors to complete data synthesis and assess the risk of bias.

Results

Overall, 46 studies validating substance use/SUD measures were included in this review, of which 63% were conducted in clinical settings and 89% assessed the validity of SUD measures. Among the studies that assessed SUD screening measures, 78% examined a generic SUD measure, and the rest screened for specific disorders. Almost every study used a different survey measure. Sensitivity and specificity tests were conducted for validation in over a third of the studies, and 10 studies used receiver operating characteristic (ROC) curves.

Conclusion

Findings suggest a lack of standardized methods in surveys measuring and reporting the prevalence of substance use/SUD among US adults. They highlight a critical need to develop short measures for assessing SUD that avoid the lengthy, time-consuming data collection that is difficult to incorporate into population-based surveys assessing a multitude of health dimensions.

Systematic review registration

PROSPERO CRD42022298280.


Introduction

Substance use remains a serious adverse health risk in the United States (US). In 2021, forty million Americans aged 12 years or older reported illicit drug use in the past month (Substance Abuse and Mental Health Services Administration, 2022b), and over 106,000 people in the US fatally overdosed (National Institute on Drug Abuse, 2023). This is a dramatic increase of approximately 15% in overdoses within 1 year, signifying critical, life-threatening substance use problems and an associated overdose epidemic throughout the country. Notably, substance use problems that met the criteria for a substance use disorder (SUD) were reported by a sizeable proportion of the US population. More than 46 million people aged 12 years or older met the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) criteria for SUD in the past year, according to the National Survey of Drug Use and Health (NSDUH), with the highest percentage of people with SUD being young adults aged 18–25 (25.6%), followed by adults aged 26 or older (16.1%) [1]. Unfortunately, population-based assessments for SUD are rare beyond the NSDUH, especially at substate levels, although they are imperative to inform appropriate resource allocation and population-based interventions for states responding to the SUD and overdose epidemics.

There are few population-based surveys conducted in the US that assess substance use and/or SUD. NSDUH is a good example of a survey that monitors annual national trends in substance use and mental health issues in the US and provides estimates of the need for substance use prevention and treatment programs [2]. However, it involves lengthy questions and branching logic that are not feasible for use in surveys covering multiple health domains. Another validated tool to assess SUD is the National Addictions Vigilance Intervention and Prevention Program (NAVIPPRO™) Addiction Severity Index-Multimedia Version® (ASI-MV®) [3]. However, results from this measure may not be generalizable because it is only used to evaluate those already seeking SUD treatment. In addition, selection bias is likely because the participants are selected based on convenience sampling among treatment centers [4]. Other measures that have been validated for assessing substance use in the US are Drug Abuse Screening Test (DAST) [5], Alcohol, Smoking, and Substance involvement Screening Test (ASSIST), and tobacco, alcohol, prescription medication, and other substance use (TAPS) [6]. However, these survey measures also require multiple, lengthy questions to estimate the prevalence of SUD.

Validated substance use and SUD measures that are shorter and more versatile are needed to ease the incorporation of these measures into more multidimensional population health surveys to better assess and respond to the current US substance use and overdose epidemics. Much work has been done on validating alcohol and tobacco measures, such as ASSIST and TAPS [6]. We know of no review of validation research conducted on other substance use and/or SUD measures among the US population, although previous studies provide valuable insights into measures assessing the efficacy of substance use measures and interventions [7] and addressing psychometric properties of screening tools among specific settings or populations [8]. Thus, the purpose of this review is to comprehensively summarize published literature investigating the validity of substance use and SUD measures, other than alcohol and tobacco use, in US surveys to advance the use of these validated measures on more population-based surveys.

Methods

Search strategy

This systematic review has followed the Preferred Reporting Items for Systematic reviews and Meta-analyses (PRISMA) guidelines [9] and was registered through PROSPERO (CRD42022298280). Potential eligible studies were identified by using the following nine electronic databases, starting from their inception up to November 22, 2021: PubMed, Scopus, CINAHL, PsycINFO, Academic Search Complete, Web of Science, ProQuest Theses and Dissertation Global, and Google Scholar. Primary keywords and phrases used for searching included “healthcare survey,” “mental health,” “substance use,” and “validity.” Detailed search strategies corresponding to the specific databases are shown in Supplementary Table 1.

The following study inclusion criteria were established a priori for use in this systematic review: (1) utilized existing surveys or questionnaires at the county level or higher (validation may have been done at a sub-population level) or at clinical settings in the US; (2) conducted in the US, to ensure the reviewed measures are applicable to US populations; (3) validity/validation testing conducted for measures of mental health and/or substance use; (4) study sample consisted of adults 18 years of age or older; (5) studies published in English language; and (6) peer-reviewed, published studies, official reports from surveys, and doctoral dissertations. In addition, exclusion criteria were applied to those studies that (1) assessed the validity of measures unrelated to mental health/substance use (e.g., physical activity, chronic disease, infectious disease); (2) assessed the validity of alcohol and/or tobacco measures only; (3) were published as abstract only or did not have full texts available; (4) were protocols, editorials, reviews, or commentary; (5) validated language translation or cultural version of an instrument; and (6) were conducted internationally. To better align with the aims of this review, studies validating only alcohol/tobacco use measures were excluded because they have been widely studied in previous literature [10,11,12,13].

Quality assessment

An adapted risk-of-bias tool was developed for the purpose of this systematic review to assess the validity of substance use and mental health survey instruments. This methodological quality assessment tool was adapted from a previously published tool which evaluated the rigor of validity testing in the Behavioral Risk Factor Surveillance System (BRFSS) literature [14]. The new risk-of-bias tool was used to assess the quality of the (1) methodology and (2) statistical analyses of studies included in the systematic review. The methodological component was scored from 0 to 3 (3 = studies utilizing a physical measurement(s) as a comparator during validity testing, which was considered to be the “gold standard,” 2 = studies using measures other than an actual physical measure, 1 = studies that conducted face validity based on the researcher’s judgment or a collective judgment, 0 = studies that did not report on the measurement used for validity testing). The statistical analysis component was scored from 0 to 2 (2 = using statistical analyses such as sensitivity and specificity, correlation coefficient, or mean difference, 1 = reporting prevalence estimates only, 0 = no information on statistical analysis was reported). The methodological and statistical component scores were then totaled for an overall quality assessment score. Total scores ranged from 0 to 5, with 5 demonstrating the highest quality.
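
The scoring rubric described above can be illustrated with a short sketch. The class and function names here are hypothetical and are not taken from the review; only the score levels and their meanings follow the text.

```python
from enum import IntEnum


class MethodScore(IntEnum):
    """Methodological component (0-3), as described in the text."""
    NOT_REPORTED = 0       # comparator used for validity testing not reported
    FACE_VALIDITY = 1      # face validity by researcher or collective judgment
    NON_PHYSICAL = 2       # comparator other than an actual physical measure
    PHYSICAL_MEASURE = 3   # physical measurement as comparator ("gold standard")


class StatsScore(IntEnum):
    """Statistical analysis component (0-2), as described in the text."""
    NOT_REPORTED = 0       # no information on statistical analysis
    PREVALENCE_ONLY = 1    # prevalence estimates only
    FORMAL_ANALYSIS = 2    # e.g. sensitivity/specificity, correlation, mean difference


def quality_score(method: MethodScore, stats: StatsScore) -> int:
    """Overall quality score: sum of both components, ranging from 0 to 5."""
    return int(method) + int(stats)


if __name__ == "__main__":
    # A study using a urine test comparator and reporting sensitivity/specificity.
    print(quality_score(MethodScore.PHYSICAL_MEASURE, StatsScore.FORMAL_ANALYSIS))
```

Because the total is a plain sum, two studies can reach the same score through different component combinations, so the component scores are worth keeping alongside the total.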

Data synthesis

All identified studies were imported to an EndNote library. After removing duplicates, the initial title and abstract screenings were conducted independently by three reviewers (Y. T., N. W., E. O.) using the pre-established inclusion and exclusion criteria. This was followed by a full-text review conducted independently by three reviewers (Y. T., E. C., R. M.) for the first 10% of the included studies. They then convened to review their selections to ensure agreement and refine criteria. Inter-rater reliability was calculated in STATA [15] using Gwet’s AC to ensure agreement [16]. The remaining 90% of the selected articles were then split between the three reviewers for full-text review. Articles for which a reviewer was unsure about inclusion or exclusion were discussed among the three reviewers and decided by the senior author for final selection.
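
As an illustration of the agreement statistic named above, the following is a minimal sketch of Gwet's AC1 for two raters and nominal categories. The text does not state which variant of Gwet's AC was used or how the three reviewers were handled in STATA, so the two-rater AC1 form and the function name `gwet_ac1` are assumptions made for this example.

```python
from collections import Counter
from typing import Hashable, Sequence


def gwet_ac1(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Gwet's AC1 for two raters: (p_a - p_e) / (1 - p_e)."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must rate the same non-empty set of items")
    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)
    q = len(categories)
    if q < 2:
        return 1.0  # only one category was ever used, so agreement is complete

    # Observed agreement: share of items given the same category by both raters.
    p_a = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: based on each category's average prevalence across raters.
    counts = Counter(rater_a) + Counter(rater_b)
    p_e = sum(
        (counts[c] / (2 * n)) * (1 - counts[c] / (2 * n)) for c in categories
    ) / (q - 1)

    return (p_a - p_e) / (1 - p_e)


if __name__ == "__main__":
    # Hypothetical include/exclude decisions for 10 articles.
    a = ["in", "in"] + ["out"] * 8
    b = ["in", "out"] + ["out"] * 8
    print(round(gwet_ac1(a, b), 4))
```

Unlike Cohen's kappa, AC1 stays stable when one category dominates, which is typical of screening decisions where most articles are excluded.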

A data extraction form was created in Microsoft Excel to facilitate data extraction and synthesis. The form could capture up to 46 variables for each study. These variables were grouped into four main categories: study characteristics (authors, reference, year of publication, and name of journal), measure characteristics (whether the measure was used for disorder screening, the SUD being assessed by the measure, response rate, study duration, items measured, recall period, and recruitment procedure), participant characteristics (overall health status, age, sex, race, income, education), and validation methods (type of validation, statistical analysis, comparison measure, and key results). Additionally, a single article could be considered as multiple studies if it validated measures among multiple study populations. Articles that validated multiple survey measures among the same study population were considered to be one study. We evaluated the different types of validity using pre-established definitions to standardize the understanding of validity among reviewers. Our focus was on examining criterion validity (including concurrent, predictive, and content validity) and construct validity (encompassing convergent, discriminant, and factorial validity). Specifically, criterion validity was examined through comparisons with “gold standard” measures where available or through the use of clinically established diagnostic criteria and outcomes. Face validity was determined if the article could demonstrate the extent to which a substance use measure captured what it was intended to measure. Lastly, construct validity was assessed through statistical analyses examining the correlation between survey measures and related constructs, thus ensuring that measures accurately reflect the theoretical components of substance use and SUDs. Articles that did not specify the validation methods were discussed among the three reviewers and decided by the senior author for consensus if discrepancies existed.

All data were coded independently by two reviewers (Y. T., E. C.). After extracting data from the first 10 articles, the two reviewers met to discuss any discrepancies among coding strategies. Disagreements were brought to the senior author (R. B.) for conflict resolution. Although the inclusion and exclusion criteria were determined a priori, the completion of data extraction demonstrated unique differences present between mental health and substance use studies that evaluated the psychometric properties of their respective measures. As the study developed, the results gathered from the data synthesis for substance use were substantially different from mental health assessment, and the authors determined that these separate domains would be better discussed in two separate manuscripts. Thus, the results presented in this study are from studies that validated substance use measures identified in our search.

Results

Study characteristics

A total of 6950 results were initially obtained from the search. An additional 153 articles were identified by reviewing BRFSS reference lists [17]. A flow diagram documenting the search process and reasons for excluding studies is shown in Fig. 1. Of the 7103 articles, 2339 were duplicates and were excluded before the abstract/title review. After reviewing 4764 abstracts/titles, 3744 articles were excluded. Of the remaining 1020 articles, a full-text review of the first 10% demonstrated almost perfect inter-rater agreement between reviewers on which articles met the inclusion criteria (Gwet’s AC: 0.8517 (0.8000–1.0000)). Following review of the full article text, 899 articles were removed. The key reasons for excluding articles were that they (1) did not conduct validity testing (n = 874), (2) were conducted outside the United States (n = 1105), or (3) were focused on topics other than substance use (n = 878). For this review, a total of 46 articles met the inclusion criteria (Fig. 1). The characteristics of those 46 selected studies are presented in Table 1.
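
The screening counts reported above can be checked with simple arithmetic. The sketch below uses only the numbers given in the text; the function and key names are hypothetical.

```python
def screening_flow(
    database_hits: int,
    reference_list_hits: int,
    duplicates: int,
    excluded_title_abstract: int,
    excluded_full_text: int,
) -> dict[str, int]:
    """Recompute each stage of the study-selection flow from the reported counts."""
    identified = database_hits + reference_list_hits
    screened = identified - duplicates
    full_text_reviewed = screened - excluded_title_abstract
    remaining = full_text_reviewed - excluded_full_text
    return {
        "identified": identified,
        "screened": screened,
        "full_text_reviewed": full_text_reviewed,
        "remaining_after_full_text": remaining,
    }


if __name__ == "__main__":
    flow = screening_flow(6950, 153, 2339, 3744, 899)
    for stage, count in flow.items():
        print(f"{stage}: {count}")
```

The first three stages reproduce the 7103, 4764, and 1020 articles reported in the text. Subtracting the 899 full-text exclusions leaves 121 articles, more than the 46 included here; the text does not itemize this difference, although the Data synthesis section notes that the mental health results from the same search are reported separately.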

Fig. 1

Flow chart for the selection of studies*. *Studies could have been excluded for multiple reasons

Table 1 Characteristics of included studies of validation testing

The included studies were published between 1979 and 2021, with a wide variation in demographic characteristics. Of the 46 studies, seven had over 80% male participants (Han et al., 2017; Peters et al., 2000; Tiet et al., 2016; Tiet et al., 2019; Tiet et al., 2015; Tiet et al., 2017; Zanis et al., 1994). Among these studies, two recruited only male participants (Peters et al., 2000; Zanis et al., 1994). Additionally, one study recruited only female participants [22]. Racial and ethnic differences also existed among these study samples. Six studies had a study sample of primarily (70% or more) White participants [41, 58], and three studies had a study sample of 70% or more Black/African American (AA) participants [32, 42, 58]. Furthermore, 12 studies only recruited White and Black/AA participants [6, 18, 21, 24, 25, 31, 32, 36, 39, 41, 58, 59]. Seven studies (15.2%) did not report information on race/ethnicity characteristics [19, 22, 27, 34, 55, 57].

All 46 studies included in this review reported the final sample size, with a mean of 1427 (median = 449) participants and an overall range of 23–10,167 participants. Only 13 studies reported a response rate, and the response rates ranged between 13.4% [18] and 100% [32]. Twenty-six studies reported the survey duration, which ranged from 1 month [32] to 120 months [20], with a mean of 28.48 months (median 13 months). Moreover, studies reported the mean age of the participants as < 30 years (n = 4), between 30 and 39 years (n = 16), and ≥ 40 years (n = 18). Another eight studies reported age groups or median age of the study population. Additionally, a majority (n = 37) of the studies were conducted in non-population-based clinical settings (e.g., inpatient, outpatient).

Participant recruitment strategy

The participant recruitment strategies of the studies included in this review are shown in Table 2. Of the 46 studies, only 4% (n = 2) examined SUD in the general population [20, 28]; the rest (n = 44) were conducted in clinical or other population subgroups. In the first population-based study, 6664 adult Medicaid enrollees who took part in the Florida Health Services Survey at least once between 1998 and 2008 were recruited from 1 of 7 Florida regions [20]. Researchers assessed the internal psychometric properties of the Simple Screening Instrument for Substance Abuse (SSI-SA) but did not compare survey responses with SUD diagnoses in Medicaid clinical records. In the second population-based study, participants were selected from the National Epidemiologic Survey on Alcohol and Related Conditions-III (NESARC-III) sample, which included noninstitutionalized US adult residents (aged 18 years or older) [28]. The authors then selected 777 respondents for the procedural validity study and used a test–retest design to compare concordance of respondents’ answers to the NESARC-III survey questions with a semi-structured interview, the Psychiatric Research Interview for Substance and Mental Disorders, DSM-5 version (PRISM-5), administered by a clinician.

Table 2 Participant recruitment strategies

Of the remaining 44 studies not in the general population, over three quarters (n = 35) were conducted in the clinical setting, with the majority (n = 19) in the inpatient setting [6, 19, 21, 22, 24, 26, 32, 35,36,37,38,39, 42, 46, 48,49,50, 59]. Eleven studies were conducted in the outpatient clinical setting [5, 23, 31, 40, 43, 47, 51,52,53,54,55], 5 studies were conducted in both the inpatient and outpatient settings (Harris et al., 2015; Hser et al., 1999; Kellogg et al., 2002; Kupetz et al., 1979; Salyers et al., 2000), and 1 study was conducted in a Veterans’ Administration shelter [58]. The remaining studies (n = 8) were conducted outside the clinical setting. For example, participants were recruited from an alcohol and drug program [18], prison substance abuse treatment programs [25], Holiday Transfer Facility [41], and a novel jail-release program [56]. Lastly, four studies consisted of sub-population samples within the National Epidemiologic Survey on Alcohol and Related Conditions-III (NESARC-III) [29] and universities using student participants [45, 57].

Quality of studies

Risk of bias was assessed based upon the methodology used for instrument comparison and the statistical analysis conducted. Although several studies adopted recruitment strategies that limited their study population to specific groups (for example, only recruiting male or white populations), the risk-of-bias assessment employed by the current study did not account for recruitment. As a result, most of the included studies (n = 41) had a risk-of-bias score of 4 or higher (Table 1). Two studies had a score of 3 [30, 34], one studies had a score of 2 [29, 48], and two studies had a score of only 1 [27, 44]. Among those studies with low-quality assessment scores, four studies lacked statistical comparisons and reported prevalence estimates only [27, 30, 34, 44]. There were three studies that did not report on validation methodology [27, 44, 48].

Survey measure

Among the articles included in this review, 89% (n = 41) used measures specifically designed for screening SUDs. For example, seven studies tested the validity of the measure’s ability to screen for a specific SUD, including marijuana use [18, 21, 29, 40, 45], cocaine use [30], and opioid use [56]. Five studies validated measures for both substance use and mental health [23, 25, 40, 44, 54], of which one study used a measure for post-traumatic stress disorder (PTSD) screening [54]. The rest of the included studies did not specify a specific SUD for screening purposes but used a generic term for defining SUD. All measures and their frequency of use in the included studies are depicted in Fig. 2.

Fig. 2

Frequency of survey measures used in included studies. Abbreviations in order: Texas Christian University Drug Screen (TCUDS), Substance Use and Abuse Survey (SUAS), the Simple Screening Instrument for Substance Abuse (SSI-SA), the Simple Screening Instrument (SSI), screen of drug use (SoDU), single-item screening questions (SISQs), Substance Dependence Severity Scale (SDSS), Substance Abuse Subtle Screening Inventory-2 (SASSI-2), Rapid Opioid Dependence Screen (RODS), Personality Assessment Inventory Drug Problem Scale (PAI DRG), National Epidemiologic Survey on Alcohol and Related Conditions (NESARC), the Marijuana Screening Inventory (MSI-X), the Longitudinal Substance Use Recall Instrument Recall for 12 Weeks instrument (LSUR-12), the Longitudinal Substance Use Recall Instrument (LSUR), Lifetime Severity Index for Cocaine Use Disorder (LSI-Cocaine), Healthcare Effectiveness Data and Information Set (HEDIS), the Drug Use Screening Inventory (DUSI), Dartmouth Assessment of Lifestyle Instrument (DALI), Cut down, Annoyed, Guilty, and Eye-Opener Substance Abuse Screening Tool (CAGE), the Alcohol Use Disorder and Associated Disabilities Interview Schedule (AUDADIS), Alcohol, Smoking, and Substance Involvement Screening Test-Drug (ASSIST-Drug), Parents, Partners, Past, and Pregnancy Plus (4P’s Plus), tobacco, alcohol, prescription medication, and other substance use (TAPS tool), Substance Use Brief Screen (SUBS), single question used from short inventory of problems-drug use (SIP-DU), Drug Abuse Screening Test (DAST), the Chemical Use, Abuse, and Dependence (CUAD), Addiction Severity Index (ASI)

The majority of studies validated one single measure, of which five studies validated the Addiction Severity Index (ASI) [32, 35, 41, 58, 59], and one study validated drug use subscales of ASI [26, 37]. Five studies validated multiple survey measures:

(1) Duncan et al. validated two survey measures: (a) CJDAT Co-Occurring Disorders Screening Instruments for any Mental Disorder (CODSI-MD) and (b) CJDAT Co-Occurring Disorders Screening Instruments for Severe Mental Disorder (CODSI-SMD) [25].

(2) Ramsay et al. also validated two different survey measures: (a) The Lifetime Substance Use Recall Instrument (LSUR) and (b) the Longitudinal Substance Use Recall for 12 Weeks instrument (LSUR-12) [42].

(3) Peters et al. and Tiet et al. also validated two measures: (a) The Substance Use Brief Screen (SUBS) and (b) the DAST [5, 41]. O’Hare et al. validated four different survey measures: (a) South Shore Problem Inventory-revised (SSPI), (b) self-rated substance abuse (SRSA), (c) quantity-frequency index for alcohol consumption (QFI), and (d) one-item index measuring the frequency of marijuana use [40].

(4) Peters et al. validated five different survey measurements: (a) ASI-drug use subscales, (b) DAST, (c) Substance Abuse Subtle Screening Inventory-2 (SASSI-2), (d) SSI, and (e) Texas Christian University Drug Screen (TCUDS) [41].

(5) Tiet et al. conducted validations of seven survey measures: (a) PTSD Checklist–Civilian version (PCL-C), (b) PTSD Checklist 4 Item (PCL-Bliese-4), (c) PTSD Checklist 2 Item (PCL-LS-2), (d) PTSD Checklist 3 Item (PCL-LS-3), (e) PTSD Checklist 4 Item (PCL-LS-4), (f) PTSD Checklist 6 Item (PCL-LS-6), and (g) Primary Care–PTSD screen (PC-PTSD) [54].

Two studies conducted survey measure validation in different study populations. One study conducted a preliminary exploration of the psychometric properties of the Substance Use Risk Profile Scale (SURPS) in 3 different populations: 195 undergraduate drinkers, 390 undergraduate students from Stony Brook University, and 4234 high school students in Canada [57]. In the second study, data were collected from two separate adult clinical samples — seriously mentally ill inpatients and patients presenting for evaluation at a chemical dependence program — to describe the rationale and test validity and reliability of the Chemical Use, Abuse, and Dependence Scale (CUAD) [36].

Comparison measures for validation

Several different types of measures were used as comparison for the purpose of validation. Higher quality comparison measures included items such as medical records, diagnoses, medical test results, or other SUD severity scales. A total of 10 studies conducted validity testing using at least one of these higher-quality comparison measures. Of these, three studies conducted criterion validity testing by comparing the following: (1) positive and negative 4P’s Plus screens with positive and negative clinical assessment [22], (2) the Alcohol Use Disorder and Associated Disabilities Interview Schedule (AUDADIS) with psychiatrist diagnosis [28], and (3) Dartmouth Assessment of Lifestyle Instrument (DALI) with clinician diagnosis [43]. The remaining two studies conducted validity testing by comparing the following: (1) Substance Use and Abuse Survey (SUAS) with medical charts [34] and (2) CUDIT-R with DSM-5 diagnostic severity levels [45]. Another study compared the CUAD-derived DSM-III-R substance use disorder diagnoses with the chart diagnosis determined by the unit psychiatrists for validation [36].

Furthermore, two studies validated their measures by comparing with diagnostic standards: (1) the Cut down, Annoyed, Guilty, and Eye-Opener Substance Abuse Screening Tool (CAGE) was compared with SCID-generated drug use disorder diagnoses as the standard [24] and (2) the Cannabis Use Disorders Identification Test Revised (CUDIT-R) was compared with ICD-10 dependence diagnosis [39]. Three studies conducted validity testing by comparing with laboratory test results, including urine tests [31, 58, 59] and saliva drug testing [38].

Four studies conducted validity testing by comparing with other severity scales: (1) criterion validity testing by comparing the Marijuana Screening Inventory (MSI-X) with three different severity rating scales and selected variables [18], (2) construct validity testing by comparing Personality Assessment Inventory Drug Problem Scale (PAI DRG) with ASI drug composite scores and severity ratings [33], (3) construct validity testing by comparing ASI with interviewer severity ratings and composite scores [35], and (4) concurrent validity testing of the ASI drug scale by examining 25 participants who had drug metabolites detected in a urine sample obtained during the first interview and comparing this result with their self-reported use of drugs during the 30-day assessment period in the ASI interview [58].

Types of validity assessed and statistical analyses conducted

Two-thirds of the studies (n = 30) included in this review examined criterion validity: concurrent validity (n = 22), predictive validity (n = 5), specification validity (n = 1), and unspecified (n = 2). Over half of the studies (n = 24) examined construct validity, specifically convergent validity (n = 10), discriminant validity (n = 6), hypothesis testing validity (n = 1), predictive validity (n = 2), and factorial validity (n = 1). Eight articles did not report specific types of construct validity. While three studies examined content validity, none reported a specific type of content validity [33, 37, 48]. Additionally, 11 studies tested multiple types of validity [18, 19, 29, 33, 36, 39, 40, 45, 55, 57, 58]. Ten studies investigated both construct and criterion validity of a single survey measure [18, 19, 29, 36, 39, 45, 55, 57, 58], and one study examined both construct and content validity of a single survey measure [33].

Studies conducted the following statistical analyses for testing validity of survey measures: (1) sensitivity and specificity (n = 16), (2) receiver operating characteristic (ROC) curve (n = 10), (3) correlation coefficient (n = 9), (4) Pearson correlation coefficient (n = 8), and (5) positive predictive value (PPV) (n = 8). Sensitivity and specificity were the most common statistical methods for validation among studies examining construct validity and criterion validity.
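
For readers less familiar with these statistics, the sketch below shows how sensitivity, specificity, and PPV follow from a 2 × 2 table of screening results against a reference standard. The counts are hypothetical and are not drawn from any included study; the class and function names are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreenResults:
    """2 x 2 counts of a screening measure against a reference standard."""
    true_pos: int   # screen positive, reference positive
    false_pos: int  # screen positive, reference negative
    false_neg: int  # screen negative, reference positive
    true_neg: int   # screen negative, reference negative


def _ratio(numerator: int, denominator: int) -> float:
    if denominator == 0:
        raise ValueError("undefined: no cases in the denominator")
    return numerator / denominator


def sensitivity(r: ScreenResults) -> float:
    """Share of reference-positive cases that the screen detects."""
    return _ratio(r.true_pos, r.true_pos + r.false_neg)


def specificity(r: ScreenResults) -> float:
    """Share of reference-negative cases that the screen correctly clears."""
    return _ratio(r.true_neg, r.true_neg + r.false_pos)


def positive_predictive_value(r: ScreenResults) -> float:
    """Share of screen-positive cases that are reference positive."""
    return _ratio(r.true_pos, r.true_pos + r.false_pos)


if __name__ == "__main__":
    # Hypothetical sample of 200 respondents, 50 with a reference diagnosis.
    results = ScreenResults(true_pos=40, false_pos=20, false_neg=10, true_neg=130)
    print(f"sensitivity = {sensitivity(results):.3f}")
    print(f"specificity = {specificity(results):.3f}")
    print(f"PPV         = {positive_predictive_value(results):.3f}")
```

Sensitivity and specificity are properties of the measure at a given cut-off, whereas PPV also depends on how common the disorder is in the sample, which matters when a measure validated in a clinical setting is moved to a general population survey.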

Most studies showed strong evidence of validity or had strong significant associations with other measures for comparison. Studies that compared substance use measures with physician diagnoses or medical records showed strong overall validity. For example, Rosenberg et al. conducted ROC analysis for criterion validity and concluded that DALI functioned significantly better than traditional instruments for substance use disorders among psychiatric patients [43]. Compared with DAST-10, the Short Inventory of Problems-Drug Use (SIP-DU) showed 100% sensitivity and 73.5% specificity for the detection of a drug use disorder. It was less sensitive at detecting self-reported current drug use (92.9%) and drug use detected by oral fluid testing or self-report (84.7%) [49]. However, some studies demonstrated a lack of validity for certain measures. For example:

(1) Compared to urine screens, the ASI’s questions about drug use in the past 30 days had poor concurrent validity, which suggested that the ASI has limited validity [59].

(2) Correlations were not statistically significant among South Shore Problem Inventory-revised (SSPI) subscales and three other substance abuse indices, such as self-rated substance abuse (SRSA), quantity-frequency index (QFI) for alcohol consumption, and one-item index measuring the frequency of marijuana use [40].

(3) Compared with oral fluid test results, using SIP-DU at a cut-off score (to be considered a positive test for alcohol screening) showed lower sensitivity and higher specificity for detecting current drug use [48].
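
The ROC analyses and cut-off scores mentioned in this section can be illustrated with a minimal sketch. It computes the area under the ROC curve as the probability that a randomly chosen reference-positive respondent scores higher than a randomly chosen reference-negative respondent, and it tabulates sensitivity and specificity at each candidate cut-off. The scores and labels are hypothetical, and the function names are assumptions for this example rather than anything used by the included studies.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[bool]) -> float:
    """AUC via pairwise comparison of positives and negatives (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


def cutoff_table(
    scores: Sequence[float], labels: Sequence[bool]
) -> list[tuple[float, float, float]]:
    """(cut-off, sensitivity, specificity), treating score >= cut-off as positive."""
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative case")
    rows = []
    for cut in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y and s >= cut)
        tn = sum(1 for s, y in zip(scores, labels) if not y and s < cut)
        rows.append((cut, tp / n_pos, tn / n_neg))
    return rows


if __name__ == "__main__":
    # Hypothetical screening scores and reference-standard diagnoses.
    scores = [0, 1, 2, 3, 4, 5]
    labels = [False, False, True, False, True, True]
    print(f"AUC = {roc_auc(scores, labels):.3f}")
    for cut, sens, spec in cutoff_table(scores, labels):
        print(f"cut-off >= {cut}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

Raising the cut-off trades sensitivity for specificity, which is the pattern described in item (3) above.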

Discussion

This systematic review found 46 studies conducted in the US between 1979 and 2021 that tested the validity of substance use/SUD measures. Two studies were population based [20, 28], while the rest were conducted in subpopulations or in clinical settings. Criterion validity and construct validity were the commonly used validation methods, and sensitivity and specificity were the most common statistical analyses for validation. More importantly, this review found that a myriad of survey measures was used to measure substance use/SUD. In addition, diverse methodologies were applied to measure validity, which makes comparability difficult. In general, most studies showed evidence of strong validity.

For example, among those articles included in this review, 46 studies tested the psychometric properties of 43 different substance use screening measures. Of them, 16 tested the validity of psychometric properties by comparing with other self-reported survey measures, and one study conducted criterion validity by comparing different racial or ethnic groups of offenders [25]. Fourteen studies conducted concurrent validity by comparing measures with an external independent source or “gold standard,” such as physician/clinician diagnosis, medical records or assessment, severity scales, or urine/saliva drug testing. Frequently, researchers rely on self-reported information on substance use because it saves time and cost and allows the required information to be collected from a larger sample than would be feasible when comparing with a gold standard, such as a biological test or medical record.

The measures used in these studies varied greatly. The ASI, which was used most frequently in this review, was used in only five studies. Additionally, three articles specifically conducted validity testing for marijuana use. However, each of those studies used many diverse measures, such as MSI-X [18], the CUDIT-R [39, 45], screen of drug use (SoDU) [60], NESARC [60], a two-item brief screen with no instrument name reported [60], and one-item index measuring the frequency of marijuana use [40]. Multiple measures for one specific substance use might increase the likelihood of conflicting results, which can make it difficult to interpret and compare results across different studies. Thus, there is a need to adopt a standardized measure to ensure the results obtained are reliable and to be able to draw general conclusions.

In addition to the diverse measures, the validation methods employed in the articles also varied greatly. Although criterion and construct validity were the most commonly assessed types of validity, the specific type of criterion or construct validity varied among studies. For example, concurrent, predictive, and specification validity were reported as three different types of criterion validity. Some studies employed multiple validation methods for a single survey measure, while others used only one. Moreover, different types of validity may serve different objectives, which could explain the differences in the statistical analyses used for validation. This review also found that the statistical analyses used to test the validity of survey measures were diverse, with sensitivity and specificity being the most frequent. Other statistical analyses, such as the ROC curve and correlation coefficients, were also used to validate the survey measures.
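As an illustration of the statistics named above, the following minimal Python sketch computes sensitivity, specificity, predictive values, and the area under the ROC curve for a screening score compared against a binary reference standard. It is not taken from any of the reviewed studies: the function names, the cutoff of 3, and the ten scores and reference diagnoses are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def confusion_counts(screen, reference):
    """Count agreement between a binary screen and a binary reference standard."""
    if len(screen) != len(reference):
        raise ValueError("screen and reference must have the same length")
    pairs = list(zip(screen, reference))
    tp = sum(1 for s, r in pairs if s and r)          # screen +, reference +
    fp = sum(1 for s, r in pairs if s and not r)      # screen +, reference -
    fn = sum(1 for s, r in pairs if not s and r)      # screen -, reference +
    tn = sum(1 for s, r in pairs if not s and not r)  # screen -, reference -
    return tp, fp, fn, tn


def validity_metrics(tp, fp, fn, tn):
    """Standard 2x2 accuracy measures (assumes no denominator is zero)."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def auc_from_scores(scores, reference):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation:
    the share of positive/negative pairs in which the positive case scores
    higher, counting ties as one half."""
    pos = [s for s, r in zip(scores, reference) if r]
    neg = [s for s, r in zip(scores, reference) if not r]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical data: screening scores and reference-standard diagnoses (1 = SUD).
example_scores = [0, 1, 3, 4, 2, 5, 0, 1, 4, 2]
example_reference = [0, 0, 1, 1, 0, 1, 0, 0, 1, 1]
example_screen = [score >= 3 for score in example_scores]  # hypothetical cutoff

counts = confusion_counts(example_screen, example_reference)
metrics = validity_metrics(*counts)
auc = auc_from_scores(example_scores, example_reference)
print(counts, metrics, auc)
```

Sensitivity and specificity describe the measure at one chosen cutoff, whereas the area under the ROC curve summarizes performance across all cutoffs, which is one reason studies reporting different statistics are hard to compare directly.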

Differences were also observed in the settings and demographic characteristics of participants. First, the validation of the substance use and SUD measures was primarily conducted in either inpatient or outpatient clinical settings, and only two studies were population based. Second, some studies had small sample sizes, which could substantially reduce the statistical power for finding differences between study groups. Moreover, some studies were limited to certain age or race/ethnicity groups, which could adversely affect the generalizability of findings. For example, several studies were restricted to White or Black/AA participants [6, 18, 21, 24, 25, 31, 32, 36, 39, 41, 58, 59]. In addition, information on race/ethnicity was missing from a few studies [19, 22, 27, 34, 55, 57]. Those studies might reflect racial disparities in SUD, as well as in treatment for SUD. Although SUD is prevalent among all racial groups, the burden of disease is disproportionate among Black people, and treatment of SUD is less available for Black people [61]. Three studies were limited to either males or females only [22, 41, 58]. These studies provide valuable validation in the respective populations and may prove useful in other populations. However, further validation is needed in diverse populations for these measures to be generalizable.

SUD often co-occurs with many other physical and mental health conditions. Previous studies have shown a high co-occurrence of, and an increased risk of, mental health disorders among individuals with SUD, which can be observed in clinical samples [62, 63]. In this review, only five studies validated measures for both substance use and mental health disorders. Results from studies assessing substance use and mental health simultaneously can help inform integrated treatment interventions by connecting individuals with additional service providers who can provide specialized services to treat the physical and emotional elements of mental health and SUD [64]. Additional advantages of assessing co-occurring substance use and mental health include decreased hospitalization, fewer arrests, and increased housing stability [64]. More importantly, assessing co-occurring substance use and mental health disorders in population research can identify barriers to and disparities in treatment access, including those related to race/ethnicity [65], and low treatment utilization among individuals with only substance use or only mental health disorders [66, 67].

Although this review adhered to the PRISMA guidelines, it is not without limitations. It was limited to studies conducted in the US, and studies in other countries were not included. Research shows that contextual factors, such as the burden of substance use disorders, cultural norms, legal frameworks, healthcare systems, and societal attitudes towards substance use, vary widely across countries and can influence the reliability and applicability of measures developed and validated in one context when applied to another [1, 2, 3]. Our focus on US-based studies aims to ensure that the measures reviewed are relevant and applicable to the US population, providing a more accurate and context-specific assessment of substance use and SUDs.

Although a rigorous search strategy was implemented, our search was limited to library databases. As such, key clinical surveys that are used in hospitals or other specialty clinical settings but were not published in peer-reviewed journals may be missing from our review. Additionally, our objective was to summarize the validity of measures to assess the prevalence of substance use and SUD in the US estimated in population and sub-population-based surveys. Therefore, we did not specifically review the best clinical practices for survey administration in the clinical setting. Findings highlight the need to evaluate substance use surveys in a population-based setting to identify a valid survey for use across population-based surveys. The consistent use of one survey may allow more accurate comparisons across populations. However, the main limitation of this review is that the included articles are missing information about demographic characteristics, such as the distribution of race and ethnicity groups in the study population, and only 5 studies in this review reported the education level of the participants [33, 37, 38, 39, 49, 58]. The accuracy of self-reported data about substance use varies with education and socioeconomic status [68]. The majority of studies included in this review did not report the response rate or the survey duration. Lastly, our analyses relied only on peer-reviewed studies, and our review did not include internal studies that may have been conducted in large surveys, such as NSDUH.

This study has several strengths. To our knowledge, it is the first systematic review to summarize the validity of substance use/SUD measures used in questionnaires or instruments among US adults. This review covered 43 years of literature across nine databases. In addition, it included “gray literature” such as theses and Google Scholar, which can make significant contributions to systematic reviews by minimizing publication bias, enabling a more impartial assessment of the evidence, and publicizing null or negative findings [69]. Another strength of the study is that the methodologic quality of validation studies was assessed with an adapted risk-of-bias tool created especially for this assessment. Lastly, while previous reviews have explored the instruments used to assess substance use and the identification of disorders [7, 8], this review uniquely concentrates on a comprehensive evaluation of the psychometric properties of measures assessing a broader spectrum of substances. This review aimed to distinguish itself from previous research by highlighting the diversity and specificity of instruments in current use, their applicability in various population and sub-population surveys, and the critical need for standardized, short, and versatile measures.

The findings of this review have several key implications. The study demonstrates that survey questions can be used to assess the prevalence of SUD in specific populations. However, most studies used different measures, suggesting that there was no consensus on the best measure for assessing the prevalence of substance use and SUD. This lack of common measures illustrates the difficulty of assessing SUD in short surveys, especially for specific substances. Similar to the global measure that is used to indicate nonspecific psychological distress [70], a global measure is needed for measuring SUD in population-based studies. Only 5 out of 46 studies were conducted in population or sub-population-based settings. Therefore, more research needs to be conducted to validate these measures in population-based settings to confirm their sensitivity and specificity. Additionally, more studies need to validate measures using a “gold standard,” such as an outside reliable measure, because comparing with self-reported substance use can result in misclassification bias. Therefore, this systematic review illustrates a critical need to develop short measures for assessing SUD that do not require lengthy, time-consuming data collection that would be difficult to incorporate into population-based surveys assessing a multitude of health dimensions.
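To make concrete why sensitivity and specificity matter for prevalence estimates, the short Python sketch below shows how an imperfect measure distorts the apparent prevalence and how the standard Rogan-Gladen adjustment recovers the underlying value when sensitivity and specificity are known. This adjustment was not applied in the review, and the prevalence, sensitivity, and specificity values are hypothetical.

```python
def apparent_prevalence(true_prev, sensitivity, specificity):
    """Share of respondents who screen positive: true positives plus false positives."""
    return sensitivity * true_prev + (1 - specificity) * (1 - true_prev)


def corrected_prevalence(apparent, sensitivity, specificity):
    """Rogan-Gladen estimate of true prevalence from the apparent prevalence."""
    youden = sensitivity + specificity - 1
    if youden <= 0:
        raise ValueError("the measure must perform better than chance")
    estimate = (apparent + specificity - 1) / youden
    return min(1.0, max(0.0, estimate))  # keep the estimate within [0, 1]


# Hypothetical values: 10% true prevalence, 80% sensitivity, 90% specificity.
observed = apparent_prevalence(0.10, 0.80, 0.90)
recovered = corrected_prevalence(observed, 0.80, 0.90)
print(round(observed, 3), round(recovered, 3))
```

With these hypothetical inputs, 17% of respondents would screen positive although only 10% have the condition, which illustrates how a measure validated only in clinical samples could misstate prevalence in a general population survey.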

Conclusion

This systematic review summarized the validity of measures used to assess the prevalence of substance use and SUD in the US estimated in general population surveys and other population-based settings. Among the 46 studies included, this review demonstrated that a myriad of survey measures were used to assess substance use and SUD, and diverse methodologies were used to assess validity. This information suggests a lack of standardized, comparable survey measures for assessing the prevalence of substance use and SUD among US adults. This inconsistency makes it difficult to recommend the best measures to use in US surveys and highlights the need to develop better summary measures. Very few studies in this review were conducted in general population settings, which suggests that more research is needed to validate substance use measures in such settings. Although SUD is prevalent among all racial/ethnic, age, and gender/sex groups in the US, and studies in this review provided valuable validation in the respective populations, further validation is needed in diverse populations. Thus, future validation research needs to be conducted in population-based settings to adopt a standardized measure for substance use and SUD that can inform interventions aimed at detecting and managing problems associated with substance use and SUD and at preventing avoidable premature US deaths.

Availability of data and materials

Table 1 contains the extracted data, and the supplementary file contains the search strategy.

Abbreviations

SUD: Substance use disorder
US: United States
DSM: Diagnostic and Statistical Manual of Mental Disorders
NSDUH: National Survey of Drug Use and Health
NAVIPPRO™: National Addictions Vigilance Intervention and Prevention Program
ASI-MV®: Addiction Severity Index-Multimedia Version®
DAST: Drug Abuse Screening Test
ASSIST: Alcohol, Smoking, and Substance Involvement Screening Test
TAPS: Tobacco, alcohol, prescription medication, and other substance use
PRISMA: Preferred Reporting Items for Systematic reviews and Meta-Analyses
BRFSS: Behavioral Risk Factor Surveillance System
AA: African American
SSI-SA: Simple Screening Instrument for Substance Abuse
NESARC-III: National Epidemiologic Survey on Alcohol and Related Conditions-III
PRISM-5: Psychiatric Research Interview for Substance and Mental Disorders, DSM-5 version
PTSD: Post-traumatic stress disorder
ASI: Addiction Severity Index
CODSI-MD CJDAT: Co-Occurring Disorders Screening Instruments for any Mental Disorder
CODSI-SMD CJDAT: Co-Occurring Disorders Screening Instruments for Severe Mental Disorder
LSUR: The Lifetime Substance Use Recall instrument
LSUR-12: The Longitudinal Substance Use Recall for 12 Weeks instrument
SSPI: South Shore Problem Inventory-revised
SRSA: Self-rated substance abuse
QFI: Quantity-frequency index for alcohol consumption
SASSI-2: Substance Abuse Subtle Screening Inventory-2
TCUDS: Texas Christian University Drug Screen
PCL-C: PTSD Checklist–Civilian version
PCL-Bliese-4: PTSD Checklist 4 Item
PCL-LS-2: PTSD Checklist 2 Item
PCL-LS-3: PTSD Checklist 3 Item
PCL-LS-4: PTSD Checklist 4 Item
PCL-LS-6: PTSD Checklist 6 Item
PC-PTSD: Primary Care-PTSD screen
SURPS: Substance Use Risk Profile Scale
CUAD: Chemical Use, Abuse, and Dependence Scale
AUDADIS: Alcohol Use Disorder and Associated Disabilities Interview Schedule
DALI: Dartmouth Assessment of Lifestyle Instrument
SUAS: Substance Use and Abuse Survey
CAGE: Cut down, Annoyed, Guilty, and Eye-Opener Substance Abuse Screening Tool
CUDIT-R: Cannabis Use Disorders Identification Test-Revised
MSI-X: Marijuana Screening Inventory
PAI DRG: Personality Assessment Inventory Drug Problem Scale
ROC: Receiver operating characteristics
PPV: Positive predictive value
SIPDU: Short Inventory of Problems-Drug Use
SoDU: Screen of drug use

References

  1. Substance Abuse and Mental Health Services Administration. Key substance use and mental health indicators in the United States: results from the 2021 National Survey on Drug Use and Health. Rockville, MD; 2022. Report No.: HHS Publication No. PEP22–07–01–005.

  2. Substance Abuse and Mental Health Services Administration. Key substance use and mental health indicators in the United States: results from the 2021 National Survey on Drug Use and Health. Center for Behavioral Health Statistics and Quality, Substance Abuse and Mental Health Services Administration; 2022.

  3. Butler SF, Budman SH, Licari A, Cassidy TA, Lioy K, Dickinson J, et al. National Addictions Vigilance Intervention and Prevention Program (NAVIPPRO): a real-time, product-specific, public health surveillance system for monitoring prescription drug abuse. Pharmacoepidemiol Drug Saf. 2008;17(12):1142–54.

  4. Kacha-Ochana A, Jones C, Green J, Dunphy C, Dailey T, Robbins R, et al. Characteristics of adults aged ≥18 years evaluated for substance use and treatment planning - United States, 2019. MMWR Morb Mortal Wkly Rep. 2022;71:749–56.

  5. Tiet QQ, Leyva YE, Moos RH, Smith B. Diagnostic accuracy of a two-item Drug Abuse Screening Test (DAST-2). Addict Behav. 2017;74:112–7.

  6. Carter G, Yu Z, Aryana Bryan M, Brown JL, Winhusen T, Cochran G. Validation of the tobacco, alcohol, prescription medication, and other substance use (TAPS) tool with the WHO alcohol, smoking, and substance Involvement screening test (ASSIST). Addictive Behaviors. 2022;126:107178. https://doi.org/10.1016/j.addbeh.2021.107178.

  7. Stewart RE, Cardamone NC, Schachter A, Becker C, McKay JR, Becker-Haimes EM. A systematic review of brief, freely accessible, and valid self-report measures for substance use disorders and treatment. Drug Alcohol Depend. 2023;243: 109729.

  8. Boness CL, Carlos Gonzalez J, Sleep C, Venner KL, Witkiewitz K. Evidence-based assessment of substance use disorder. Assessment. 2023;31(1):168–90.

  9. Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ. 2021;372:n71.

  10. Doyle SR, Donovan DM. A validation study of the alcohol dependence scale. J Stud Alcohol Drugs. 2009;70(5):689–99.

  11. Tevik K, Bergh S, Selbæk G, Johannessen A, Helvik A-S. A systematic review of self-report measures used in epidemiological studies to assess alcohol consumption among older adults. PLoS ONE. 2021;16(12):e0261292.

  12. Singh PN, Khieng S, Yel D, Nguyen D, Job JS. Validity and reliability of survey items and pictograms for use in a National Household Survey of Tobacco Use in Cambodia. Asia Pacific Journal of Public Health. 2013;25(5):45S–53S.

  13. Szklo AS, Iglesias RM, Stoklosa M, Figueiredo VC, Welding K, de Souza Junior PRB, et al. Cross-validation of four different survey methods used to estimate illicit cigarette consumption in Brazil. Tob Control. 2022;31(1):73.

  14. Pierannunzi C, Hu SS, Balluz L. A systematic review of publications assessing reliability and validity of the Behavioral Risk Factor Surveillance System (BRFSS), 2004–2011. BMC Med Res Methodol. 2013;13(1):49.

  15. StataCorp. Stata Statistical Software: Release 17. College Station, TX: StataCorp LLC; 2021.

  16. Wongpakaran N, Wongpakaran T, Wedding D, Gwet KL. A comparison of Cohen's Kappa and Gwet's AC1 when calculating inter-rater reliability coefficients: a study conducted with personality disorder samples. BMC Med Res Methodol. 2013;13:61. https://doi.org/10.1186/1471-2288-13-61.

  17. Centers for Disease Control and Prevention. Methods, validity, and reliability bibliography. Selected articles related to BRFSS and other self-reported data. 2023. Available from: https://www.cdc.gov/brfss/publications/mvr.html.

  18. Alexander D, Leung P. The Marijuana Screening Inventory (MSI-X): concurrent, convergent and discriminant validity with multiple measures. Am J Drug Alcohol Abuse. 2006;32(3):351–78.

  19. Appleby L, Dyson V, Altman E, McGovern MP, Luchins DJ. Utility of the Chemical Use, Abuse, and Dependence Scale in screening patients with severe mental illness. Psychiatr Serv. 1996;47(6):647–9.

  20. Boothroyd RA, Peters RH, Armstrong MI, Rynearson-Moody S, Caudy M. The psychometric properties of the Simple Screening Instrument for Substance Abuse. Eval Health Prof. 2015;38(4):538–62.

  21. Broderick KB, Richmond MK, Fagan J, Long AW. Pilot Validation of a Brief Screen Tool for Substance Use Detection in Emergency Care. J Emerg Med. 2015;49(3):369–74.

  22. Chasnoff IJ, Wells AM, McGourty RF, Bailey LK. Validation of the 4P’s Plus screen for substance use in pregnancy validation of the 4P’s Plus. J Perinatol. 2007;27(12):744–8.

  23. Dennis ML, Davis JP. Screening for more with less: validation of the Global Appraisal of Individual Needs Quick v3 (GAIN-Q3) screeners. J Subst Abuse Treat. 2021;126: 108414.

  24. Dezman ZDW, Gorelick DA, Soderstrom CA. Test characteristics of a drug CAGE questionnaire for the detection of non-alcohol substance use disorders in trauma inpatients. Injury. 2018;49(8):1538–45.

  25. Duncan A, Sacks S, Melnick G, Cleland CM, Pearson FS, Coen C. Performance of the CJDATS Co-Occurring Disorders Screening Instruments (CODSIs) among minority offenders. Behav Sci Law. 2008;26(4):351–68.

  26. Han BH, Sherman SE, Link AR, Wang B, McNeely J. Comparison of the Substance Use Brief Screen (SUBS) to the AUDIT-C and ASSIST for detecting unhealthy alcohol and drug use in a population of hospitalized smokers. J Subst Abuse Treat. 2017;79:67–74.

  27. Harris AH, Ellerbe L, Phelps TE, Finney JW, Bowe T, Gupta S, et al. Examining the specification validity of the HEDIS Quality Measures for Substance Use Disorders. J Subst Abuse Treat. 2015;53:16–21.

  28. Hasin DS, Greenstein E, Aivadyan C, Stohl M, Aharonovich E, Saha T, et al. The Alcohol Use Disorder and Associated Disabilities Interview Schedule-5 (AUDADIS-5): procedural validity of substance use disorders modules through clinical re-appraisal in a general population sample. Drug Alcohol Depend. 2015;148:40–6.

  29. Hasin DS, Keyes KM, Alderson D, Wang S, Aharonovich E, Grant BF. Cannabis withdrawal in the United States: results from NESARC. J Clin Psychiatry. 2008;69(9):1354–63.

  30. Hser Y, Shen H, Grella C, Anglin MD. Lifetime severity index for cocaine use disorder (LSI-Cocaine): a predictor of treatment outcomes. J Nerv Ment Dis. 1999;187(12):742–50.

  31. Jackson CT, Covell NH, Frisman LK, Essock SM. Validity of self-reported drug use among people with co-occurring mental health and substance use disorders. J Dual Diagn. 2005;1(1):49–63.

  32. Joyner LM, Wright JD, Devine JA. Reliability and validity of the Addiction Severity Index among homeless substance misusers. Subst Use Misuse. 1996;31(6):729–51.

  33. Kellogg SH, Ho A, Bell K, Schluger RP, McHugh PF, McClary KA, et al. The Personality Assessment Inventory Drug Problems Scale: a validity analysis. J Pers Assess. 2002;79(1):73–84.

  34. Kupetz K, Klagsbrun M, Wisoff D, La Rosa J, Davis DI. The acceptance and validity of the Substance Use and Abuse Survey (SUAS). J Drug Educ. 1979;9(2):163–80.

  35. Leonhard C, Mulvey K, Gastfriend DR, Shwartz M. The Addiction Severity Index: a field study of internal consistency and validity. J Subst Abuse Treat. 2000;18(2):129–35.

  36. McGovern MP, Morrison DH. The Chemical Use, Abuse, and Dependence Scale (CUAD): rationale, reliability, and validity. J Subst Abuse Treat. 1992;9(1):27–38.

  37. McNeely J, Cleland CM, Strauss SM, Palamar JJ, Rotrosen J, Saitz R. Validation of self-administered single-item screening questions (SISQs) for unhealthy alcohol and drug use in primary care patients. J Gen Intern Med. 2015;30(12):1757–64.

  38. McNeely J, Strauss SM, Saitz R, Cleland CM, Palamar JJ, Rotrosen J, et al. A brief patient self-administered substance use screening tool for primary care: two-site validation study of the Substance Use Brief Screen (SUBS). Am J Med. 2015;128(7):784.e9–19.

  39. Miele GM, Carpenter KM, Cockerham MS, Trautman KD, Blaine J, Hasin DS. Substance Dependence Severity Scale: reliability and validity for ICD-10 substance use disorders. Addict Behav. 2001;26(4):603–12.

  40. O’Hare T, Cutler J, Sherrer MV, McCall TM, Dominique KN, Garlick K. Co-occurring psychosocial distress and substance abuse in community clients: initial validity and reliability of self-report measures. Community Ment Health J. 2001;37(6):481–7.

  41. Peters RH, Greenbaum PE, Steinberg ML, Carter CR, Ortiz MM, Fry BC, et al. Effectiveness of screening instruments in detecting substance use disorders among prisoners. J Subst Abuse Treat. 2000;18(4):349–58.

  42. Ramsay CE, Abedi GR, Marson JD, Compton MT. Overview and initial validation of two detailed, multidimensional, retrospective measures of substance use: the Lifetime Substance Use Recall (LSUR) and Longitudinal Substance Use Recall for 12 Weeks (LSUR-12) instruments. J Psychiatr Res. 2011;45(1):83–91.

  43. Rosenberg SD, Drake RE, Wolford GL, Mueser KT, Oxman TE, Vidaver RM, et al. Dartmouth Assessment of Lifestyle Instrument (DALI): a substance use disorder screen for people with severe mental illness. Am J Psychiatry. 1998;155(2):232–8.

  44. Salyers MP, Bosworth HB, Swanson JW, Lamb-Pagone J, Osher FC, Salyers MP, et al. Reliability and validity of the SF-12 health survey among people with severe mental illness. Med Care. 2000;38(11):1141–50.

  45. Schultz NR, Bassett DT, Messina BG, Correia CJ. Evaluation of the psychometric properties of the cannabis use disorders identification test - revised among college students. Addict Behav. 2019;95:11–5.

  46. Schwartz RP, McNeely J, Wu LT, Sharma G, Wahle A, Cushing C, et al. Identifying substance misuse in primary care: TAPS tool compared to the WHO ASSIST. J Subst Abuse Treat. 2017;76:69–76.

  47. Smith DC, Bennett KM, Dennis ML, Funk RR. Sensitivity and specificity of the gain short-screener for predicting substance use disorders in a large national sample of emerging adults. Addict Behav. 2017;68:14–7.

  48. Smith PC, Cheng DM, Allensworth-Davies D, Winter MR, Saitz R. Use of a single alcohol screening question to identify other drug use. Drug Alcohol Depend. 2014;139:178–80.

  49. Smith PC, Schmidt SM, Allensworth-Davies D, Saitz R. A single-question screening test for drug use in primary care. Arch Intern Med. 2010;170(13):1155–60.

  50. Tarter RE, Kirisci L. The Drug Use Screening Inventory for adults: psychometric structure and discriminative sensitivity. Am J Drug Alcohol Abuse. 1997;23(2):207–19.

  51. Tiet QQ, Leyva Y, Moos RH, Smith B. Diagnostic accuracy of a two-item screen for drug use developed from the alcohol, smoking and substance involvement screening test (ASSIST). Drug Alcohol Depend. 2016;164:22–7.

  52. Tiet QQ, Leyva YE, Browne K, Moos RH. Screen of drug use: diagnostic accuracy for cannabis use disorder. Addict Behav. 2019;95:184–8.

  53. Tiet QQ, Leyva YE, Moos RH, Frayne SM, Osterberg L, Smith B. Screen of drug use: diagnostic accuracy of a New Brief Tool for Primary Care. JAMA Intern Med. 2015;175(8):1371–7.

  54. Tiet QQ, Schutte KK, Leyva YE. Diagnostic accuracy of brief PTSD screening instruments in military veterans. J Subst Abuse Treat. 2013;45(1):134–42.

  55. Westermeyer J, Crosby R, Nugent S. The Minnesota Substance Abuse Problems Scale Psychometric analysis and validation in a clinical population. Am J Addict. 1998;7(1):24–34.

  56. Wickersham JA, Azar MM, Cannon CM, Altice FL, Springer SA. Validation of a brief measure of opioid dependence: the Rapid Opioid Dependence Screen (RODS). J Correct Health Care. 2015;21(1):12–26.

  57. Woicik PA, Stewart SH, Pihl RO, Conrod PJ. The Substance Use Risk Profile Scale: a scale measuring traits linked to reinforcement-specific substance use profiles. Addict Behav. 2009;34(12):1042–55.

  58. Zanis DA, McLellan AT, Cnaan RA, Randall M. Reliability and validity of the Addiction Severity Index with a homeless sample. J Subst Abuse Treat. 1994;11(6):541–8.

  59. Zanis DA, McLellan AT, Corse S. Is the Addiction Severity Index a reliable and valid assessment instrument among clients with severe and persistent mental illness and substance abuse disorders? Community Ment Health J. 1997;33(3):213–27.

  60. Broderick KB, Kaplan B, Martini D, Caruso E. Emergency physician utilization of alcohol/substance screening, brief advice and discharge: a 10-year comparison. J Emerg Med. 2015;49(4):400–7.

  61. Farahmand P, Arshed A, Bradley MV. Systemic racism and substance use disorders. Psychiatr Ann. 2020;50:494–8.

  62. Lieb M, Wittchen H-U, Palm U, Apelt SM, Siegert J, Soyka M. Psychiatric comorbidity in substitution treatment of opioid-dependent patients in primary care: prevalence and impact on clinical features. Heroin Addiction and related clinical problems. 2010;12(4):5–16.

  63. Kessler RC, Chiu WT, Demler O, Merikangas KR, Walters EE. Prevalence, severity, and comorbidity of 12-month DSM-IV disorders in the National Comorbidity Survey Replication. Arch Gen Psychiatry. 2005;62(6):617–27.

  64. Substance Abuse and Mental Health Services Administration. The case for screening and treatment of co-occurring disorders. 2022. Available from: https://www.samhsa.gov/co-occurring-disorders.

  65. Hatzenbuehler ML, Keyes KM, Narrow WE, Grant BF, Hasin DS. Racial/ethnic disparities in service utilization for individuals with co-occurring mental health and substance use disorders in the general population: results from the national epidemiologic survey on alcohol and related conditions. J Clin Psychiatry. 2008;69(7):1112–21.

  66. Curran GM, Sullivan G, Williams K, Han X, Collins K, Keys J, et al. Emergency department use of persons with comorbid psychiatric and substance abuse disorders. Ann Emerg Med. 2003;41(5):659–67.

  67. Verduin ML, Carter RE, Brady KT, Myrick H, Timmerman MA. Health service use among persons with comorbid bipolar and substance use disorders. Psychiatr Serv. 2005;56(4):475–80.

  68. Hunt DE, Kling R, Almozlino Y, Jalbert S, Chapman MT, Rhodes W. Telling the Truth About Drug Use: How Much Does It Matter? Journal of Drug Issues. 2015;45(3):314–29.

  69. Benzies KM, Premji S, Hayden KA, Serrett K. State-of-the-evidence reviews: advantages and challenges of including grey literature. Worldviews Evid Based Nurs. 2006;3(2):55–61.

  70. Kessler RC, Andrews G, Colpe LJ, Hiripi E, Mroczek DK, Normand SL, et al. Short screening scales to monitor population prevalences and trends in non-specific psychological distress. Psychol Med. 2002;32(6):959–95.

Funding

The review is being funded by the West Virginia Department of Health and Human Resources (WVDHHR).

Author information

Contributions

YT and RB are responsible for research conception, design, and coordination of entire manuscript; YT is responsible for collection of data, literature database search, and article retrieval, for writing the initial manuscript, and for revising the manuscript; YT, EC, RM, NW, and EO are responsible for the literature review and assessment and interpretation of results and for reviewing and revising manuscripts; and GS, SDH, and RB are responsible for interpretation of results and for reviewing and revising manuscripts. All authors provided critical feedback to the manuscript and approved the final manuscript draft for submission.

Financial disclosure statement

No authors have financial relationships relevant to this article to disclose.

Corresponding author

Correspondence to Ruchi Bhandari.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1:

Supplementary Table 1. Search term list for each database. Figure 2. Bar Graph of Survey Measures Validated by Included Studies

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Tang, Y., Caswell, E., Mohamed, R. et al. A systematic review of validity of US survey measures for assessing substance use and substance use disorders. Syst Rev 13, 166 (2024). https://doi.org/10.1186/s13643-024-02536-x

  • DOI: https://doi.org/10.1186/s13643-024-02536-x

Keywords