Table 3 Criteria for good psychometric properties adapted from Prinsen et al. [43]

From: Protocol for a systematic review exploring the psychometric properties of self-report health-related quality of life and subjective wellbeing measures used by adolescents with intellectual disabilities

Measurement property

Ratingᵃ

Criteria

Structural validity

+

CTT

• CFA: CFI or TLI or comparable measure > 0.95 OR RMSEA < 0.06 OR SRMR < 0.08ᵇ

IRT/Rasch

No violation of unidimensionalityᶜ: CFI or TLI or comparable measure > 0.95 OR RMSEA < 0.06 OR SRMR < 0.08ᵇ

AND

No violation of local independence: residual correlations among the items after controlling for the dominant factor < 0.20 OR Q3's < 0.37

AND

No violation of monotonicity: adequate-looking graphs OR item scalability > 0.30

AND

Adequate model fit

• IRT: χ² > 0.01

• Rasch: infit and outfit mean squares ≥ 0.5 and ≤ 1.5 OR Z-standardized values > −2 and < 2

?

CTT: not all information for ‘+’ reported

IRT/Rasch: model fit not reported

–

Criteria for ‘+’ not met

Internal consistency

+

At least low evidenceᵈ for sufficient structural validityᵉ AND Cronbach’s alpha(s) ≥ 0.70 for each unidimensional scale or subscaleᶠ

?

Criteria for ‘At least low evidenceᵈ for sufficient structural validityᵉ’ not met

–

At least low evidenceᵈ for sufficient structural validityᵉ AND Cronbach’s alpha(s) < 0.70 for each unidimensional scale or subscaleᶠ

Reliability

+

ICC or weighted kappa ≥ 0.70

?

ICC or weighted kappa not reported

–

ICC or weighted kappa < 0.70

Measurement error

+

SDC or LoA < MICᵉ

?

MIC not defined

–

SDC or LoA > MICᵉ

Hypotheses testing for construct validity

+

The result is in accordance with the hypothesisᵍ

?

No hypothesis defined (by the review team)

–

The result is not in accordance with the hypothesisᵍ

Cross-cultural validity/measurement invariance

+

No important differences found between group factors (such as age, gender, language) in multiple group factor analysis OR no important DIF for group factors (McFadden’s R² < 0.02)

?

No multiple group factor analysis OR DIF analysis performed

–

Important differences between group factors OR DIF was found

Criterion validity

+

Correlation with gold standard ≥ 0.70 OR AUC ≥ 0.70

?

Not all information for ‘+’ reported

–

Correlation with gold standard < 0.70 OR AUC < 0.70

Responsiveness

+

The result is in accordance with the hypothesisᵍ OR AUC ≥ 0.70

?

No hypothesis defined (by the review team)

–

The result is not in accordance with the hypothesisᵍ OR AUC < 0.70

The criteria are based on, e.g., Terwee et al. [56] and Prinsen et al. [44].

AUC, area under the curve; CFA, confirmatory factor analysis; CFI, comparative fit index; CTT, classical test theory; DIF, differential item functioning; ICC, intraclass correlation coefficient; IRT, item response theory; LoA, limits of agreement; MIC, minimal important change; RMSEA, root-mean-square error of approximation; SEM, standard error of measurement; SDC, smallest detectable change; SRMR, standardized root mean residuals; TLI, Tucker-Lewis index.

ᵃ ‘+’, sufficient; ‘–’, insufficient; ‘?’, indeterminate.

ᵇ To rate the quality of the summary score, the factor structures should be equal across studies.

ᶜ Unidimensionality refers to a factor analysis per subscale, while structural validity refers to a factor analysis of a multidimensional patient-reported outcome measure (PROM).

ᵈ As defined by grading the evidence according to the GRADE approach.

ᵉ This evidence may come from different studies.

ᶠ The criterion ‘Cronbach’s alpha < 0.95’ was deleted, as this is relevant in the development phase of a PROM and not when evaluating an existing PROM.

ᵍ The results of all studies should be taken together, and it should then be decided whether 75% of the results are in accordance with the hypotheses.
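
As an illustration of how the thresholds in the table might be applied during data extraction, the following Python sketch rates three of the measurement properties (reliability, measurement error, and hypotheses testing). It is not part of the protocol or of the COSMIN materials. The function names are hypothetical, an ASCII hyphen stands in for the ‘–’ symbol, and two points that the table leaves open are treated as assumptions: a tie (SDC or LoA exactly equal to the MIC) is rated indeterminate, and "75% of the results" in the footnote on hypotheses is read as "at least 75%".

```python
from typing import Optional, Sequence

# Rating symbols from the table; "-" is an ASCII stand-in for the en dash.
SUFFICIENT, INDETERMINATE, INSUFFICIENT = "+", "?", "-"


def rate_reliability(icc_or_kappa: Optional[float]) -> str:
    """Reliability row: ICC or weighted kappa >= 0.70 is sufficient."""
    if icc_or_kappa is None:  # coefficient not reported
        return INDETERMINATE
    return SUFFICIENT if icc_or_kappa >= 0.70 else INSUFFICIENT


def rate_measurement_error(sdc_or_loa: float, mic: Optional[float]) -> str:
    """Measurement error row: SDC or LoA must be smaller than the MIC."""
    if mic is None:  # MIC not defined
        return INDETERMINATE
    if sdc_or_loa < mic:
        return SUFFICIENT
    if sdc_or_loa > mic:
        return INSUFFICIENT
    # Assumption: the table does not cover equality, so flag it for review.
    return INDETERMINATE


def rate_hypotheses(results: Sequence[bool]) -> str:
    """Hypotheses testing row, pooled over all studies.

    Each entry is True when a result agrees with the review team's
    hypothesis. Assumption: "75% of the results" means at least 75%.
    """
    if not results:  # no hypothesis defined
        return INDETERMINATE
    confirmed = sum(results)
    # Integer comparison avoids floating-point issues at exactly 75%.
    if 4 * confirmed >= 3 * len(results):
        return SUFFICIENT
    return INSUFFICIENT


if __name__ == "__main__":
    print(rate_reliability(0.82))
    print(rate_measurement_error(sdc_or_loa=4.1, mic=3.0))
    print(rate_hypotheses([True, True, True, False]))
```

The remaining rows (for example structural validity, where several fit indices are combined with OR and AND) could be handled in the same way, but they involve more reporting judgments than a simple threshold and are left out of this sketch.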