Table 6 Details of studies which have used and validated six other tools identified as lower quality by this review for use in evaluating EBM teaching in medical education

From: A systematic review and taxonomy of tools for evaluating evidence-based medicine teaching in medical education

| Source instrument name and date | Instrument development, number of participants, level of expertise | EBM learning domains | Instrument description | EBM steps | Psychometric properties with results of validity and reliability assessment |
| --- | --- | --- | --- | --- | --- |
| Educational Prescription-David Feldstein [19] | 20 residents | Knowledge and skills | Educational prescription (EP): web-based tool that guides learners through the four As of EBM. Learners use the EP to define a clinical question, document a search strategy, appraise the evidence, report the results and apply evidence to the particular patient | Asking, acquiring, appraising, applying | Predictive validity; interrater reliability. Interrater reliability on the 20 EPs showed fair agreement for question formation (k = 0.22), moderate agreement for overall competence (k = 0.57) and evaluation of evidence (k = 0.44), and substantial agreement for searching (k = 0.70) and application of evidence (k = 0.72) |
| BACES-Barlow [23] | Yes; postgraduate medical trainees/residents (150 residents) | Knowledge, skills | Biostatistics and Clinical Epidemiology Skills (BACES) assessment for medical residents: 30 multiple-choice questions written to focus on interpreting clinical epidemiological and statistical methods | Appraisal: interpreting clinical epidemiology and statistical methods | Content validity was assessed through a four-person expert review. Item response theory (IRT) makes it flexible to use subsets of questions for other cohorts of residents (novice, intermediate and advanced). 26 items fit a two-parameter logistic IRT model and correlated well with their comparable classical test theory (CTT) values |
| David Feldstein-EBM test [18] | 48 internal medicine residents | Knowledge and skills | EBM test: 25 MCQs covering seven EBM focus areas: (a) asking clinical questions, (b) searching, (c) EBM resources, (d) critical appraisal of therapeutic and diagnostic evidence, (e) calculating ARR, NNT and RRR, (f) interpreting diagnostic test results and (g) interpreting confidence intervals | Asking, acquiring and appraising: asking clinical questions, searching, EBM resources, critical appraisal, calculating ARR, NNT and RRR, interpreting diagnostic test results and interpreting confidence intervals | Construct validity; responsive validity. EBM experts scored significantly higher on the EBM test than PGY-1 residents (p < 0.001), who in turn scored higher than 1st year students (p < 0.004). Responsiveness of the test was also demonstrated with 16 practising clinicians: the mean difference in fellows’ pre-test to post-test EBM scores was 5.8 points (95% CI 4.2, 7.4) |
| Frohna-OSCE [22] | Medical students (n = 26) tried the paper-based test during the pilot phase. A web-based station was then developed for full implementation (n = 140) | Skills | A web-based 20-min OSCE with a specific case scenario in which students asked a structured clinical question, generated effective MEDLINE search terms and selected the most appropriate of 3 abstracts | Ask, acquire, appraise and apply | Face validity; interrater reliability. Literature review and expert consensus. Between three scorers, there was good interrater reliability with 84, 94 and 96% agreement (k = 0.64, 0.82 and 0.91) |
| Tudiver-OSCE [21] | Residents, first year and second year | Skills | OSCE stations | Ask, acquire, appraise and apply | Content validity; construct validity, p = 0.43; criterion validity, p < 0.001; interrater reliability, ICC 0.96; internal reliability, Cronbach’s alpha 0.58 |
| Mendiola-mcq [20] | Fifth year medical students | Knowledge | MCQ (100 questions) | Appraise | Reliability of the MCQ: Cronbach’s alpha 0.72 in the M5 group and 0.83 in the M6 group. Effect size (Cohen’s d) for the main outcome comparison of knowledge scores, M5 EBM vs M5 non-EBM, was 3.54 |
  1. mcq multiple choice question, OSCE objective structured clinical examination, ICC intraclass correlation, NNT number needed to treat, ARR absolute risk reduction, RRR relative risk reduction
  2. Assessment aims: formative
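
The kappa (k) values reported in the table for interrater reliability can be read against the widely used Landis and Koch bands, which appear consistent with the "fair", "moderate" and "substantial" labels given for the Educational Prescription. The following is a minimal sketch, not taken from any of the cited studies, of how Cohen's kappa for two raters could be computed and mapped to those bands. The function names and the example ratings are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty rating lists")
    n = len(rater_a)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's category frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1 - expected)


def landis_koch_band(kappa):
    """Map a kappa value to the Landis and Koch descriptive band."""
    if kappa < 0:
        return "poor"
    bands = [(0.20, "slight"), (0.40, "fair"),
             (0.60, "moderate"), (0.80, "substantial")]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical ratings of four items by two scorers (illustrative only).
scorer_1 = ["pass", "pass", "fail", "fail"]
scorer_2 = ["pass", "fail", "fail", "fail"]
example_kappa = cohen_kappa(scorer_1, scorer_2)
print(example_kappa, landis_koch_band(example_kappa))
```

Applied to the values in the table, `landis_koch_band(0.22)` gives "fair" and `landis_koch_band(0.70)` gives "substantial". The bands are a descriptive convention rather than a statistical test, so studies may word the same kappa differently.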