Academic Medicine: Volume 99 - Issue 3 - p 325-330
This retrospective cohort study investigates the association between United States Medical Licensing Examination (USMLE) scores and outcomes in 196,881 hospitalizations in Pennsylvania over 3 years.
Advancing Natural Language Processing in Educational Assessment
This book examines the use of natural language technology in educational testing, measurement, and assessment. Recent developments in natural language processing (NLP) have enabled large-scale educational applications, though scholars and professionals may lack a shared understanding of the strengths and limitations of NLP in assessment as well as the challenges that testing organizations face in implementation. This first-of-its-kind book provides evidence-based practices for the use of NLP-based approaches to automated text and speech scoring, language proficiency assessment, technology-assisted item generation, gamification, learner feedback, and beyond.
Medical Teacher: Volume 45 - Issue 6, Pages 565-573
This guide aims aim to describe practical considerations involved in reading and conducting studies in medical education using Artificial Intelligence (AI), define basic terminology and identify which medical education problems and data are ideally-suited for using AI.
Research in Nursing & Health: Volume 46, Issue 1, Pages 127-135
As interest in supporting new nurse practitioners' (NPs) transition to practice increases, those interested in measuring the concept will need an instrument with evidence of reliability and validity. The Novice NP Role Transition (NNPRT) Scale is the first instrument to measure the concept. Using a cross-sectional design and data from 210 novice NPs, the purpose of this study was to confirm the NNPRT Scale's internal factor structure via confirmatory factor analysis (CFA).
Academic Medicine: June 2022
This study examines the associations between Step 3 scores and subsequent receipt of disciplinary action taken by state medical boards for problematic behavior in practice. It analyzes Step 3 total, Step 3 computer-based case simulation (CCS), and Step 3multiple-choice question (MCQ) scores.
Journal of Applied Technology: Volume 23 - Special Issue 1 - Pages 30-40
The interpretations of test scores in secure, high-stakes environments are dependent on several assumptions, one of which is that examinee responses to items are independent and no enemy items are included on the same forms. This paper documents the development and implementation of a C#-based application that uses Natural Language Processing (NLP) and Machine Learning (ML) techniques to produce prioritized predictions of item enemy statuses within a large item bank.
Journal of Educational Measurement: Volume 58, Issue 4, Pages 515-537
In this paper, the NBME team reports the results an eye-tracking study designed to evaluate how the presence of the options in multiple-choice questions impacts the way medical students responded to questions designed to evaluate clinical reasoning. Examples of the types of data that can be extracted are presented. We then discuss the implications of these results for evaluating the validity of inferences made based on the type of items used in this study.
Advances in Health Sciences Education: Volume 25, p 1057–1086 (2020)
This critical review explores: (1) published applications of data science and ML in HPE literature and (2) the potential role of data science and ML in shifting theoretical and epistemological perspectives in HPE research and practice.
Evaluation & the Health Professions: Volume: 43 issue: 3, page(s): 149-158
This study examines the innovative and practical application of DCM framework to health professions educational assessments using retrospective large-scale assessment data from the basic and clinical sciences: National Board of Medical Examiners Subject Examinations in pathology (n = 2,006) and medicine (n = 2,351).
Integrating Timing Considerations to Improve Testing Practices
This chapter addresses a different aspect of the use of timing data: it provides a framework for understanding how an examinee's use of time interfaces with time limits to impact both test performance and the validity of inferences made based on test scores. It focuses primarily on examinations that are administered as part of the physician licensure process.