
RESEARCH LIBRARY
RESEARCH LIBRARY
View the latest publications from members of the NBME research team
Advances in Health Sciences Education
Recent advancements enable replacing MCQs with SAQs in high-stakes assessments, but prior research often used small samples under low stakes and lacked time data. This study assesses difficulty, discrimination, and time in a large-scale high-stakes context
Proceedings of the 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2023), Pages 443-447
This paper presents the ACTA system, which performs automated short-answer grading in the domain of high-stakes medical exams. The system builds upon previous work on neural similarity-based grading approaches by applying these to the medical domain and utilizing contrastive learning as a means to optimize the similarity metric.
Advancing Natural Language Processing in Educational Assessment
This book examines the use of natural language technology in educational testing, measurement, and assessment. Recent developments in natural language processing (NLP) have enabled large-scale educational applications, though scholars and professionals may lack a shared understanding of the strengths and limitations of NLP in assessment as well as the challenges that testing organizations face in implementation. This first-of-its-kind book provides evidence-based practices for the use of NLP-based approaches to automated text and speech scoring, language proficiency assessment, technology-assisted item generation, gamification, learner feedback, and beyond.
ACM Transactions on Accessible Computing: Volume 16 - Issue 1, Pages 1–2
This article presents an introduction to the special issue on Augmentative and Alternative Communication (AAC).
Journal of Graduate Medical Education: Volume 14, Issue 6, Pages 634-638
This article discusses recent recommendations from the UME-GME Review Committee (UGRC) to address challenges in the UME-GME transition—including complexity, negative impact on well-being, costs, and inequities.
Research in Nursing & Health: Volume 46, Issue 1, Pages 127-135
As interest in supporting new nurse practitioners' (NPs) transition to practice increases, those interested in measuring the concept will need an instrument with evidence of reliability and validity. The Novice NP Role Transition (NNPRT) Scale is the first instrument to measure the concept. Using a cross-sectional design and data from 210 novice NPs, the purpose of this study was to confirm the NNPRT Scale's internal factor structure via confirmatory factor analysis (CFA).
Academic Medicine: Volume 98 - Issue 2 - Pages 180-187
This article describes the work of the Coalition for Physician Accountability’s Undergraduate Medical Education to Graduate Medical Education Review Committee (UGRC) to apply a quality improvement approach and systems thinking to explore the underlying causes of dysfunction in the undergraduate medical education (UME) to graduate medical education (GME) transition.
Teaching and Learning in Medicine: Volume 33 - Issue 4 - p 366-381
The purpose of this analysis is to describe these sources of evidence that can be used to evaluate the quality of generated items. The important role of medical expertise in the development and evaluation of the generated items is highlighted as a crucial requirement for producing validation evidence.
Applied Psychological Measurement: Volume 46, issue 6, page(s) 529-547
The current simulation study demonstrated that the sampling variance associated with the item response theory (IRT) item parameter estimates can help detect outliers in the common items under the 2-PL and 3-PL IRT models. The results showed the proposed sampling variance statistic (SV) outperformed the traditional displacement method with cutoff values of 0.3 and 0.5 along a variety of evaluation criteria.
Educational Measurement: Issues and Practices: Volume 41 - Issue 1 - Pages 95-96
Often unanticipated situations arise that can create a range of problems from threats to score validity, to unexpected financial costs, and even longer-term reputational damage. This module discusses some of these unusual challenges that usually occur in a credentialing program.