
RESEARCH LIBRARY
RESEARCH LIBRARY
View the latest publications from members of the NBME research team
Academic Medicine: Volume 99 - Issue 7 - Pages 778-783
This study examined score comparability between in-person and remote proctored administrations of the 2020 Internal Medicine In-Training Examination (IM-ITE) during the COVID-19 pandemic. Analysis of data from 27,115 IM residents revealed statistically significant but educationally nonsignificant differences in predicted scores, with slightly larger variations observed for first-year residents. Overall, performance did not substantially differ between the two testing modalities, supporting the continued use of remote proctoring for the IM-ITE amidst pandemic-related disruptions.
Applied Measurement Education: Volume 36, Issue 4, Pages 326-339
This study examines strategies for detecting parameter drift in small-sample equating, crucial for maintaining score comparability in high-stakes exams. Results suggest that methods like mINFIT, mOUTFIT, and Robust-z effectively mitigate drifting anchor items' effects, while caution is advised with the Logit Difference approach. Recommendations are provided for practitioners to manage item parameter drift in small-sample settings.
Educational Assessment
This study proposes four indices to quantify item influence and distinguishes them from other available item and test measures. We use simulation methods to evaluate and provide guidelines for interpreting each index, followed by a real data application to illustrate their use in practice. We discuss theoretical considerations regarding when influence presents a psychometric concern and other practical concerns such as how the indices function when reducing influence imbalance.
Proceedings of the 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2023), Pages 443-447
This paper presents the ACTA system, which performs automated short-answer grading in the domain of high-stakes medical exams. The system builds upon previous work on neural similarity-based grading approaches by applying these to the medical domain and utilizing contrastive learning as a means to optimize the similarity metric.
Advancing Natural Language Processing in Educational Assessment: Pages 167-182
This chapter discusses the evolution of natural language processing (NLP) approaches to text representation and how different ways of representing text can be utilized for a relatively understudied task in educational assessment – that of predicting item characteristics from item text.
Advancing Natural Language Processing in Educational Assessment: Pages 58-73
This chapter describes INCITE, an NLP-based system for scoring free-text responses. It emphasizes the importance of context and the system’s intended use and explains how each component of the system contributed to its accuracy.
ACM Transactions on Accessible Computing: Volume 16 - Issue 1, Pages 1–2
This article presents an introduction to the special issue on Augmentative and Alternative Communication (AAC).
Essays on Contemporary Psychometrics: Pages 163-180
This paper shows that using non-linear functions for equating and score transformations leads to consequences that are not commensurable with classical test theory (CTT). More specifically, a well-known theorem from calculus shows that the expected value of a non-linearly transformed variable does not equal the transformed expected value of this variable.
Journal of Graduate Medical Education: Volume 14, Issue 6, Pages 634-638
This article discusses recent recommendations from the UME-GME Review Committee (UGRC) to address challenges in the UME-GME transition—including complexity, negative impact on well-being, costs, and inequities.
Research in Nursing & Health: Volume 46, Issue 1, Pages 127-135
As interest in supporting new nurse practitioners' (NPs) transition to practice increases, those interested in measuring the concept will need an instrument with evidence of reliability and validity. The Novice NP Role Transition (NNPRT) Scale is the first instrument to measure the concept. Using a cross-sectional design and data from 210 novice NPs, the purpose of this study was to confirm the NNPRT Scale's internal factor structure via confirmatory factor analysis (CFA).