
RESEARCH LIBRARY
RESEARCH LIBRARY
View the latest publications from members of the NBME research team
Advancing Natural Language Processing in Educational Assessment
This book examines the use of natural language technology in educational testing, measurement, and assessment. Recent developments in natural language processing (NLP) have enabled large-scale educational applications, though scholars and professionals may lack a shared understanding of the strengths and limitations of NLP in assessment as well as the challenges that testing organizations face in implementation. This first-of-its-kind book provides evidence-based practices for the use of NLP-based approaches to automated text and speech scoring, language proficiency assessment, technology-assisted item generation, gamification, learner feedback, and beyond.
Advancing Natural Language Processing in Educational Assessment: Pages 167-182
This chapter discusses the evolution of natural language processing (NLP) approaches to text representation and how different ways of representing text can be utilized for a relatively understudied task in educational assessment – that of predicting item characteristics from item text.
Advancing Natural Language Processing in Educational Assessment: Pages 58-73
This chapter describes INCITE, an NLP-based system for scoring free-text responses. It emphasizes the importance of context and the system’s intended use and explains how each component of the system contributed to its accuracy.
Integrating Timing Considerations to Improve Testing Practices
This chapter addresses a different aspect of the use of timing data: it provides a framework for understanding how an examinee's use of time interfaces with time limits to impact both test performance and the validity of inferences made based on test scores. It focuses primarily on examinations that are administered as part of the physician licensure process.
Integrating Timing Considerations to Improve Testing Practices
This book synthesizes a wealth of theory and research on time issues in assessment into actionable advice for test development, administration, and scoring.
Integrating Timing Considerations to Improve Testing Practices
This chapter addresses timing considerations in the context of other types of performance assessments and reports on a previously unpublished experiment examining timing with respect to performance on computer-based case simulations that are used in physician licensure.
Integrating Timing Considerations to Improve Testing Practices
This chapter presents a historical overview of the testing literature that exemplifies the theoretical and operational evolution of test speededness.
Journal of Educational Measurement: Volume 57, Issue 2, Pages 216-229
This article presents two generalizability-theory–based analyses of the proportion of the item variance that contributes to error in the cut score. For one approach, variance components are estimated on the probability (or proportion-correct) scale of the Angoff judgments, and for the other, the judgments are transferred to the theta scale of an item response theory model before estimating the variance components.
Springer International Publishing; 2019
This handbook provides an overview of major developments around diagnostic classification models (DCMs) with regard to modeling, estimation, model checking, scoring, and applications. It brings together not only the current state of the art, but also the theoretical background and models developed for diagnostic classification.
On the Cover. Educational Measurement: Issues and Practice, 38: 5-5
This informative graphic reports between‐individual information where a vertical line—with dashed lines on either side indicating an error band—spans three graphics allowing a student to easily see their score relative to four defined performance categories and, more notably, three relevant score distributions.