RESEARCH LIBRARY

View the latest publications from members of the NBME research team

Advancing Natural Language Processing in Educational Assessment

Posted: June 5, 2023 | Victoria Yaneva (editor), Matthias von Davier (editor)

Advancing Natural Language Processing in Educational Assessment

This book examines the use of natural language technology in educational testing, measurement, and assessment. Recent developments in natural language processing (NLP) have enabled large-scale educational applications, though scholars and professionals may lack a shared understanding of the strengths and limitations of NLP in assessment as well as the challenges that testing organizations face in implementation. This first-of-its-kind book provides evidence-based practices for the use of NLP-based approaches to automated text and speech scoring, language proficiency assessment, technology-assisted item generation, gamification, learner feedback, and beyond.

Category: Assessment-Oriented Research, Applications of Technology, General Measurement

Extracting Linguistic Signal From Item Text and Its Application to Modeling Item Characteristics

Posted: June 5, 2023 | Victoria Yaneva, Peter Baldwin, Le An Ha, Christopher Runyon

Advancing Natural Language Processing in Educational Assessment: Pages 167-182

This chapter discusses the evolution of natural language processing (NLP) approaches to text representation and how different ways of representing text can be utilized for a relatively understudied task in educational assessment – that of predicting item characteristics from item text.

Category: Assessment-Oriented Research, Applications of Technology, Scoring

Assessment of Clinical Skills: A Case Study in Constructing an NLP-based Scoring System for Patient Notes

Posted: June 5, 2023 | Polina Harik, Janet Mee, Christopher Runyon, Brian E. Clauser

Advancing Natural Language Processing in Educational Assessment: Pages 58-73

This chapter describes INCITE, an NLP-based system for scoring free-text responses. It emphasizes the importance of context and the system’s intended use and explains how each component of the system contributed to its accuracy.

Category: Assessment-Oriented Research, Applications of Technology, Scoring

Integrating Timing Considerations to Improve Testing Practices

Posted: June 25, 2020 | M.J. Margolis, R.A. Feinberg (eds)

Integrating Timing Considerations to Improve Testing Practices

This book synthesizes a wealth of theory and research on time issues in assessment into actionable advice for test development, administration, and scoring.

Category: Assessment-Oriented Research, General Measurement

A History of Test Speededness: Tracing the Evolution of Theory and Practice

Posted: June 25, 2020 | D. Jurich

Integrating Timing Considerations to Improve Testing Practices

This chapter presents a historical overview of the testing literature that exemplifies the theoretical and operational evolution of test speededness.

Category: Assessment-Oriented Research, General Measurement, Reliability/Validity

Examining the Precision of Cut Scores Within a Generalizability Theory Framework: A Closer Look at the Item Effect

Posted: June 3, 2020 | B. E. Clauser, M. Kane, J. C. Clauser

Journal of Educational Measurement: Volume 57, Issue 2, Pages 216-229

This article presents two generalizability-theory–based analyses of the proportion of the item variance that contributes to error in the cut score. For one approach, variance components are estimated on the probability (or proportion-correct) scale of the Angoff judgments, and for the other, the judgments are transferred to the theta scale of an item response theory model before estimating the variance components.

Category: Assessment-Oriented Research, Reliability/Validity

Handbook of Diagnostic Classification Models

Posted: August 31, 2019 | M. von Davier, YS. Lee

Springer International Publishing; 2019

This handbook provides an overview of major developments around diagnostic classification models (DCMs) with regard to modeling, estimation, model checking, scoring, and applications. It brings together not only the current state of the art, but also the theoretical background and models developed for diagnostic classification.

Category: Assessment-Oriented Research, General Measurement, Scoring

Visualizing Hierarchical Score Inferences

Posted: June 6, 2019 | R.A. Feinberg, D.P. Jurich

On the Cover. Educational Measurement: Issues and Practice, 38: 5-5

This informative graphic reports between-individual information. A vertical line, flanked by dashed lines indicating an error band, spans three graphics, allowing a student to easily see their score relative to four defined performance categories and, more notably, three relevant score distributions.

Category: Assessment-Oriented Research, Scoring

Moving the United States Medical Licensing Examination Step 1 After Core Clerkships: An Outcomes Analysis

Posted: March 1, 2019 | D. Jurich, M. Daniel, M. Paniagua, A. Fleming, V. Harnik, A. Pock, A. Swan-Sein, M. A. Barone, S.A. Santen

Academic Medicine: March 2019 - Volume 94 - Issue 3 - p 371-377

Schools undergoing curricular reform are reconsidering the optimal timing of Step 1. This study provides a psychometric investigation of the impact on United States Medical Licensing Examination Step 1 scores of changing the timing of Step 1 from after completion of the basic science curricula to after core clerkships.

Category: Product-Oriented Research, USMLE

Leveraging Natural Language Processing: Toward Computer-Assisted Scoring of Patient Notes in the USMLE Step 2 Clinical Skills Exam

Posted: March 1, 2019 | J. Salt, P. Harik, M. A. Barone

Academic Medicine: March 2019 - Volume 94 - Issue 3 - p 314-316

The United States Medical Licensing Examination Step 2 Clinical Skills (CS) exam uses physician raters to evaluate patient notes written by examinees. In this Invited Commentary, the authors describe the ways in which the Step 2 CS exam could benefit from adopting a computer-assisted scoring approach that combines physician raters’ judgments with computer-generated scores based on natural language processing (NLP).

Category: Assessment-Oriented Research, Scoring, Applications of Technology, Product-Oriented Research, USMLE
