RESEARCH LIBRARY

View the latest publications from members of the NBME research team

Showing 1 - 6 of 6 Research Library Publications

Extracting Linguistic Signal From Item Text and Its Application to Modeling Item Characteristics

Posted: June 5, 2023 | Victoria Yaneva, Peter Baldwin, Le An Ha, Christopher Runyon

Advancing Natural Language Processing in Educational Assessment: Pages 167-182

This chapter discusses the evolution of natural language processing (NLP) approaches to text representation and how different ways of representing text can be utilized for a relatively understudied task in educational assessment – that of predicting item characteristics from item text.

Category:Assessment-Oriented Research, Applications of Technology, Scoring

Assessment of Clinical Skills: A Case Study in Constructing an NLP-based Scoring System for Patient Notes

Posted: June 5, 2023 | Polina Harik, Janet Mee, Christopher Runyon, Brian E. Clauser

Advancing Natural Language Processing in Educational Assessment: Pages 58-73

This chapter describes INCITE, an NLP-based system for scoring free-text responses. It emphasizes the importance of context and the system’s intended use and explains how each component of the system contributed to its accuracy.

Category:Assessment-Oriented Research, Applications of Technology, Scoring

Leveraging Machine Learning Technology to Improve Accuracy and Efficiency of Identification of Enemy Item Pairs

Posted: January 1, 2022 | Ian Micir, Kimberly Swygert, Jean D'Angelo

Journal of Applied Technology: Volume 23 - Special Issue 1 - Pages 30-40

The interpretations of test scores in secure, high-stakes environments are dependent on several assumptions, one of which is that examinee responses to items are independent and no enemy items are included on the same forms. This paper documents the development and implementation of a C#-based application that uses Natural Language Processing (NLP) and Machine Learning (ML) techniques to produce prioritized predictions of item enemy statuses within a large item bank.

Category:Assessment-Oriented Research, Scoring, Applications of Technology

Automated Prediction Of Examinee Proficiency From Short-Answer Questions

Posted: December 10, 2020 | Le An Ha, Victoria Yaneva, Polina Harik, Ravi Pandian, Amy Morales, Brian Clauser

Proceedings of the 28th International Conference on Computational Linguistics

This paper brings together approaches from the fields of NLP and psychometric measurement to address the problem of predicting examinee proficiency from responses to short-answer questions (SAQs).

Category:Assessment-Oriented Research, Scoring

Leveraging Natural Language Processing: Toward Computer-Assisted Scoring of Patient Notes in the USMLE Step 2 Clinical Skills Exam

Posted: March 1, 2019 | J. Salt, P. Harik, M. A. Barone

Academic Medicine: March 2019 - Volume 94 - Issue 3 - p 314-316

The United States Medical Licensing Examination Step 2 Clinical Skills (CS) exam uses physician raters to evaluate patient notes written by examinees. In this Invited Commentary, the authors describe the ways in which the Step 2 CS exam could benefit from adopting a computer-assisted scoring approach that combines physician raters’ judgments with computer-generated scores based on natural language processing (NLP).

Category:Assessment-Oriented Research, Scoring, Applications of Technology, Product-Oriented Research, USMLE

A Comparison of Experimental and Observational Approaches to Assessing the Effects of Time Constraints in a Medical Licensing Examination

Posted: June 1, 2018 | P. Harik, B. E. Clauser, I. Grabovsky, P. Baldwin, M. Margolis, D. Bucak, M. Jodoin, W. Walsh, S. Haist

Journal of Educational Measurement: Volume 55, Issue 2, Pages 308-327

The widespread move to computerized test delivery has led to the development of new approaches to evaluating how examinees use testing time and to new metrics designed to provide evidence about the extent to which time limits impact performance. Much of the existing research is based on these types of observational metrics; relatively few studies use randomized experiments to evaluate the impact time limits on scores. Of those studies that do report on randomized experiments, none directly compare the experimental results to evidence from observational metrics to evaluate the extent to which these metrics are able to sensitively identify conditions in which time constraints actually impact scores. The present study provides such evidence based on data from a medical licensing examination.

Category:Assessment-Oriented Research, Reliability/Validity, Scoring, Product-Oriented Research, USMLE

NBME Self-Assessment Bundles

Stay Up to Date

Stay Up to Date

New Psychometric Workshops

INSIGHTS® Demo

Open Grant Opportunities