RESEARCH LIBRARY

View the latest publications from members of the NBME research team

Examining ChatGPT Performance on USMLE Sample Items and Implications for Assessment

Posted: February 1, 2024 | Victoria Yaneva, Peter Baldwin, Daniel P. Jurich, Kimberly Swygert, Brian E. Clauser

Academic Medicine: Volume 99 - Issue 2 - p 192-197

This report investigates the potential of artificial intelligence (AI) agents, exemplified by ChatGPT, to perform on the United States Medical Licensing Examination (USMLE), following reports of its successful performance on sample items.

Category:Product-Oriented Research, USMLE, Assessment-Oriented Research, Applications of Technology

Advancing Natural Language Processing in Educational Assessment

Posted: June 5, 2023 | Victoria Yaneva (editor), Matthias von Davier (editor)

Advancing Natural Language Processing in Educational Assessment

This book examines the use of natural language technology in educational testing, measurement, and assessment. Recent developments in natural language processing (NLP) have enabled large-scale educational applications, though scholars and professionals may lack a shared understanding of the strengths and limitations of NLP in assessment as well as the challenges that testing organizations face in implementation. This first-of-its-kind book provides evidence-based practices for the use of NLP-based approaches to automated text and speech scoring, language proficiency assessment, technology-assisted item generation, gamification, learner feedback, and beyond.

Category:Assessment-Oriented Research, Applications of Technology, General Measurement

Extracting Linguistic Signal From Item Text and Its Application to Modeling Item Characteristics

Posted: June 5, 2023 | Victoria Yaneva, Peter Baldwin, Le An Ha, Christopher Runyon

Advancing Natural Language Processing in Educational Assessment: Pages 167-182

This chapter discusses the evolution of natural language processing (NLP) approaches to text representation and how different ways of representing text can be utilized for a relatively understudied task in educational assessment – that of predicting item characteristics from item text.

Category:Assessment-Oriented Research, Applications of Technology, Scoring

Assessment of Clinical Skills: A Case Study in Constructing an NLP-based Scoring System for Patient Notes

Posted: June 5, 2023 | Polina Harik, Janet Mee, Christopher Runyon, Brian E. Clauser

Advancing Natural Language Processing in Educational Assessment: Pages 58-73

This chapter describes INCITE, an NLP-based system for scoring free-text responses. It emphasizes the importance of context and the system’s intended use and explains how each component of the system contributed to its accuracy.

Category:Assessment-Oriented Research, Applications of Technology, Scoring

Similarities Between Clinically Matched and Unmatched Analogue Patient Raters: A Mixed Methods Study

Posted: April 1, 2023 | Ann King, Kathleen Mazor, Andrew Houriet, Thea Musselman, Ruth Hoppe, Angelo D’Addario

Patient Education and Counseling: Volume 109, Supplement, April 2023, Page 2

Physicians' responses to patient communication were assessed by both clinically matched and unmatched analogue patients (APs). Significant correlations between their ratings indicated consistency in evaluating physician communication skills. Thematic analysis identified twenty-one common themes in both clinically matched and unmatched AP responses, suggesting similar assessments of important behaviors. These findings imply that clinically unmatched APs can effectively substitute for clinically matched ones in evaluating physician communication and offering feedback when the latter are unavailable.

Category:Assessment-Oriented Research, Applications of Technology, Links to Outcomes

The Fundamentals of Artificial Intelligence in Medical Education Research: AMEE Guide No. 156

Posted: March 2, 2023 | Martin G. Tolsgaard, Martin V. Pusic, Stefanie S. Sebok-Syer, Brian Gin, Morten Bo Svendsen, Mark D. Syer, Ryan Brydges, Monica M. Cuddy, Christy K. Boscardin

Medical Teacher: Volume 45 - Issue 6, Pages 565-573

This guide aims to describe practical considerations involved in reading and conducting studies in medical education using Artificial Intelligence (AI), define basic terminology, and identify which medical education problems and data are ideally suited for using AI.

Category:Assessment-Oriented Research, Applications of Technology

Reading Differences in Eye-Tracking Data as a Marker of High-Functioning Autism in Adults and Comparison to Results from Web-Related Tasks

Posted: January 27, 2023 | Victoria Yaneva, Le An Ha, Sukru Eraslan, Yeliz Yesilada, Ruslan Mitkov

Neural Engineering Techniques for Autism Spectrum Disorder: Volume 2, Pages 63-79

Automated detection of high-functioning autism in adults is a highly challenging and understudied problem. In search of a way to automatically detect the condition, this chapter explores how eye-tracking data from reading tasks can be used.

Category:Health Professions, Assessment-Oriented Research, Applications of Technology

“Cephalgia” or “Migraine”? Solving the Headache of Assessing Clinical Reasoning Using Natural Language Processing

Posted: November 21, 2022 | Christopher Runyon, Polina Harik, Michael Barone

Diagnosis: Volume 10, Issue 1, Pages 54-60

This op-ed discusses the advantages of leveraging natural language processing (NLP) in the assessment of clinical reasoning. It also provides an overview of INCITE, the Intelligent Clinical Text Evaluator, a scalable NLP-based computer-assisted scoring system that was developed to measure clinical reasoning ability as assessed in the written documentation portion of the now-discontinued USMLE Step 2 Clinical Skills examination.

Category:Product-Oriented Research, USMLE, Assessment-Oriented Research, Applications of Technology

Effects of Data Preprocessing on Detecting Autism in Adults Using Web-Based Eye-Tracking Data

Posted: September 14, 2022 | Erfan Khalaji, Sukru Eraslan, Yeliz Yesilada, Victoria Yaneva

Behavior & Information Technology

This study builds on prior work that focused on developing a machine-learning classifier trained on gaze data from web-related tasks to detect autism spectrum disorder (ASD) in adults. Using the same data, we show that a new data pre-processing approach, combined with an exploration of the performance of different classification algorithms, leads to higher classification accuracy than prior work.

Category:Assessment-Oriented Research, Applications of Technology

Video-Based Communication Assessment of Physician Error Disclosure Skills by Crowdsourced Laypeople and Patient Advocates Who Experienced Medical Harm: Reliability Assessment With Generalizability Theory

Posted: April 29, 2022 | Andrew A. White, Ann M. King, Angelo E. D’Addario, Karen Berg Brigham, Suzanne Dintzis, Emily E. Fay, Thomas H. Gallagher, Kathleen M. Mazor

JMIR Medical Education: Volume 8 - Issue 2 - e30988

This article aims to compare the reliability of two assessment groups (crowdsourced laypeople and patient advocates) in rating physician error disclosure communication skills using the Video-Based Communication Assessment app.

Category:Assessment-Oriented Research, Applications of Technology
