RESEARCH LIBRARY

View the latest publications from members of the NBME research team

Showing 1 - 8 of 8 Research Library Publications

“Cephalgia” or “Migraine”? Solving the Headache of Assessing Clinical Reasoning Using Natural Language Processing

Posted: November 21, 2022 | Christopher Runyon, Polina Harik, Michael Barone

Diagnosis: Volume 10, Issue 1, Pages 54-60

This op-ed discusses the advantages of leveraging natural language processing (NLP) in the assessment of clinical reasoning. It also provides an overview of INCITE, the Intelligent Clinical Text Evaluator, a scalable NLP-based computer-assisted scoring system that was developed to measure clinical reasoning ability as assessed in the written documentation portion of the now-discontinued USMLE Step 2 Clinical Skills examination.

Category: Product-Oriented Research, USMLE, Assessment-Oriented Research, Applications of Technology

Effects of Data Preprocessing on Detecting Autism in Adults Using Web-Based Eye-Tracking Data

Posted: September 14, 2022 | Erfan Khalaji, Sukru Eraslan, Yeliz Yesilada, Victoria Yaneva

Behavior & Information Technology

This study builds upon prior work in this area that focused on developing a machine-learning classifier trained on gaze data from web-related tasks to detect ASD in adults. Using the same data, we show that a new data pre-processing approach, combined with an exploration of the performance of different classification algorithms, leads to an increased classification accuracy compared to prior work.

Category: Assessment-Oriented Research, Applications of Technology

Uncovering the Complexity of Item Position Effects in a Low-Stakes Testing Context

Posted: July 4, 2022 | Thai Q. Ong, Dena A. Pastor

Applied Psychological Measurement: Volume 46, Issue 2, Pages 571-588

This study evaluates the degree to which position effects on two separate low-stakes tests administered to two different samples were moderated by different item (item length, number of response options, mental taxation, and graphic) and examinee (effort, change in effort, and gender) variables. Items exhibited significant negative linear position effects on both tests, with the magnitude of the position effects varying from item to item.

Category: Assessment-Oriented Research, Reliability/Validity

An Examination of the Associations Among USMLE Step 3 Scores and Likelihood of Disciplinary Action in Practice

Posted: June 7, 2022 | Monica M. Cuddy, Chunyan Liu, Wenli Ouyang, Michael A. Barone, Aaron Young, David A. Johnson

Academic Medicine: June 2022

This study examines the associations between Step 3 scores and subsequent receipt of disciplinary action taken by state medical boards for problematic behavior in practice. It analyzes Step 3 total, Step 3 computer-based case simulation (CCS), and Step 3 multiple-choice question (MCQ) scores.

Category: Product-Oriented Research, USMLE, Assessment-Oriented Research, Reliability/Validity, Links to Outcomes

Video-Based Communication Assessment of Physician Error Disclosure Skills by Crowdsourced Laypeople and Patient Advocates Who Experienced Medical Harm: Reliability Assessment With Generalizability Theory

Posted: April 29, 2022 | Andrew A. White, Ann M. King, Angelo E. D’Addario, Karen Berg Brigham, Suzanne Dintzis, Emily E. Fay, Thomas H. Gallagher, Kathleen M. Mazor

JMIR Medical Education: Volume 8 - Issue 2 - e30988

This article aims to compare the reliability of two assessment groups (crowdsourced laypeople and patient advocates) in rating physician error disclosure communication skills using the Video-Based Communication Assessment app.

Category: Assessment-Oriented Research, Applications of Technology

In Reply to D'Eon and Kleinheksel

Posted: April 1, 2022 | Katie L. Arnhart, Monica M. Cuddy, David Johnson, Michael A. Barone, Aaron Young

Academic Medicine: Volume 97 - Issue 4 - Pages 476-477

This response to D'Eon and Kleinheksel emphasizes that, although the findings support a relationship between multiple USMLE attempts and an increased likelihood of receiving disciplinary actions, the findings in isolation are not sufficient for proposing new policy on how many attempts should be allowed.

Category: Product-Oriented Research, USMLE, Assessment-Oriented Research, Reliability/Validity, Links to Outcomes

Digital Module 28: Unusual Things That Usually Occur in a Credentialing Testing Program

Posted: March 17, 2022 | Richard A. Feinberg, Carol Morrison, Mark R. Raymond

Educational Measurement: Issues and Practice: Volume 41 - Issue 1 - Pages 95-96

Unanticipated situations often arise that can create a range of problems, from threats to score validity to unexpected financial costs and even longer-term reputational damage. This module discusses some of these unusual challenges that usually occur in a credentialing program.

Category: Assessment-Oriented Research, General Measurement, Reliability/Validity

Leveraging Machine Learning Technology to Improve Accuracy and Efficiency of Identification of Enemy Item Pairs

Posted: January 1, 2022 | Ian Micir, Kimberly Swygert, Jean D'Angelo

Journal of Applied Testing Technology: Volume 23 - Special Issue 1 - Pages 30-40

The interpretations of test scores in secure, high-stakes environments are dependent on several assumptions, one of which is that examinee responses to items are independent and no enemy items are included on the same forms. This paper documents the development and implementation of a C#-based application that uses Natural Language Processing (NLP) and Machine Learning (ML) techniques to produce prioritized predictions of item enemy statuses within a large item bank.

Category: Assessment-Oriented Research, Scoring, Applications of Technology
