RESEARCH LIBRARY

View the latest publications from members of the NBME research team

Posted: | Victoria Yaneva (editor), Matthias von Davier (editor)

Advancing Natural Language Processing in Educational Assessment

 

This book examines the use of natural language technology in educational testing, measurement, and assessment. Recent developments in natural language processing (NLP) have enabled large-scale educational applications, though scholars and professionals may lack a shared understanding of the strengths and limitations of NLP in assessment as well as the challenges that testing organizations face in implementation. This first-of-its-kind book provides evidence-based practices for the use of NLP-based approaches to automated text and speech scoring, language proficiency assessment, technology-assisted item generation, gamification, learner feedback, and beyond.

Posted: | Victoria Yaneva, Peter Baldwin, Le An Ha, Christopher Runyon

Advancing Natural Language Processing in Educational Assessment: Pages 167-182

 

This chapter discusses the evolution of natural language processing (NLP) approaches to text representation and how different ways of representing text can be utilized for a relatively understudied task in educational assessment – that of predicting item characteristics from item text.

Posted: | Polina Harik, Janet Mee, Christopher Runyon, Brian E. Clauser

Advancing Natural Language Processing in Educational Assessment: Pages 58-73

 

This chapter describes INCITE, an NLP-based system for scoring free-text responses. It emphasizes the importance of context and the system’s intended use and explains how each component of the system contributed to its accuracy.

Posted: | Ann King, Kathleen Mazor, Andrew Houriet, Thea Musselman, Ruth Hoppe, Angelo D’Addario

Patient Education and Counseling: Volume 109, Supplement, April 2023, Page 2

 

Physicians' responses to patient communication were assessed by both clinically matched and unmatched analogue patients (APs). Significant correlations between their ratings indicated consistency in evaluating physician communication skills. Thematic analysis identified twenty-one common themes in both clinically matched and unmatched AP responses, suggesting similar assessments of important behaviors. These findings imply that clinically unmatched APs can effectively substitute for clinically matched ones in evaluating physician communication and offering feedback when the latter are unavailable.

Posted: | Martin G. Tolsgaard, Martin V. Pusic, Stefanie S. Sebok-Syer, Brian Gin, Morten Bo Svendsen, Mark D. Syer, Ryan Brydges, Monica M. Cuddy, Christy K. Boscardin

Medical Teacher: Volume 45 - Issue 6, Pages 565-573

 

This guide aims to describe practical considerations involved in reading and conducting studies in medical education using Artificial Intelligence (AI), define basic terminology, and identify which medical education problems and data are ideally suited for using AI.

Posted: | M.R. Raymond, C. Stevens, S.D. Bucak

Adv in Health Sci Educ 24, 141–150 (2019)

 

Research suggests that the three-option format is optimal for multiple-choice questions (MCQs). This conclusion is supported by numerous studies showing that most distractors (i.e., incorrect answers) are selected by so few examinees that they are essentially nonfunctional. However, nearly all studies have defined a distractor as nonfunctional if it is selected by fewer than 5% of examinees.

Posted: | S. H. Felgoise, R. A. Feinberg, H. B. Stephens, P. Barkhaus, K. Boylan, J. Caress, Z. Simmons

Muscle Nerve, 58: 646-654

 

The Amyotrophic Lateral Sclerosis (ALS)‐Specific Quality of Life instrument and its revised version (ALSSQOL and ALSSQOL‐R) have strong psychometric properties and have demonstrated research and clinical utility. This study aimed to develop a short form (ALSSQOL‐SF) suitable for limited clinic time and patient stamina.

Posted: | I. Kirsch, W. Thorn, M. von Davier

Quality Assurance in Education, Vol. 26 No. 2, pp. 150-152

 

An introduction to a special issue of Quality Assurance in Education featuring papers based on presentations at a two-day international seminar on managing the quality of data collection in large-scale assessments.

Posted: | R.A. Feinberg, D. Jurich, J. Lord, H. Case, J. Hawley

Journal of Veterinary Medical Education 2018 45:3, 381-387

 

This study uses item response data from the November–December 2014 and April 2015 NAVLE administrations (n = 5,292) to conduct timing analyses comparing performance across several examinee subgroups. The results provide evidence that timing conditions were sufficient for most examinees, thereby supporting the current time limits. For the relatively few examinees who may have been affected, results suggest the cause is not bias in the test but rather the effect of poor pacing behavior combined with knowledge deficits.