Advancing Natural Language Processing in Educational Assessment
This book examines the use of natural language technology in educational testing, measurement, and assessment. Recent developments in natural language processing (NLP) have enabled large-scale educational applications, though scholars and professionals may lack a shared understanding of the strengths and limitations of NLP in assessment as well as the challenges that testing organizations face in implementation. This first-of-its-kind book provides evidence-based practices for the use of NLP-based approaches to automated text and speech scoring, language proficiency assessment, technology-assisted item generation, gamification, learner feedback, and beyond.
Advancing Natural Language Processing in Educational Assessment: Pages 167-182
This chapter discusses the evolution of natural language processing (NLP) approaches to text representation and how different ways of representing text can be utilized for a relatively understudied task in educational assessment – that of predicting item characteristics from item text.
Advancing Natural Language Processing in Educational Assessment: Pages 58-73
This chapter describes INCITE, an NLP-based system for scoring free-text responses. It emphasizes the importance of context and the system’s intended use and explains how each component of the system contributed to its accuracy.
Medical Teacher: Volume 45 - Issue 6, Pages 565-573
This guide aims to describe practical considerations involved in reading and conducting studies in medical education using Artificial Intelligence (AI), define basic terminology, and identify which medical education problems and data are ideally suited for using AI.
Neural Engineering Techniques for Autism Spectrum Disorder: Volume 2, Pages 63-79
Automated detection of high-functioning autism in adults is a highly challenging and understudied problem. This chapter explores how eye-tracking data from reading tasks can be used to detect the condition automatically.
Diagnosis: Volume 10, Issue 1, Pages 54-60
This op-ed discusses the advantages of leveraging natural language processing (NLP) in the assessment of clinical reasoning. It also provides an overview of INCITE, the Intelligent Clinical Text Evaluator, a scalable NLP-based computer-assisted scoring system that was developed to measure clinical reasoning ability as assessed in the written documentation portion of the now-discontinued USMLE Step 2 Clinical Skills examination.
Behavior & Information Technology
This study builds upon prior work that focused on developing a machine-learning classifier trained on gaze data from web-related tasks to detect ASD in adults. Using the same data, we show that a new data pre-processing approach, combined with an exploration of the performance of different classification algorithms, yields higher classification accuracy than prior work.
JMIR Medical Education: Volume 8 - Issue 2 - e30988
This article aims to compare the reliability of two assessment groups (crowdsourced laypeople and patient advocates) in rating physician error disclosure communication skills using the Video-Based Communication Assessment app.
Journal of Applied Technology: Volume 23 - Special Issue 1 - Pages 30-40
The interpretations of test scores in secure, high-stakes environments depend on several assumptions, one of which is that examinee responses to items are independent and that no enemy items are included on the same form. This paper documents the development and implementation of a C#-based application that uses Natural Language Processing (NLP) and Machine Learning (ML) techniques to produce prioritized predictions of item enemy status within a large item bank.
Journal of Educational Measurement: Volume 58, Issue 4, Pages 515-537
In this paper, the NBME team reports the results of an eye-tracking study designed to evaluate how the presence of the options in multiple-choice questions affects the way medical students respond to questions designed to evaluate clinical reasoning. Examples of the types of data that can be extracted are presented. We then discuss the implications of these results for evaluating the validity of inferences made based on the type of items used in this study.