Posted: | Victoria Yaneva (editor), Matthias von Davier (editor)

Advancing Natural Language Processing in Educational Assessment

 

This book examines the use of natural language technology in educational testing, measurement, and assessment. Recent developments in natural language processing (NLP) have enabled large-scale educational applications, though scholars and professionals may lack a shared understanding of the strengths and limitations of NLP in assessment as well as the challenges that testing organizations face in implementation. This first-of-its-kind book provides evidence-based practices for the use of NLP-based approaches to automated text and speech scoring, language proficiency assessment, technology-assisted item generation, gamification, learner feedback, and beyond.

Posted: | Martin G. Tolsgaard, Martin V. Pusic, Stefanie S. Sebok-Syer, Brian Gin, Morten Bo Svendsen, Mark D. Syer, Ryan Brydges, Monica M. Cuddy, Christy K. Boscardin

Medical Teacher: Volume 45 - Issue 6, Pages 565-573

 

This guide aims to describe practical considerations involved in reading and conducting studies in medical education using artificial intelligence (AI), to define basic terminology, and to identify which medical education problems and data are ideally suited for AI.

Posted: | Ian Micir, Kimberly Swygert, Jean D'Angelo

Journal of Applied Testing Technology: Volume 23 - Special Issue 1 - Pages 30-40

 

The interpretation of test scores in secure, high-stakes environments depends on several assumptions, one of which is that examinee responses to items are independent and that no enemy items appear on the same form. This paper documents the development and implementation of a C#-based application that uses natural language processing (NLP) and machine learning (ML) techniques to produce prioritized predictions of item enemy status within a large item bank.

Posted: | Peter Baldwin

Educational Measurement: Issues and Practice

 

This article aims to answer the question: when the assumption that examinees may apply themselves fully yet still respond incorrectly is violated, what are the consequences of using the modified model proposed by Lewis and his colleagues? 

Posted: | Martin G. Tolsgaard, Christy K. Boscardin, Yoon Soo Park, Monica M. Cuddy, Stefanie S. Sebok-Syer

Advances in Health Sciences Education: Volume 25, Pages 1057–1086 (2020)

 

This critical review explores (1) published applications of data science and machine learning (ML) in the health professions education (HPE) literature and (2) the potential role of data science and ML in shifting theoretical and epistemological perspectives in HPE research and practice.

Posted: | B. E. Clauser, M. Kane, J. C. Clauser

Journal of Educational Measurement: Volume 57, Issue 2, Pages 216-229

 

This article presents two generalizability-theory–based analyses of the proportion of the item variance that contributes to error in the cut score. For one approach, variance components are estimated on the probability (or proportion-correct) scale of the Angoff judgments, and for the other, the judgments are transferred to the theta scale of an item response theory model before estimating the variance components.

Posted: | B.C. Leventhal, I. Grabovsky

Educational Measurement: Issues and Practice, 39: 30-36

 

This article proposes the conscious weight method and subconscious weight method to bring more objectivity to the standard setting process. To do this, these methods quantify the relative harm of the negative consequences of false positive and false negative misclassification.

Posted: | P. Baldwin, M.J. Margolis, B.E. Clauser, J. Mee, M. Winward

Educational Measurement: Issues and Practice, 39: 37-44

 

This article presents the results of an experiment in which content experts were randomly assigned to one of two response probability conditions: .67 and .80. If the standard-setting judgments collected with the bookmark procedure are internally consistent, both conditions should produce highly similar cut scores.