Showing 1 - 7 of 7 Research Library Publications
Posted: | King Yiu Suen, Victoria Yaneva, Le An Ha, Janet Mee, Yiyun Zhou, Polina Harik

Proceedings of the 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2023), Pages 443-447

 

This paper presents the ACTA system, which performs automated short-answer grading in the domain of high-stakes medical exams. The system builds upon previous work on neural similarity-based grading approaches by applying these to the medical domain and utilizing contrastive learning as a means to optimize the similarity metric. 

Posted: | Victoria Yaneva, Peter Baldwin, Le An Ha, Christopher Runyon

Advancing Natural Language Processing in Educational Assessment: Pages 167-182

 

This chapter discusses the evolution of natural language processing (NLP) approaches to text representation and how different ways of representing text can be utilized for a relatively understudied task in educational assessment – that of predicting item characteristics from item text.

Posted: | Polina Harik, Janet Mee, Christopher Runyon, Brian E. Clauser

Advancing Natural Language Processing in Educational Assessment: Pages 58-73

 

This chapter describes INCITE, an NLP-based system for scoring free-text responses. It emphasizes the importance of context and the system’s intended use and explains how each component of the system contributed to its accuracy.

Posted: | Mark Gierl, Kimberly Swygert, Donna Matovinovic, Allison Kulesher, Hollis Lai

Teaching and Learning in Medicine: Volume 33 - Issue 4 - p 366-381

 

The purpose of this analysis is to describe these sources of evidence that can be used to evaluate the quality of generated items. The important role of medical expertise in the development and evaluation of the generated items is highlighted as a crucial requirement for producing validation evidence.

Posted: | P. Harik, R.A. Feinberg RA, B.E. Clauser

Integrating Timing Considerations to Improve Testing Practices

 

This chapter addresses a different aspect of the use of timing data: it provides a framework for understanding how an examinee's use of time interfaces with time limits to impact both test performance and the validity of inferences made based on test scores. It focuses primarily on examinations that are administered as part of the physician licensure process.

Posted: | M. J. Margolis, B. E. Clauser

Handbook of Automated Scoring

 

In this chapter we describe the historical background that led to development of the simulations and the subsequent refinement of the construct that occurred as the interface was being developed. We then describe the evolution of the automated scoring procedures from linear regression modeling to rule-based procedures.

Posted: | M.R. Raymond, C. Stevens, S.D. Bucak

Adv in Health Sci Educ 24, 141–150 (2019)

 

Research suggests that the three-option format is optimal for multiple choice questions (MCQs). This conclusion is supported by numerous studies showing that most distractors (i.e., incorrect answers) are selected by so few examinees that they are essentially nonfunctional. However, nearly all studies have defined a distractor as nonfunctional if it is selected by fewer than 5% of examinees.