Posted: | Victoria Yaneva (editor), Matthias von Davier (editor)

Advancing Natural Language Processing in Educational Assessment

This book examines the use of natural language technology in educational testing, measurement, and assessment. Recent developments in natural language processing (NLP) have enabled large-scale educational applications, though scholars and professionals may lack a shared understanding of the strengths and limitations of NLP in assessment as well as the challenges that testing organizations face in implementation. This first-of-its-kind book provides evidence-based practices for the use of NLP-based approaches to automated text and speech scoring, language proficiency assessment, technology-assisted item generation, gamification, learner feedback, and beyond.

Posted: | P. Harik, R.A. Feinberg, B.E. Clauser

Integrating Timing Considerations to Improve Testing Practices

This chapter addresses a different aspect of the use of timing data: it provides a framework for understanding how an examinee's use of time interfaces with time limits to impact both test performance and the validity of inferences made based on test scores. It focuses primarily on examinations that are administered as part of the physician licensure process.

Posted: | M.J. Margolis, R.A. Feinberg (eds)

Integrating Timing Considerations to Improve Testing Practices

This book synthesizes a wealth of theory and research on time issues in assessment into actionable advice for test development, administration, and scoring.

Posted: | M. J. Margolis, M. von Davier, B. E. Clauser

Integrating Timing Considerations to Improve Testing Practices

This chapter addresses timing considerations in the context of other types of performance assessments and reports on a previously unpublished experiment examining timing with respect to performance on computer-based case simulations that are used in physician licensure.

Posted: | D. Jurich

Integrating Timing Considerations to Improve Testing Practices

This chapter presents a historical overview of the testing literature that exemplifies the theoretical and operational evolution of test speededness.

Posted: | B. E. Clauser, M. Kane, J. C. Clauser

Journal of Educational Measurement: Volume 57, Issue 2, Pages 216-229

This article presents two generalizability-theory–based analyses of the proportion of the item variance that contributes to error in the cut score. For one approach, variance components are estimated on the probability (or proportion-correct) scale of the Angoff judgments, and for the other, the judgments are transferred to the theta scale of an item response theory model before estimating the variance components.

Posted: | Z. Jiang, M.R. Raymond

Applied Psychological Measurement: Volume 42, Issue 8, Pages 595-612

Conventional methods for evaluating the utility of subscores rely on reliability and correlation coefficients. However, correlations can overlook a notable source of variability: variation in subtest means/difficulties. Brennan introduced a reliability index for score profiles based on multivariate generalizability theory, designated as G, which is sensitive to variation in subtest difficulty. However, there has been little, if any, research evaluating the properties of this index. A series of simulation experiments, as well as analyses of real data, were conducted to investigate G under various conditions of subtest reliability, subtest correlations, and variability in subtest means.

Posted: | K. Walsh, P. Harik, K. Mazor, D. Perfetto, M. Anatchkova, C. Biggins, J. Wagner

Medical Care: April 2017, Volume 55, Issue 4, Pages 436-441

The objective of this study is to identify modifiable factors that improve the reliability of ratings of severity of health care–associated harm in clinical practice improvement and research.