Applied Psychological Measurement: Volume 47, issue 1, page(s) 34-47
This study used simulation to investigate the performance of the t-test method in detecting outliers and compared its performance with other outlier detection methods, including the logit difference method with 0.5 and 0.3 as the cutoff values and the robust z statistic with 2.7 as the cutoff value.
Proceedings of the 28th International Conference on Computational Linguistics
This paper brings together approaches from the fields of NLP and psychometric measurement to address the problem of predicting examinee proficiency from responses to short-answer questions (SAQs).
Applied Psychological Measurement: Volume: 42 issue: 8, page(s): 595-612
Conventional methods for evaluating the utility of subscores rely on reliability and correlation coefficients. However, correlations can overlook a notable source of variability: variation in subtest means/difficulties. Brennan introduced a reliability index for score profiles based on multivariate generalizability theory, designated as G, which is sensitive to variation in subtest difficulty. However, there has been little, if any, research evaluating the properties of this index. A series of simulation experiments, as well as analyses of real data, were conducted to investigate G under various conditions of subtest reliability, subtest correlations, and variability in subtest means.