Applied Psychological Measurement: Volume 46, issue 2, page(s) 571-588
This study evaluates the degree to which position effects on two separate low-stakes tests administered to two different samples were moderated by different item (item length, number of response options, mental taxation, and graphic) and examinee (effort, change in effort, and gender) variables. Items exhibited significant negative linear position effects on both tests, with the magnitude of the position effects varying from item to item.
Applied Psychological Measurement: Volume 46, issue 6, page(s) 529-547
The current simulation study demonstrated that the sampling variance associated with the item response theory (IRT) item parameter estimates can help detect outliers in the common items under the 2-PL and 3-PL IRT models. The results showed that the proposed sampling variance statistic (SV) outperformed the traditional displacement method with cutoff values of 0.3 and 0.5 across a variety of evaluation criteria.
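For readers unfamiliar with the displacement baseline mentioned above, a minimal sketch follows. It assumes displacement is the difference between a common item's difficulty estimates on two forms after a simple mean-shift link; the function name `flag_displaced_items`, the linking choice, and the difficulty values are hypothetical and are not taken from the study. The proposed SV statistic is not shown.

```python
import numpy as np

def flag_displaced_items(b_old, b_new, cutoff=0.3):
    """Flag common items whose linked difficulty differs by more than `cutoff`."""
    b_old = np.asarray(b_old, dtype=float)
    b_new = np.asarray(b_new, dtype=float)
    # Assumption: put new-form estimates on the old scale with a mean shift.
    shift = b_old.mean() - b_new.mean()
    displacement = (b_new + shift) - b_old
    return displacement, np.abs(displacement) > cutoff

# Hypothetical difficulty estimates for five common items; the last one drifts.
b_old = [-1.0, -0.5, 0.0, 0.5, 1.0]
b_new = [-1.0, -0.5, 0.0, 0.5, 2.0]
displacement, flagged = flag_displaced_items(b_old, b_new, cutoff=0.5)
```

Note that in this sketch the drifting item also pulls the mean shift, so every other item picks up a small displacement of its own; this contamination is one reason a fixed cutoff can behave poorly.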
Journal of Educational Measurement: Volume 59, Issue 2, Pages 140-160
A conceptual framework for thinking about the problem of score comparability is given, followed by a description of three classes of connectives. Examples from the history of innovations in testing are given for each class.
Educational Measurement: Issues and Practice
This article aims to answer the question: when the assumption that examinees may apply themselves fully yet still respond incorrectly is violated, what are the consequences of using the modified model proposed by Lewis and his colleagues?
Academic Medicine: Volume 95 - Issue 11S - Pages S89-S94
Semiannually, U.S. pediatrics residency programs report resident milestone levels to the Accreditation Council for Graduate Medical Education (ACGME). The Pediatrics Milestones Assessment Collaborative (PMAC) developed workplace-based assessments of 2 inferences. The authors compared learner and program variance in PMAC scores with ACGME milestones.
Journal of Educational Measurement: Volume 57, Issue 2, Pages 216-229
This article presents two generalizability-theory–based analyses of the proportion of the item variance that contributes to error in the cut score. For one approach, variance components are estimated on the probability (or proportion-correct) scale of the Angoff judgments, and for the other, the judgments are transferred to the theta scale of an item response theory model before estimating the variance components.
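As a rough illustration of the probability-scale approach, the sketch below estimates variance components from a fully crossed items-by-judges table of Angoff ratings using the standard expected-mean-squares formulas. Treating both items and judges as random facets, the function names, and the ratings are assumptions made for illustration; this is not the article's own analysis, and the theta-scale transformation is not shown.

```python
import numpy as np

def variance_components(ratings):
    """Estimate item, judge, and residual variance for an items x judges table."""
    x = np.asarray(ratings, dtype=float)
    n_i, n_j = x.shape
    grand = x.mean()
    item_means = x.mean(axis=1)
    judge_means = x.mean(axis=0)
    # Mean squares for the two main effects and the residual (interaction + error).
    ms_item = n_j * np.sum((item_means - grand) ** 2) / (n_i - 1)
    ms_judge = n_i * np.sum((judge_means - grand) ** 2) / (n_j - 1)
    resid = x - item_means[:, None] - judge_means[None, :] + grand
    ms_res = np.sum(resid ** 2) / ((n_i - 1) * (n_j - 1))
    # Solve the expected mean squares; truncate negative estimates at zero.
    return {
        "item": max((ms_item - ms_res) / n_j, 0.0),
        "judge": max((ms_judge - ms_res) / n_i, 0.0),
        "residual": ms_res,
    }

def cut_score_error_variance(vc, n_items, n_judges):
    """Error variance of the mean rating when items and judges are both random."""
    return (vc["item"] / n_items
            + vc["judge"] / n_judges
            + vc["residual"] / (n_items * n_judges))

# Hypothetical Angoff ratings: 3 items (rows) by 2 judges (columns).
ratings = [[0.35, 0.45], [0.55, 0.65], [0.75, 0.85]]
vc = variance_components(ratings)
error_var = cut_score_error_variance(vc, n_items=3, n_judges=2)
```

Whether the item term belongs in the error variance at all depends on how much of the item variance is treated as contributing to error, which is the question the two analyses address.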
Medical Teacher: Volume 40 - Issue 11 - Pages 1143-1150
This study explores a novel milestone-based workplace assessment system that was implemented in 15 pediatrics residency programs. The system provided web-based multisource feedback and structured clinical observation instruments that could be completed on any computer or mobile device, as well as monthly feedback reports that included competency-level scores and recommendations for improvement.