Journal of Educational Measurement: Volume 55, Issue 4, Pages 582-594
This article proposes and evaluates a new method that implements computerized adaptive testing (CAT) without any restriction on item review. In particular, it evaluates the new method in terms of the accuracy on ability estimates and the robustness against test‐manipulation strategies. This study shows that the newly proposed method is promising in a win‐win situation: examinees have full freedom to review and change answers, and the impacts of test‐manipulation strategies are undermined.
Journal of Pain and Symptom Management: Volume 56, Issue 3, p371-378
This article reviews the USMLE step examinations to determine whether they test the palliative care (PC) knowledge necessary for graduating medical students and residents applying for licensure.
Medical Teacher: Volume 40 - Issue 8 - p 838-841
Adaptive learning requires frequent and valid assessments for learners to track progress against their goals. This study determined if multiple-choice questions (MCQs) “crowdsourced” from medical learners could meet the standards of many large-scale testing programs.
Muscle Nerve, 58: 646-654
The Amyotrophic Lateral Sclerosis (ALS)‐Specific Quality of Life instrument and its revised version (ALSSQOL and ALSSQOL‐R) have strong psychometric properties, and have demonstrated research and clinical utility. This study aimed to develop a short form (ALSSQOL‐SF) suitable for limited clinic time and patient stamina.
Journal of Medical Regulation (2018) 104 (2): 51–57
There have been a number of important stakeholder opinions critical of the Step 2 Clinical Skills Examination (CS) in the United States Medical Licensing Examination (USMLE) licensure sequence. The Resident Program Director (RPD) Awareness survey was convened to gauge perceptions of current and potential Step 2 CS use, attitudes towards the importance of residents' clinical skills, and awareness of a medical student petition against Step 2 CS. This was a cross-sectional survey which resulted in 205 responses from a representative sampling of RPDs across various specialties, regions and program sizes.
Educational Measurement: Issues and Practice, 37: 5-8
This article spotlights the winners of the 2018 EM:IP Cover Graphic/Data Visualization Competition.
Journal of Graduate Medical Education: June 2018, Vol. 10, No. 3, pp. 337-338
To create examinations with scores that accurately support their intended interpretation and use in a particular setting, examination writers must clearly define what the test is intended to measure (the construct). Writers must also pay careful attention to how content is sampled, how questions are constructed, and how questions perform in their unique testing contexts.1–3 This Rip Out provides guidance for test developers to ensure that scores from MCQ examinations fit their intended purpose.
Journal of Educational Measurement: Volume 55, Issue 2, Pages 308-327
The widespread move to computerized test delivery has led to the development of new approaches to evaluating how examinees use testing time and to new metrics designed to provide evidence about the extent to which time limits impact performance. Much of the existing research is based on these types of observational metrics; relatively few studies use randomized experiments to evaluate the impact time limits on scores. Of those studies that do report on randomized experiments, none directly compare the experimental results to evidence from observational metrics to evaluate the extent to which these metrics are able to sensitively identify conditions in which time constraints actually impact scores. The present study provides such evidence based on data from a medical licensing examination.
Alzheimer Disease & Associated Disorders: October–December 2018 - Volume 32 - Issue 4 - p 276-283
The purpose of this study is to examine the relationship between self-reports of cognitive complaints and quality of life (QOL) in persons with varying degrees of cognitive impairment.
Applied Psychological Measurement: Volume: 42 issue: 4, page(s): 291-306
The research presented in this article combines mathematical derivations and empirical results to investigate effects of the nonparametric anchoring vignette approach proposed by King, Murray, Salomon, and Tandon on the reliability and validity of rating data. The anchoring vignette approach aims to correct rating data for response styles to improve comparability across individuals and groups.