Educational Measurement: Issues and Practice, 39: 30-36
This article proposes the conscious weight method and subconscious weight method to bring more objectivity to the standard setting process. To do this, these methods quantify the relative harm of the negative consequences of false positive and false negative misclassification.
Academic Medicine: Volume 95 - Issue 1 - p 111-121
This paper investigates the effect of a change in the timing of United States Medical Licensing Examination Step 1 on Step 2 Clinical Knowledge (CK) scores, the effect of lag time on Step 2 CK performance, and the relationship of incoming Medical College Admission Test (MCAT) score to Step 2 CK performance before and after the change.
Springer International Publishing; 2019
This handbook provides an overview of major developments around diagnostic classification models (DCMs) with regard to modeling, estimation, model checking, scoring, and applications. It brings together not only the current state of the art, but also the theoretical background and models developed for diagnostic classification.
Academic Medicine: July 2019 - Volume 94 - Issue 7 - p 926-927
A response to concerns regarding potential bias in applying machine learning (ML) to the scoring of United States Medical Licensing Examination Step 2 Clinical Skills (CS) patient notes (PN).
On the Cover. Educational Measurement: Issues and Practice, 38: 5-5
This informative graphic reports between-individual information: a vertical line, with dashed lines on either side indicating an error band, spans three graphics, allowing a student to easily see their score relative to four defined performance categories and, more notably, three relevant score distributions.
J Gen Intern Med 34, 705–711 (2019)
This study examines medical student accounts of electronic health record (EHR) use during their internal medicine (IM) clerkships and sub-internships over a 5-year period prior to the new clinical documentation guidelines.
Academic Medicine: March 2019 - Volume 94 - Issue 3 - p 371-377
Schools undergoing curricular reform are reconsidering the optimal timing of Step 1. This study provides a psychometric investigation of how moving United States Medical Licensing Examination Step 1 from after completion of the basic science curricula to after core clerkships affects Step 1 scores.
CBE—Life Sciences Education Vol. 18, No. 1
This article briefly reviews the aspects of validity that researchers should consider when using surveys. It then focuses on factor analysis, a statistical method that can be used to collect an important type of validity evidence.
Academic Medicine: March 2019 - Volume 94 - Issue 3 - p 314-316
The United States Medical Licensing Examination Step 2 Clinical Skills (CS) exam uses physician raters to evaluate patient notes written by examinees. In this Invited Commentary, the authors describe the ways in which the Step 2 CS exam could benefit from adopting a computer-assisted scoring approach that combines physician raters’ judgments with computer-generated scores based on natural language processing (NLP).
Educational Measurement: Issues and Practice, 39: 37-44
This article presents the results of an experiment in which content experts were randomly assigned to one of two response probability conditions: .67 and .80. If the standard-setting judgments collected with the bookmark procedure are internally consistent, both conditions should produce highly similar cut scores.