M. Paniagua, J. Salt, K. Swygert, M. Barone

Journal of Medical Regulation (2018) 104 (2): 51–57

 

A number of important stakeholders have voiced opinions critical of the Step 2 Clinical Skills Examination (CS) in the United States Medical Licensing Examination (USMLE) licensure sequence. The Resident Program Director (RPD) Awareness survey was conducted to gauge perceptions of current and potential Step 2 CS use, attitudes towards the importance of residents' clinical skills, and awareness of a medical student petition against Step 2 CS. This cross-sectional survey yielded 205 responses from a representative sample of RPDs across various specialties, regions, and program sizes.

R. A. Feinberg, D. P. Jurich

Educational Measurement: Issues and Practice, 37: 5-8

 

This article spotlights the winners of the 2018 EM:IP Cover Graphic/Data Visualization Competition.

P. Harik, B. E. Clauser, I. Grabovsky, P. Baldwin, M. Margolis, D. Bucak, M. Jodoin, W. Walsh, S. Haist

Journal of Educational Measurement: Volume 55, Issue 2, Pages 308-327

 

The widespread move to computerized test delivery has led to the development of new approaches to evaluating how examinees use testing time and to new metrics designed to provide evidence about the extent to which time limits impact performance. Much of the existing research is based on these types of observational metrics; relatively few studies use randomized experiments to evaluate the impact of time limits on scores. Of those studies that do report on randomized experiments, none directly compare the experimental results to evidence from observational metrics to evaluate how sensitively these metrics identify conditions in which time constraints actually impact scores. The present study provides such evidence based on data from a medical licensing examination.

M. von Davier, J. H. Shin, L. Khorramdel, L. Stankov

Applied Psychological Measurement: Volume: 42 issue: 4, page(s): 291-306

 

The research presented in this article combines mathematical derivations and empirical results to investigate effects of the nonparametric anchoring vignette approach proposed by King, Murray, Salomon, and Tandon on the reliability and validity of rating data. The anchoring vignette approach aims to correct rating data for response styles to improve comparability across individuals and groups.

K. Short, S. D. Bucak, F. Rosenthal, M. R. Raymond

Academic Medicine: May 2018 - Volume 93 - Issue 5 - p 781-785

 

In 2007, the United States Medical Licensing Examination embedded multimedia simulations of heart sounds into multiple-choice questions. This study investigated changes in item difficulty as determined by examinee performance over time. The data reflect outcomes obtained following initial use of multimedia items from 2007 through 2012, after which an interface change occurred.

Z. Jiang, M. R. Raymond

Applied Psychological Measurement: Volume: 42 issue: 8, page(s): 595-612

 

Conventional methods for evaluating the utility of subscores rely on reliability and correlation coefficients. However, correlations can overlook a notable source of variability: variation in subtest means/difficulties. Brennan introduced a reliability index for score profiles based on multivariate generalizability theory, designated as G, which is sensitive to variation in subtest difficulty. However, there has been little, if any, research evaluating the properties of this index. A series of simulation experiments, as well as analyses of real data, were conducted to investigate G under various conditions of subtest reliability, subtest correlations, and variability in subtest means.

J. D. Rubright

Educational Measurement: Issues and Practice, 37: 40-45

 

This simulation study demonstrates that the strength of item dependencies and the location of an examination system's cut-points both influence the accuracy (i.e., the sensitivity and specificity) of examinee classifications. Practical implications of these results are discussed in terms of false positive and false negative classifications of test takers.

Monica M. Cuddy, Aaron Young, Andrew Gelman, David B. Swanson, David A. Johnson, Gerard F. Dillon, Brian E. Clauser

The authors examined the extent to which USMLE scores relate to the odds of receiving a disciplinary action from a U.S. state medical board.

Ruth B. Hoppe, Ann M. King, Kathleen M. Mazor, Gail E. Furman, Penelope Wick-Garcia, Heather Corcoran-Ponisciak, Peter J. Katsufrakis

Academic Medicine: Volume 88 - Issue 11 - p 1670-1675

 

From 2007 through 2012, the NBME team reviewed literature in physician–patient communication, examined performance characteristics of the Step 2 CS exam, observed case development and quality assurance processes, interviewed standardized patients (SPs) and their trainers, and reviewed video recordings of examinee–SP interactions. The authors describe perspectives gained by their team from the review process and outline the resulting enhancements to the Step 2 CS exam, some of which were rolled out in June 2012.