
RESEARCH LIBRARY
View the latest publications from members of the NBME research team
Academic Medicine: Volume 99 - Issue 7 - Pages 778-783
This study examined score comparability between in-person and remote proctored administrations of the 2020 Internal Medicine In-Training Examination (IM-ITE) during the COVID-19 pandemic. Analysis of data from 27,115 IM residents revealed statistically significant but educationally nonsignificant differences in predicted scores, with slightly larger variations observed for first-year residents. Overall, performance did not substantially differ between the two testing modalities, supporting the continued use of remote proctoring for the IM-ITE amidst pandemic-related disruptions.
Evaluation & the Health Professions: Volume 43 - Issue 3 - Pages 149-158
This study examines the innovative and practical application of the diagnostic classification model (DCM) framework to health professions educational assessments, using retrospective large-scale assessment data from the basic and clinical sciences: National Board of Medical Examiners Subject Examinations in pathology (n = 2,006) and medicine (n = 2,351).
Applied Psychological Measurement: Volume 42 - Issue 4 - Pages 291-306
The research presented in this article combines mathematical derivations and empirical results to investigate effects of the nonparametric anchoring vignette approach proposed by King, Murray, Salomon, and Tandon on the reliability and validity of rating data. The anchoring vignette approach aims to correct rating data for response styles to improve comparability across individuals and groups.
Applied Psychological Measurement: Volume 42 - Issue 8 - Pages 595-612
Conventional methods for evaluating the utility of subscores rely on reliability and correlation coefficients. However, correlations can overlook a notable source of variability: variation in subtest means/difficulties. Brennan introduced a reliability index for score profiles based on multivariate generalizability theory, designated as G, which is sensitive to variation in subtest difficulty. However, there has been little, if any, research evaluating the properties of this index. A series of simulation experiments, as well as analyses of real data, were conducted to investigate G under various conditions of subtest reliability, subtest correlations, and variability in subtest means.