RESEARCH LIBRARY

View the latest publications from members of the NBME research team

Posted: | R. A. Feinberg, D. P. Jurich

Educational Measurement: Issues and Practice, 37: 5-8

 

This article spotlights the winners of the 2018 EM:IP Cover Graphic/Data Visualization Competition.

Posted: | D. Franzen, M. Cuddy, J. S. Ilgen

Journal of Graduate Medical Education: June 2018, Vol. 10, No. 3, pp. 337-338

 

To create examinations with scores that accurately support their intended interpretation and use in a particular setting, examination writers must clearly define what the test is intended to measure (the construct). Writers must also pay careful attention to how content is sampled, how questions are constructed, and how questions perform in their unique testing contexts [1–3]. This Rip Out provides guidance for test developers to ensure that scores from MCQ examinations fit their intended purpose.

Posted: | P. Harik, B. E. Clauser, I. Grabovsky, P. Baldwin, M. Margolis, D. Bucak, M. Jodoin, W. Walsh, S. Haist

Journal of Educational Measurement: Volume 55, Issue 2, Pages 308-327

 

The widespread move to computerized test delivery has led to the development of new approaches to evaluating how examinees use testing time and to new metrics designed to provide evidence about the extent to which time limits impact performance. Much of the existing research is based on these types of observational metrics; relatively few studies use randomized experiments to evaluate the impact of time limits on scores. Of those studies that do report on randomized experiments, none directly compare the experimental results to evidence from observational metrics to evaluate the extent to which these metrics are able to sensitively identify conditions in which time constraints actually impact scores. The present study provides such evidence based on data from a medical licensing examination.

Posted: | S. D. Stites, K. Harkins, J. D. Rubright, J. Karlawish

Alzheimer Disease & Associated Disorders: October–December 2018 - Volume 32 - Issue 4 - p 276-283

 

The purpose of this study is to examine the relationship between self-reports of cognitive complaints and quality of life (QOL) in persons with varying degrees of cognitive impairment.

Posted: | M. von Davier, J. H. Shin, L. Khorramdel, L. Stankov

Applied Psychological Measurement: Volume 42, Issue 4, Pages 291-306

 

The research presented in this article combines mathematical derivations and empirical results to investigate effects of the nonparametric anchoring vignette approach proposed by King, Murray, Salomon, and Tandon on the reliability and validity of rating data. The anchoring vignette approach aims to correct rating data for response styles to improve comparability across individuals and groups.

Posted: | D. Jurich, L. M. Duhigg, T. J. Plumb, S. A. Haist, J. L. Hawley, R. S. Lipner, L. Smith, S. M. Norby

CJASN May 2018, 13 (5) 710-717

 

Medical specialty and subspecialty fellowship programs administer subject-specific in-training examinations to provide feedback about level of medical knowledge to fellows preparing for subsequent board certification. This study evaluated the association between the American Society of Nephrology In-Training Examination and the American Board of Internal Medicine Nephrology Certification Examination in terms of scores and passing status.

Posted: | K. Short, S. D. Bucak, F. Rosenthal, M. R. Raymond

Academic Medicine: May 2018 - Volume 93 - Issue 5 - p 781-785

 

In 2007, the United States Medical Licensing Examination embedded multimedia simulations of heart sounds into multiple-choice questions. This study investigated changes in item difficulty as determined by examinee performance over time. The data reflect outcomes obtained following initial use of multimedia items from 2007 through 2012, after which an interface change occurred.

Posted: | P.J. Hicks, M.J. Margolis, C.L. Carraccio, B.E. Clauser, K. Donnelly, H.B. Fromme, K.A. Gifford, S.E. Poynter, D.J. Schumacher, A. Schwartz & the PMAC Module 1 Study Group

Medical Teacher: Volume 40 - Issue 11 - p 1143-1150

 

This study explores a novel milestone-based workplace assessment system that was implemented in 15 pediatrics residency programs. The system provided web-based multisource feedback and structured clinical observation instruments that could be completed on any computer or mobile device, along with monthly feedback reports that included competency-level scores and recommendations for improvement.

Posted: | Z. Jiang, M.R. Raymond

Applied Psychological Measurement: Volume 42, Issue 8, Pages 595-612

 

Conventional methods for evaluating the utility of subscores rely on reliability and correlation coefficients. However, correlations can overlook a notable source of variability: variation in subtest means/difficulties. Brennan introduced a reliability index for score profiles based on multivariate generalizability theory, designated as G, which is sensitive to variation in subtest difficulty. However, there has been little, if any, research evaluating the properties of this index. A series of simulation experiments, as well as analyses of real data, were conducted to investigate G under various conditions of subtest reliability, subtest correlations, and variability in subtest means.

Posted: | I. Kirsch, W. Thorn, M. von Davier

Quality Assurance in Education, Vol. 26 No. 2, pp. 150-152

 

An introduction to a special issue of Quality Assurance in Education featuring papers based on presentations at a two-day international seminar on managing the quality of data collection in large-scale assessments.