RESEARCH LIBRARY

View recent publications to learn how the NBME research team is working to improve our products and services, advance the field of assessment science, and support the health professions.

Posted: March 1, 2019 | D. Jurich, M. Daniel, M. Paniagua, A. Fleming, V. Harnik, A. Pock, A. Swan-Sein, M. A. Barone, S.A. Santen

Academic Medicine: March 2019 - Volume 94 - Issue 3 - p 371-377

 

Schools undergoing curricular reform are reconsidering the optimal timing of Step 1. This study provides a psychometric investigation of how moving United States Medical Licensing Examination Step 1 from after completion of the basic science curricula to after core clerkships affects Step 1 scores.

Posted: March 1, 2019 | E. Knekta, C. Runyon, S. Eddy

CBE—Life Sciences Education Vol. 18, No. 1

 

This article briefly reviews the aspects of validity that researchers should consider when using surveys. It then focuses on factor analysis, a statistical method that can be used to collect an important type of validity evidence.
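A minimal sketch of the structural step the article describes: one common way to begin gathering factor-analytic validity evidence for a survey is to inspect the eigenvalues of the item correlation matrix. The simulated scale, item loadings, and Kaiser retention rule below are illustrative assumptions, not the article's own analysis.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical survey: 500 respondents answer 6 Likert-type items that
# are all driven by a single underlying trait, so we expect one factor.
trait = rng.normal(size=(500, 1))
items = trait + 0.8 * rng.normal(size=(500, 6))

corr = np.corrcoef(items, rowvar=False)
eigenvalues = np.linalg.eigvalsh(corr)[::-1]  # largest first

# Kaiser criterion (one common retention rule): keep factors with
# eigenvalue > 1. Here the single simulated trait should dominate.
n_factors = int(np.sum(eigenvalues > 1))
print(n_factors)  # -> 1
```

In practice a full exploratory factor analysis (e.g., with rotation) would follow, but the eigenvalue screen already signals how many dimensions the response data support.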

Posted: March 1, 2019 | J. Salt, P. Harik, M. A. Barone

Academic Medicine: March 2019 - Volume 94 - Issue 3 - p 314-316

 

The United States Medical Licensing Examination Step 2 Clinical Skills (CS) exam uses physician raters to evaluate patient notes written by examinees. In this Invited Commentary, the authors describe the ways in which the Step 2 CS exam could benefit from adopting a computer-assisted scoring approach that combines physician raters’ judgments with computer-generated scores based on natural language processing (NLP).

Posted: January 16, 2019 | P. Baldwin, M.J. Margolis, B.E. Clauser, J. Mee, M. Winward

Educational Measurement: Issues and Practice, 39: 37-44

 

This article presents the results of an experiment in which content experts were randomly assigned to one of two response probability conditions: .67 and .80. If the standard-setting judgments collected with the bookmark procedure are internally consistent, both conditions should produce highly similar cut scores.
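To see why the two response probability (RP) conditions are comparable at all, note that under a Rasch model a bookmark placed on an item of difficulty b implies a cut score theta at which P(correct) = RP. The item difficulty below is a made-up example; the point is that the same bookmark placement implies different cut scores under RP = .67 and RP = .80, so internally consistent judges must shift their bookmarks to compensate.

```python
import math

def rasch_cut_theta(b, rp):
    """Ability at which P(correct) = rp for a Rasch item of difficulty b:
    solve rp = 1 / (1 + exp(-(theta - b))), i.e. theta = b + ln(rp/(1-rp))."""
    return b + math.log(rp / (1 - rp))

# Hypothetical bookmark placed on an item of difficulty b = 0.0.
theta_67 = rasch_cut_theta(0.0, 0.67)  # ~0.708
theta_80 = rasch_cut_theta(0.0, 0.80)  # ~1.386

print(round(theta_67, 3), round(theta_80, 3))
```

If panelists' judgments are internally consistent, the higher RP condition should lead them to bookmark easier items, leaving the implied cut scores nearly identical across conditions.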

Posted: January 3, 2019 | M. von Davier, Y. Cho, T. Pan

Psychometrika 84, 147–163 (2019)

 

This paper provides results on a form of adaptive testing that is used frequently in intelligence testing. In these tests, items are presented in order of increasing difficulty. The presentation of items is adaptive in the sense that a session is discontinued once a test taker produces a certain number of incorrect responses in sequence, with subsequent (not observed) responses commonly scored as wrong.
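The discontinue rule described above can be sketched directly; the function name and the streak length k = 3 are illustrative choices, not the paper's specification.

```python
def apply_discontinue_rule(responses, k=3):
    """Score an ordered (easy-to-hard) response vector under a discontinue
    rule: once k consecutive incorrect responses occur, the session stops,
    and all subsequent (unobserved) items are scored as wrong.

    responses: list of 0/1 in presentation order.
    Returns (scored vector, number of items actually administered).
    """
    scored = []
    streak = 0
    administered = 0
    for r in responses:
        scored.append(r)
        administered += 1
        streak = streak + 1 if r == 0 else 0
        if streak == k:
            break
    # Not-reached items are conventionally scored as wrong.
    scored.extend([0] * (len(responses) - administered))
    return scored, administered

scored, n = apply_discontinue_rule([1, 1, 0, 1, 0, 0, 0, 1, 1], k=3)
print(scored, n)  # [1, 1, 0, 1, 0, 0, 0, 0, 0] 7
```

Note that the two correct responses after the stopping point are never observed and are scored as wrong, which is exactly the scoring convention the paper examines.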

Posted: January 1, 2019 | M. Paniagua, P. Katsufrakis

Investigación en Educación Médica, Vol. 8, Núm. 29, 2019

Posted: December 1, 2018 | C. Liu, M. J. Kolen

Journal of Educational Measurement: Volume 55, Issue 4, Pages 564-581

 

Smoothing techniques are designed to improve the accuracy of equating functions. The main purpose of this study is to compare seven model selection strategies for choosing the smoothing parameter (C) for polynomial loglinear presmoothing and one procedure for model selection in cubic spline postsmoothing for mixed‐format pseudo tests under the random groups design.
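A minimal sketch of one of the ingredients above: polynomial loglinear presmoothing fits log m_x = beta_0 + beta_1 x + ... + beta_C x^C to observed score frequencies by Poisson maximum likelihood, and a model-selection strategy (AIC is used here purely as an example; the study compares seven strategies) chooses the smoothing parameter C. The simulated score distribution and Newton-Raphson fit below are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

scores = np.arange(21)                       # possible scores 0..20
true_p = np.exp(-0.5 * ((scores - 12) / 4) ** 2)
true_p /= true_p.sum()
n_obs = rng.multinomial(1000, true_p).astype(float)  # observed frequencies

def fit_loglinear(n, x, C, iters=50):
    """Poisson ML fit of a degree-C loglinear model; returns (m_hat, loglik)."""
    xs = (x - x.mean()) / x.std()            # standardize to keep powers stable
    X = np.vander(xs, C + 1, increasing=True)
    beta = np.zeros(C + 1)
    beta[0] = np.log(n.mean() + 1.0)
    for _ in range(iters):                   # Newton-Raphson on concave loglik
        m = np.exp(X @ beta)
        grad = X.T @ (n - m)
        hess = X.T @ (m[:, None] * X)
        beta += np.linalg.solve(hess, grad)
    m = np.exp(X @ beta)
    loglik = np.sum(n * np.log(m) - m)       # log n! term omitted (constant)
    return m, loglik

aic = {}
for C in range(1, 6):
    m_hat, loglik = fit_loglinear(n_obs, scores, C)
    aic[C] = 2 * (C + 1) - 2 * loglik        # one example selection criterion

best_C = min(aic, key=aic.get)
m_best, _ = fit_loglinear(n_obs, scores, best_C)
print(best_C, round(m_best.sum()))           # ML fit preserves total frequency
```

Because the model includes an intercept, the maximum likelihood equations force the smoothed frequencies to sum to the observed total, one of the moment-preservation properties that makes loglinear presmoothing attractive for equating.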

Posted: October 30, 2018 | Y.S. Park, P.J. Hicks, C. Carraccio, M. Margolis, A. Schwartz

Academic Medicine: November 2018 - Volume 93 - Issue 11S - p S21-S29

 

This study investigates the impact of incorporating observer-reported workload into workplace-based assessment (WBA) scores on (1) psychometric characteristics of WBA scores and (2) measuring changes in performance over time using workload-unadjusted versus workload-adjusted scores.

Posted: October 30, 2018 | S. Pohl, M. von Davier

Front. Psychol. 9:1988

 

In their 2018 article, Tijmstra and Bolsinova (T&B) discuss how to deal with not-reached items arising from low working speed in ability tests (Tijmstra and Bolsinova, 2018). An important contribution of the paper is its focus on the question of how to define the targeted ability measure. This note aims to add further aspects to that discussion and to propose alternative approaches.
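The definitional issue at stake can be made concrete with a toy example. The two treatments below (score not-reached items as wrong vs. base the measure on attempted items only) are standard alternatives used here for illustration; they are not presented as the note's own proposals.

```python
def proportion_correct(responses, treat_not_reached_as_wrong):
    """responses: list of 1 (correct), 0 (incorrect), or None (not reached).
    Scoring not-reached items as wrong folds working speed into the
    ability measure; ignoring them bases the measure on attempted items."""
    if treat_not_reached_as_wrong:
        scored = [0 if r is None else r for r in responses]
        return sum(scored) / len(scored)
    attempted = [r for r in responses if r is not None]
    return sum(attempted) / len(attempted)

# A slow but accurate test taker: 8 of 10 attempted correct, 10 not reached.
resp = [1, 1, 1, 0, 1, 1, 0, 1, 1, 1] + [None] * 10
print(proportion_correct(resp, True))   # 0.4 -> speed depresses the score
print(proportion_correct(resp, False))  # 0.8 -> attempted items only
```

The two numbers answer different questions about the same test taker, which is exactly why defining the targeted ability measure must precede choosing a treatment for not-reached items.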

Posted: October 25, 2018 | M.R. Raymond, C. Stevens, S.D. Bucak

Adv in Health Sci Educ 24, 141–150 (2019)

 

Research suggests that the three-option format is optimal for multiple choice questions (MCQs). This conclusion is supported by numerous studies showing that most distractors (i.e., incorrect answers) are selected by so few examinees that they are essentially nonfunctional. However, nearly all studies have defined a distractor as nonfunctional if it is selected by fewer than 5% of examinees.
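The 5% convention the study questions is easy to operationalize. The item data below are fabricated for illustration, and the sketch assumes every option was selected at least once (a fully zero-count distractor would need the option list passed in separately).

```python
from collections import Counter

def nonfunctional_distractors(choices, key, threshold=0.05):
    """Flag distractors chosen by fewer than `threshold` of examinees.
    choices: one selected option per examinee; key: the correct answer."""
    counts = Counter(choices)
    n = len(choices)
    return sorted(
        opt for opt, c in counts.items()
        if opt != key and c / n < threshold
    )

# Hypothetical item: 100 examinees, correct answer "B".
choices = ["B"] * 70 + ["A"] * 20 + ["C"] * 7 + ["D"] * 3
print(nonfunctional_distractors(choices, key="B"))  # ['D'] under the 5% rule
```

Raising or lowering `threshold` changes which distractors count as nonfunctional, which is precisely the definitional choice the study revisits.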