Woman consulting library books

RESEARCH LIBRARY

This library features a subset of NBME research, which includes over 1,500 publications dating back to 1923. Check back as we continue to add both new and previously published research. 

View Historical Publications
Showing 1 - 10 of 34 Research Library Publications
Posted: November 22, 2021 | D. Jurich, M. Daniel, K.E. Hauer, C. Seibert, L. Chandran, A.R. Pock, S.B. Fazio, A. Fleming, S.A. Santeni

Teaching and Learning in Medicine: Volume 33 - Issue 4 - p 366-381

 

CSE scores for students from eight schools that moved Step 1 after core clerkships between 2012 and 2016 were analyzed in a pre-post format. Hierarchical linear modeling was used to quantify the effect of the curriculum on CSE performance. Additional analysis determined if clerkship order impacted clinical subject exam performance and whether the curriculum change resulted in more students scoring in the lowest percentiles before and after the curricular change.

Posted: November 1, 2020 | H. Kimberly, P.J. Hicks, M.J. Margolis, C. Carraccio, A. Osta, M.L. Winward, A. Schwartz

Academic Medicine: November 2020 - Volume 95 - Issue 11S - p S89-S94

 

Semiannually, U.S. pediatrics residency programs report resident milestone levels to the Accreditation Council for Graduate Medical Education (ACGME). The Pediatrics Milestones Assessment Collaborative (PMAC) developed workplace-based assessments of 2 inferences. The authors compared learner and program variance in PMAC scores with ACGME milestones.

Posted: June 25, 2020 | P. Harik, R.A. Feinberg RA, B.E. Clauser

Integrating Timing Considerations to Improve Testing Practices

 

This chapter addresses a different aspect of the use of timing data: it provides a framework for understanding how an examinee's use of time interfaces with time limits to impact both test performance and the validity of inferences made based on test scores. It focuses primarily on examinations that are administered as part of the physician licensure process.

Posted: March 17, 2020 | R.A. Feinberg, M. von Davier

Journal of Educational and Behavioral Statistics: Vol 45, Issue 5, 2020

 

This article describes a method for identifying and reporting unexpectedly high or low subscores by comparing each examinee’s observed subscore with a discrete probability distribution of subscores conditional on the examinee’s overall ability.

Posted: August 31, 2019 | M. von Davier, YS. Lee

Springer International Publishing; 2019

 

This handbook provides an overview of major developments around diagnostic classification models (DCMs) with regard to modeling, estimation, model checking, scoring, and applications. It brings together not only the current state of the art, but also the theoretical background and models developed for diagnostic classification.

Posted: June 6, 2019 | R.A. Feinberg, D.P Jurich

On the Cover. Educational Measurement: Issues and Practice, 38: 5-5

 

This informative graphic reports between‐individual information where a vertical line—with dashed lines on either side indicating an error band—spans three graphics allowing a student to easily see their score relative to four defined performance categories and, more notably, three relevant score distributions.

Posted: March 1, 2019 | E. Knetka, C. Runyon, S. Eddy

CBE—Life Sciences Education Vol. 18, No. 1

 

This article briefly reviews the aspects of validity that researchers should consider when using surveys. It then focuses on factor analysis, a statistical method that can be used to collect an important type of validity evidence.

Posted: March 1, 2019 | J. Salt, P. Harik, M. A. Barone

Academic Medicine: March 2019 - Volume 94 - Issue 3 - p 314-316

 

The United States Medical Licensing Examination Step 2 Clinical Skills (CS) exam uses physician raters to evaluate patient notes written by examinees. In this Invited Commentary, the authors describe the ways in which the Step 2 CS exam could benefit from adopting a computer-assisted scoring approach that combines physician raters’ judgments with computer-generated scores based on natural language processing (NLP).

Posted: January 3, 2019 | M. von Davier, Y. Cho, T. Pan

Psychometrika 84, 147–163 (2019)

 

This paper provides results on a form of adaptive testing that is used frequently in intelligence testing. In these tests, items are presented in order of increasing difficulty. The presentation of items is adaptive in the sense that a session is discontinued once a test taker produces a certain number of incorrect responses in sequence, with subsequent (not observed) responses commonly scored as wrong.

Posted: December 1, 2018 | C. Liu, M. J. Kolen

Journal of Educational Measurement, 55: 564-581

 

Smoothing techniques are designed to improve the accuracy of equating functions. The main purpose of this study is to compare seven model selection strategies for choosing the smoothing parameter (C) for polynomial loglinear presmoothing and one procedure for model selection in cubic spline postsmoothing for mixed‐format pseudo tests under the random groups design.