RESEARCH LIBRARY

View the latest publications from members of the NBME research team

Integrating Timing Considerations to Improve Testing Practices

Posted: June 25, 2020 | M.J. Margolis, R.A. Feinberg (eds)

Integrating Timing Considerations to Improve Testing Practices

This book synthesizes a wealth of theory and research on time issues in assessment into actionable advice for test development, administration, and scoring.

Category: Assessment-Oriented Research, General Measurement

A History of Test Speededness: Tracing the Evolution of Theory and Practice

Posted: June 25, 2020 | D. Jurich

Integrating Timing Considerations to Improve Testing Practices

This chapter presents a historical overview of the testing literature that exemplifies the theoretical and operational evolution of test speededness.

Category: Assessment-Oriented Research, General Measurement, Reliability/Validity

Examining the Precision of Cut Scores Within a Generalizability Theory Framework: A Closer Look at the Item Effect

Posted: June 3, 2020 | B. E. Clauser, M. Kane, J. C. Clauser

Journal of Educational Measurement: Volume 57, Issue 2, Pages 216-229

This article presents two generalizability-theory–based analyses of the proportion of the item variance that contributes to error in the cut score. For one approach, variance components are estimated on the probability (or proportion-correct) scale of the Angoff judgments, and for the other, the judgments are transferred to the theta scale of an item response theory model before estimating the variance components.

Category: Assessment-Oriented Research, Reliability/Validity

Handbook of Diagnostic Classification Models

Posted: August 31, 2019 | M. von Davier, Y.S. Lee

Springer International Publishing; 2019

This handbook provides an overview of major developments around diagnostic classification models (DCMs) with regard to modeling, estimation, model checking, scoring, and applications. It brings together not only the current state of the art, but also the theoretical background and models developed for diagnostic classification.

Category: Assessment-Oriented Research, General Measurement, Scoring

Visualizing Hierarchical Score Inferences

Posted: June 6, 2019 | R.A. Feinberg, D.P. Jurich

On the Cover. Educational Measurement: Issues and Practice, 38: 5-5

This informative graphic reports between-individual information: a vertical line, with dashed lines on either side indicating an error band, spans three graphics, allowing a student to easily see their score relative to four defined performance categories and, more notably, three relevant score distributions.

Category: Assessment-Oriented Research, Scoring

Moving the United States Medical Licensing Examination Step 1 After Core Clerkships: An Outcomes Analysis

Posted: March 1, 2019 | D. Jurich, M. Daniel, M. Paniagua, A. Fleming, V. Harnik, A. Pock, A. Swan-Sein, M. A. Barone, S.A. Santen

Academic Medicine: March 2019 - Volume 94 - Issue 3 - p 371-377

Schools undergoing curricular reform are reconsidering the optimal timing of Step 1. This study provides a psychometric investigation of the impact on United States Medical Licensing Examination Step 1 scores of changing the timing of Step 1 from after completion of the basic science curricula to after core clerkships.

Category: Product-Oriented Research, USMLE

Leveraging Natural Language Processing: Toward Computer-Assisted Scoring of Patient Notes in the USMLE Step 2 Clinical Skills Exam

Posted: March 1, 2019 | J. Salt, P. Harik, M. A. Barone

Academic Medicine: March 2019 - Volume 94 - Issue 3 - p 314-316

The United States Medical Licensing Examination Step 2 Clinical Skills (CS) exam uses physician raters to evaluate patient notes written by examinees. In this Invited Commentary, the authors describe the ways in which the Step 2 CS exam could benefit from adopting a computer-assisted scoring approach that combines physician raters’ judgments with computer-generated scores based on natural language processing (NLP).

Category: Assessment-Oriented Research, Scoring, Applications of Technology, Product-Oriented Research, USMLE

Effects of Discontinue Rules on Psychometric Properties of Test Scores

Posted: January 3, 2019 | M. von Davier, Y. Cho, T. Pan

Psychometrika 84, 147–163 (2019)

This paper provides results on a form of adaptive testing that is used frequently in intelligence testing. In these tests, items are presented in order of increasing difficulty. The presentation of items is adaptive in the sense that a session is discontinued once a test taker produces a certain number of incorrect responses in sequence, with subsequent (not observed) responses commonly scored as wrong.

Category: Assessment-Oriented Research, Scoring

A Comparison of Strategies for Smoothing Parameter Selection for Mixed-Format Tests Under the Random Groups Design

Posted: December 1, 2018 | C. Liu, M. J. Kolen

Journal of Educational Measurement: Volume 55, Issue 4, Pages 564-581

Smoothing techniques are designed to improve the accuracy of equating functions. The main purpose of this study is to compare seven model selection strategies for choosing the smoothing parameter (C) for polynomial loglinear presmoothing and one procedure for model selection in cubic spline postsmoothing for mixed‐format pseudo tests under the random groups design.

Category: Assessment-Oriented Research, Reliability/Validity, Scoring

Does Incorporating a Measure of Clinical Workload Improve Workplace-Based Assessment Scores? Insights for Measurement Precision and Longitudinal Score Growth From Ten Pediatrics Residency Programs

Posted: October 30, 2018 | Y.S. Park, P.J. Hicks, C. Carraccio, M. Margolis, A. Schwartz

Academic Medicine: November 2018 - Volume 93 - Issue 11S - p S21-S29

This study investigates the impact of incorporating observer-reported workload into workplace-based assessment (WBA) scores on (1) psychometric characteristics of WBA scores and (2) measuring changes in performance over time using workload-unadjusted versus workload-adjusted scores.

Category: Assessment-Oriented Research, Scoring
