
RESEARCH LIBRARY

This library features a subset of NBME research, a body of work that includes over 1,500 publications dating back to 1923. Check back as we continue to add both new and previously published research.

Posted: November 3, 2020 | M.G. Tolsgaard, C.K. Boscardin, Y.S. Park, M.M. Cuddy, S.S. Sebok-Syer

Advances in Health Sciences Education, 25: 1057–1086 (2020)

This critical review explores (1) published applications of data science and machine learning (ML) in the health professions education (HPE) literature and (2) the potential role of data science and ML in shifting theoretical and epistemological perspectives in HPE research and practice.

Posted: June 25, 2020 | P. Harik, R.A. Feinberg, B.E. Clauser

Integrating Timing Considerations to Improve Testing Practices

This chapter addresses a different aspect of the use of timing data: it provides a framework for understanding how an examinee's use of time interacts with time limits to affect both test performance and the validity of inferences based on test scores. It focuses primarily on examinations administered as part of the physician licensure process.

Posted: June 25, 2020 | M.J. Margolis, R.A. Feinberg (eds)

Integrating Timing Considerations to Improve Testing Practices

This book synthesizes a wealth of theory and research on time issues in assessment into actionable advice for test development, administration, and scoring. 

Posted: June 25, 2020 | D. Jurich

Integrating Timing Considerations to Improve Testing Practices

This chapter presents a historical overview of the testing literature that exemplifies the theoretical and operational evolution of test speededness.

Posted: February 26, 2020 | B.C. Leventhal, I. Grabovsky

Educational Measurement: Issues and Practice, 39: 30–36

This article proposes the conscious weight method and the subconscious weight method to bring more objectivity to the standard-setting process. To do this, both methods quantify the relative harm of false-positive and false-negative misclassifications.

Posted: August 31, 2019 | M. von Davier, Y.S. Lee (eds)

Springer International Publishing; 2019

This handbook provides an overview of major developments around diagnostic classification models (DCMs) with regard to modeling, estimation, model checking, scoring, and applications. It brings together not only the current state of the art, but also the theoretical background and models developed for diagnostic classification.
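For readers unfamiliar with DCMs, the DINA model is one canonical example of the model family the handbook covers. The sketch below is illustrative only; it is not drawn from the handbook, and the function name and parameter values are assumptions:

```python
def dina_prob(alpha, q, slip, guess):
    """P(correct response) under the DINA model, a canonical DCM.

    alpha: examinee mastery vector (1 = skill mastered), e.g. [1, 0, 1]
    q:     item Q-matrix row (1 = skill required by the item)
    slip:  probability that a master answers incorrectly
    guess: probability that a non-master answers correctly
    """
    # eta is True only if the examinee masters every skill the item requires
    eta = all(a == 1 for a, req in zip(alpha, q) if req == 1)
    return (1 - slip) if eta else guess

# The examinee masters skills 1 and 3; the item requires exactly those
# skills, so the response probability is 1 - slip = 0.9.
print(dina_prob(alpha=[1, 0, 1], q=[1, 0, 1], slip=0.1, guess=0.2))
```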

Posted: January 16, 2019 | P. Baldwin, M.J. Margolis, B.E. Clauser, J. Mee, M. Winward

Educational Measurement: Issues and Practice, 39: 37–44

This article presents the results of an experiment in which content experts were randomly assigned to one of two response probability conditions: .67 and .80. If the standard-setting judgments collected with the bookmark procedure are internally consistent, both conditions should produce highly similar cut scores.
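To see why the response probability (RP) value matters, note that under a two-parameter logistic (2PL) IRT model, the ability level at which an item is answered correctly with a given probability has a closed form. The sketch below is an illustration under that assumption, with made-up item parameters rather than the study's items:

```python
import math

def bookmark_theta(a, b, rp):
    """Ability (theta) at which a 2PL item with discrimination `a` and
    difficulty `b` is answered correctly with probability `rp`:
    solve rp = 1 / (1 + exp(-a * (theta - b))) for theta.
    """
    return b + math.log(rp / (1 - rp)) / a

# The same item implies different ability levels under the two RP
# conditions, so internally consistent judges should bookmark different
# items and still converge on similar cut scores.
print(bookmark_theta(a=1.0, b=0.0, rp=0.67))  # ~0.71
print(bookmark_theta(a=1.0, b=0.0, rp=0.80))  # ~1.39
```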

Posted: October 30, 2018 | S. Pohl, M. von Davier

Frontiers in Psychology, 9: 1988 (2018)

In their 2018 article, Tijmstra and Bolsinova (T&B) discuss how to deal with items not reached because of low working speed in ability tests (Tijmstra and Bolsinova, 2018). An important contribution of the paper is its focus on the question of how to define the targeted ability measure. This note aims to add further aspects to that discussion and to propose alternative approaches.

Posted: October 25, 2018 | M.R. Raymond, C. Stevens, S.D. Bucak

Advances in Health Sciences Education, 24: 141–150 (2019)

Research suggests that the three-option format is optimal for multiple-choice questions (MCQs). This conclusion is supported by numerous studies showing that most distractors (i.e., incorrect answers) are selected by so few examinees that they are essentially nonfunctional. However, nearly all studies have defined a distractor as nonfunctional if it is selected by fewer than 5% of examinees.
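The 5% convention is simple to apply in practice. A minimal sketch, assuming responses are recorded as the option label each examinee selected (the function name and data layout are hypothetical, not taken from the study):

```python
from collections import Counter

def nonfunctional_distractors(responses, options, key, threshold=0.05):
    """Return distractors selected by fewer than `threshold` of examinees."""
    counts = Counter(responses)
    n = len(responses)
    return {
        opt: counts[opt] / n
        for opt in options
        if opt != key and counts[opt] / n < threshold
    }

# Option "D" draws 2 of 100 examinees (2% < 5%), so it is flagged as
# nonfunctional under the conventional definition.
responses = ["B"] * 60 + ["A"] * 20 + ["C"] * 18 + ["D"] * 2
print(nonfunctional_distractors(responses, options="ABCD", key="B"))
# -> {'D': 0.02}
```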

Posted: October 1, 2018 | Z. Cui, C. Liu, Y. He, H. Chen

Journal of Educational Measurement, 55: 582–594

This article proposes and evaluates a new method that implements computerized adaptive testing (CAT) without any restriction on item review. In particular, it evaluates the new method in terms of the accuracy of ability estimates and its robustness against test-manipulation strategies. The study shows that the newly proposed method is promising because it creates a win-win situation: examinees have full freedom to review and change answers, and the impact of test-manipulation strategies is undermined.