
RESEARCH LIBRARY

View the latest publications from members of the NBME research team

Showing 1 - 5 of 5 Research Library Publications
Victoria Yaneva, Brian E. Clauser, Amy Morales, Miguel Paniagua

Journal of Educational Measurement: Volume 58, Issue 4, Pages 515-537


In this paper, the NBME team reports the results of an eye-tracking study designed to evaluate how the presence of the options in multiple-choice questions affects the way medical students respond to questions designed to assess clinical reasoning. Examples of the types of data that can be extracted are presented. We then discuss the implications of these results for evaluating the validity of inferences made based on the type of items used in this study.

M. G. Jodoin, J. D. Rubright

Educational Measurement: Issues and Practice


This short, invited manuscript focuses on the implications of the widespread disruptions caused by the COVID-19 pandemic for certification and licensure assessment organizations.

M. J. Margolis, M. von Davier, B. E. Clauser

Integrating Timing Considerations to Improve Testing Practices


This chapter addresses timing considerations in the context of performance assessments and reports on a previously unpublished experiment examining timing with respect to performance on the computer-based case simulations used in physician licensure.

M. J. Margolis, B. E. Clauser

Handbook of Automated Scoring


In this chapter we describe the historical background that led to the development of the simulations and the subsequent refinement of the construct that occurred as the interface was being developed. We then describe the evolution of the automated scoring procedures from linear regression modeling to rule-based procedures.

P. Baldwin, M. J. Margolis, B. E. Clauser, J. Mee, M. Winward

Educational Measurement: Issues and Practice, 39: 37-44


This article presents the results of an experiment in which content experts were randomly assigned to one of two response probability conditions: .67 or .80. If the standard-setting judgments collected with the bookmark procedure are internally consistent, both conditions should produce highly similar cut scores.