RESEARCH LIBRARY

View the latest publications from members of the NBME research team

Showing 1 - 5 of 5 Research Library Publications

Three Sources of Validation Evidence Needed to Evaluate the Quality of Generated Test Items for Medical Licensure

Posted: August 21, 2022 | Mark Gierl, Kimberly Swygert, Donna Matovinovic, Allison Kulesher, Hollis Lai

Teaching and Learning in Medicine: Volume 33 - Issue 4 - p 366-381

The purpose of this analysis is to describe these sources of evidence that can be used to evaluate the quality of generated items. The important role of medical expertise in the development and evaluation of the generated items is highlighted as a crucial requirement for producing validation evidence.

Category:Assessment-Oriented Research, Other

Application of Sampling Variance of Item Response Theory Parameter Estimates in Detecting Outliers in Common Item Equating

Posted: June 14, 2022 | Chunyan Liu, Daniel Jurich

Applied Psychological Measurement: Volume 46, issue 6, page(s) 529-547

The current simulation study demonstrated that the sampling variance associated with the item response theory (IRT) item parameter estimates can help detect outliers in the common items under the 2-PL and 3-PL IRT models. The results showed the proposed sampling variance statistic (SV) outperformed the traditional displacement method with cutoff values of 0.3 and 0.5 along a variety of evaluation criteria.

Category:Assessment-Oriented Research, General Measurement

Historical Perspectives on Score Comparability Issues Raised by Innovations in Testing

Posted: May 11, 2022 | Peter Baldwin, Brian E. Clauser

Journal of Educational Measurement: Volume 59, Issue 2, Pages 140-160

A conceptual framework for thinking about the problem of score comparability is given followed by a description of three classes of connectives. Examples from the history of innovations in testing are given for each class.

Category:Assessment-Oriented Research, Scoring

How Examinees Use Time

Posted: June 25, 2020 | P. Harik, R.A. Feinberg RA, B.E. Clauser

Integrating Timing Considerations to Improve Testing Practices

This chapter addresses a different aspect of the use of timing data: it provides a framework for understanding how an examinee's use of time interfaces with time limits to impact both test performance and the validity of inferences made based on test scores. It focuses primarily on examinations that are administered as part of the physician licensure process.

Category:Assessment-Oriented Research, General Measurement, Reliability/Validity

The Optimal Number of Options for Multiple-Choice Questions on High-Stakes Tests: Application of a Revised Index for Detecting Nonfunctional Distractors

Posted: October 25, 2018 | M.R. Raymond, C. Stevens, S.D. Bucak

Adv in Health Sci Educ 24, 141–150 (2019)

Research suggests that the three-option format is optimal for multiple choice questions (MCQs). This conclusion is supported by numerous studies showing that most distractors (i.e., incorrect answers) are selected by so few examinees that they are essentially nonfunctional. However, nearly all studies have defined a distractor as nonfunctional if it is selected by fewer than 5% of examinees.

Category:Assessment-Oriented Research, General Measurement

Stay Up to Date

USMLE® Fee Assistance

Communication Learning Assessment

Introduction to Measurement Concepts: Validity and Reliability

NBME Academy

Latin America Grants