RESEARCH LIBRARY

View the latest publications from members of the NBME research team

Showing 11 - 20 of 50 Research Library Publications

Application of Sampling Variance of Item Response Theory Parameter Estimates in Detecting Outliers in Common Item Equating

Posted: June 14, 2022 | Chunyan Liu, Daniel Jurich

Applied Psychological Measurement: Volume 46, issue 6, page(s) 529-547

The current simulation study demonstrated that the sampling variance associated with the item response theory (IRT) item parameter estimates can help detect outliers in the common items under the 2-PL and 3-PL IRT models. The results showed the proposed sampling variance statistic (SV) outperformed the traditional displacement method with cutoff values of 0.3 and 0.5 along a variety of evaluation criteria.

Category:Assessment-Oriented Research, General Measurement

Historical Perspectives on Score Comparability Issues Raised by Innovations in Testing

Posted: May 11, 2022 | Peter Baldwin, Brian E. Clauser

Journal of Educational Measurement: Volume 59, Issue 2, Pages 140-160

A conceptual framework for thinking about the problem of score comparability is given, followed by a description of three classes of connectives. Examples from the history of innovations in testing are given for each class.

Category:Assessment-Oriented Research, Scoring

Digital Module 28: Unusual Things That Usually Occur in a Credentialing Testing Program

Posted: March 17, 2022 | Richard A. Feinberg, Carol Morrison, Mark R. Raymond

Educational Measurement: Issues and Practice: Volume 41 - Issue 1 - Pages 95-96

Unanticipated situations often arise that can create a range of problems, from threats to score validity, to unexpected financial costs, and even longer-term reputational damage. This module discusses some of these unusual challenges that usually occur in a credentialing program.

Category:Assessment-Oriented Research, General Measurement, Reliability/Validity

Leveraging Machine Learning Technology to Improve Accuracy and Efficiency of Identification of Enemy Item Pairs

Posted: January 1, 2022 | Ian Micir, Kimberly Swygert, Jean D'Angelo

Journal of Applied Testing Technology: Volume 23 - Special Issue 1 - Pages 30-40

The interpretations of test scores in secure, high-stakes environments are dependent on several assumptions, one of which is that examinee responses to items are independent and no enemy items are included on the same forms. This paper documents the development and implementation of a C#-based application that uses Natural Language Processing (NLP) and Machine Learning (ML) techniques to produce prioritized predictions of item enemy statuses within a large item bank.

Category:Assessment-Oriented Research, Scoring, Applications of Technology

Gender Comparison in Milestone Trajectories and Medical Knowledge Examination Scores among Internal Medicine Residents

Posted: May 25, 2021 | Karen E. Hauer, Daniel Jurich, Jonathan Vandergrift, Rebecca S. Lipner, Furman S. McDonald, Kenji Yamazaki, Davoren Chick, Kevin McAllister, Eric S. Holmboe

Academic Medicine: Volume 96 - Issue 6 - p 876-884

This study examines whether milestone ratings submitted by program directors working with clinical competency committees differ by gender for internal medicine residents, and whether women and men perform similarly on subsequent in-training and certification examinations.

Category:Assessment-Oriented Research, General Measurement

Automated Prediction Of Examinee Proficiency From Short-Answer Questions

Posted: December 10, 2020 | Le An Ha, Victoria Yaneva, Polina Harik, Ravi Pandian, Amy Morales, Brian Clauser

Proceedings of the 28th International Conference on Computational Linguistics

This paper brings together approaches from the fields of NLP and psychometric measurement to address the problem of predicting examinee proficiency from responses to short-answer questions (SAQs).

Category:Assessment-Oriented Research, Scoring

A Problem with the Bookmark Procedure's Correction for Guessing

Posted: November 24, 2020 | Peter Baldwin

Educational Measurement: Issues and Practice

This article aims to answer the question: when the assumption that examinees may apply themselves fully yet still respond incorrectly is violated, what are the consequences of using the modified model proposed by Lewis and his colleagues?

Category:Assessment-Oriented Research, General Measurement

The Role of Data Science and Machine Learning in Health Professions Education: Practical Applications, Theoretical Contributions, and Epistemic Beliefs

Posted: November 3, 2020 | Martin G. Tolsgaard, Christy K. Boscardin, Yoon Soo Park, Monica M. Cuddy, Stefanie S. Sebok-Syer

Advances in Health Sciences Education: Volume 25, p 1057–1086 (2020)

This critical review explores: (1) published applications of data science and machine learning (ML) in the health professions education (HPE) literature and (2) the potential role of data science and ML in shifting theoretical and epistemological perspectives in HPE research and practice.

Category:Assessment-Oriented Research, General Measurement, Health Professions

Reporting Subscore Profiles Using Diagnostic Classification Models in Health Professions Education

Posted: September 1, 2020 | Y.S. Park, A. Morales, L. Ross, M. Paniagua

Evaluation & the Health Professions: Volume: 43 issue: 3, page(s): 149-158

This study examines the innovative and practical application of the diagnostic classification model (DCM) framework to health professions educational assessments using retrospective large-scale assessment data from the basic and clinical sciences: National Board of Medical Examiners Subject Examinations in pathology (n = 2,006) and medicine (n = 2,351).

Category:Assessment-Oriented Research, Scoring, Product-Oriented Research, NBME

Correlations Between the USMLE Step Examinations, American College of Physicians In-Training Examination, and ABIM Internal Medicine Certification Examination

Posted: September 1, 2020 | F.S. McDonald, D. Jurich, L.M. Duhigg, M. Paniagua, D. Chick, M. Wells, A. Williams, P. Alguire

Academic Medicine: September 2020 - Volume 95 - Issue 9 - p 1388-1395

This article aims to assess the correlations between United States Medical Licensing Examination (USMLE) performance, American College of Physicians Internal Medicine In-Training Examination (IM-ITE) performance, American Board of Internal Medicine Internal Medicine Certification Exam (IM-CE) performance, and other medical knowledge and demographic variables.

Category:Assessment-Oriented Research, Scoring, Links to Outcomes, Product-Oriented Research, USMLE
