RESEARCH LIBRARY

View the latest publications from members of the NBME research team

Showing 11 - 20 of 24 Research Library Publications

A Comparison of Experimental and Observational Approaches to Assessing the Effects of Time Constraints in a Medical Licensing Examination

Posted: June 1, 2018 | P. Harik, B. E. Clauser, I. Grabovsky, P. Baldwin, M. Margolis, D. Bucak, M. Jodoin, W. Walsh, S. Haist

Journal of Educational Measurement: Volume 55, Issue 2, Pages 308-327

The widespread move to computerized test delivery has led to the development of new approaches to evaluating how examinees use testing time and to new metrics designed to provide evidence about the extent to which time limits impact performance. Much of the existing research is based on these types of observational metrics; relatively few studies use randomized experiments to evaluate the impact time limits on scores. Of those studies that do report on randomized experiments, none directly compare the experimental results to evidence from observational metrics to evaluate the extent to which these metrics are able to sensitively identify conditions in which time constraints actually impact scores. The present study provides such evidence based on data from a medical licensing examination.

Category:Assessment-Oriented Research, Reliability/Validity, Scoring, Product-Oriented Research, USMLE

The Effects of Vignette Scoring on Reliability and Validity of Self-Reports

Posted: June 1, 2018 | M. von Davier, J. H. Shin, L. Khorramdel, L. Stankov

Applied Psychological Measurement: Volume: 42 issue: 4, page(s): 291-306

The research presented in this article combines mathematical derivations and empirical results to investigate effects of the nonparametric anchoring vignette approach proposed by King, Murray, Salomon, and Tandon on the reliability and validity of rating data. The anchoring vignette approach aims to correct rating data for response styles to improve comparability across individuals and groups.

Category:Assessment-Oriented Research, Reliability/Validity, Scoring

Performance on the Nephrology In-Training Examination and ABIM Nephrology Certification Examination Outcomes

Posted: May 7, 2018 | D. Jurich, L. M. Duhigg, T. J. Plumb, S. A. Haist, J. L. Hawley, R. S. Lipner, L. Smith, S. M. Norby

CJASN May 2018, 13 (5) 710-717

Medical specialty and subspecialty fellowship programs administer subject-specific in-training examinations to provide feedback about level of medical knowledge to fellows preparing for subsequent board certification. This study evaluated the association between the American Society of Nephrology In-Training Examination and the American Board of Internal Medicine Nephrology Certification Examination in terms of scores and passing status.

Category:Assessment-Oriented Research, Reliability/Validity, Health Professions

A Novel Workplace-Based Assessment for Competency-Based Decisions and Learner Feedback

Posted: April 24, 2018 | P.J. Hicks, M.J. Margolis, C.L. Carraccio, B.E. Clauser, K. Donnelly, H.B. Fromme, K.A. Gifford, S.E. Poynter, D.J. Schumacher, A. Schwartz & the PMAC Module 1 Study Group

Medical Teacher: Volume 40 - Issue 11 - p 1143-1150

This study explores a novel milestone-based workplace assessment system that was implemented in 15 pediatrics residency programs. The system provided: web-based multisource feedback and structured clinical observation instruments that could be completed on any computer or mobile device; and monthly feedback reports that included competency-level scores and recommendations for improvement.

Category:Assessment-Oriented Research, General Measurement

The Use of Multivariate Generalizability Theory to Evaluate the Quality of Subscores

Posted: April 3, 2018 | Z. Jiang, M.R. Raymond

Applied Psychological Measurement: Volume: 42 issue: 8, page(s): 595-612

Conventional methods for evaluating the utility of subscores rely on reliability and correlation coefficients. However, correlations can overlook a notable source of variability: variation in subtest means/difficulties. Brennan introduced a reliability index for score profiles based on multivariate generalizability theory, designated as G, which is sensitive to variation in subtest difficulty. However, there has been little, if any, research evaluating the properties of this index. A series of simulation experiments, as well as analyses of real data, were conducted to investigate G under various conditions of subtest reliability, subtest correlations, and variability in subtest means.

Category:Assessment-Oriented Research, Reliability/Validity, Scoring

Guest Editorial

Posted: April 3, 2018 | I. Kirsch, W. Thorn, M. von Davier

Quality Assurance in Education, Vol. 26 No. 2, pp. 150-152

An introduction to a special issue of Quality Assurance in Education featuring papers based on presentations at a two-day international seminar on managing the quality of data collection in large-scale assessments.

Category:Assessment-Oriented Research, General Measurement, Applications of Technology

Detecting and Treating Errors in Tests and Surveys

Posted: April 3, 2018 | M. von Davier

Quality Assurance in Education, Vol. 26 No. 2, pp. 243-262

Surveys that include skill measures may suffer from additional sources of error compared to those containing questionnaires alone. Examples are distractions such as noise or interruptions of testing sessions, as well as fatigue or lack of motivation to succeed. This paper aims to provide a review of statistical tools based on latent variable modeling approaches extended by explanatory variables that allow detection of survey errors in skill surveys.

Category:Assessment-Oriented Research, Other

Effects and Unforeseen Consequences of Accessing References on a Maintenance of Certification Examination

Posted: April 1, 2018 | R. A. Feinberg, D. P. Jurich, L. M. Foster

Academic Medicine: April 2018 - Volume 93 - Issue 4 - p 636-641

Increasing criticism of maintenance of certification (MOC) examinations has prompted certifying boards to explore alternative assessment formats. The purpose of this study was to examine the effect of allowing test takers to access reference material while completing their MOC Part III standardized examination.

Category:Assessment-Oriented Research, General Measurement

Diagnosing Diagnostic Models: From Von Neumann’s Elephant to Model Equivalencies and Network Psychometrics

Posted: March 30, 2018 | M. von Davier

Measurement: Interdisciplinary Research and Perspectives, 16:1, 59-70

This article critically reviews how diagnostic models have been conceptualized and how they compare to other approaches used in educational measurement. In particular, certain assumptions that have been taken for granted and used as defining characteristics of diagnostic models are reviewed and it is questioned whether these assumptions are the reason why these models have not had the success in operational analyses and large-scale applications, contrary to what many have hoped.

Category:Assessment-Oriented Research, General Measurement

It's Happening Sooner Than You Think: Spotlighting the Pre-Medical Realm

Posted: March 25, 2018 | B. Michalec, M. M. Cuddy, P. Hafferty, M. D. Hanson, S. L. Kanter, D. Littleton, M. A. T. Martimianakis, R. Michaels, F. W. Hafferty

Med Educ, 52: 359-361

Focusing specifically on examples set in the context of movement from Bachelor's level undergraduate programmes to enrolment in medical school, this publication argues that a great deal of what happens on college campuses today, curricular and otherwise, is (in)directly driven by the not‐so‐invisible hand of the medical education enterprise.

Category:Assessment-Oriented Research, Links to Outcomes, Health Professions

NBME Self-Assessment Bundles

Stay Up to Date

Stay Up to Date

New Psychometric Workshops

INSIGHTS℠ Demo

Open Grant Opportunities

RESEARCH LIBRARY

Filter:

A Comparison of Experimental and Observational Approaches to Assessing the Effects of Time Constraints in a Medical Licensing Examination

The Effects of Vignette Scoring on Reliability and Validity of Self-Reports

Performance on the Nephrology In-Training Examination and ABIM Nephrology Certification Examination Outcomes

A Novel Workplace-Based Assessment for Competency-Based Decisions and Learner Feedback

The Use of Multivariate Generalizability Theory to Evaluate the Quality of Subscores

Guest Editorial

Detecting and Treating Errors in Tests and Surveys

Effects and Unforeseen Consequences of Accessing References on a Maintenance of Certification Examination

Diagnosing Diagnostic Models: From Von Neumann’s Elephant to Model Equivalencies and Network Psychometrics

It's Happening Sooner Than You Think: Spotlighting the Pre-Medical Realm