Date Updated: May 29, 2018

Staff Presentations 2008-Present

NBME® staff members present their research findings on a wide array of subjects at conferences around the world. In the past ten years, some of the most consistently revisited topics have been the evaluation of clinical skills, the use of standardized patients, the assessment of professionalism, and progress testing. Most recently, staff members have investigated the topic of exam timing in depth. Much of that research was presented at the Timing Impact on Measurement in Education (TIME) conference, hosted by the NBME in Philadelphia, PA, on October 9-10, 2017.

Please contact the Office of Research at ors@nbme.org if you would like more information about a presentation.

These presentations are owned and copyrighted by the National Board of Medical Examiners® (NBME®). All rights reserved.


    Baldwin P, King A, Margolis M, Clauser BE, Mee J. Standard setting for the USMLE Step 2 Clinical Skills Exam: a case study. Ottawa Conference; March 2018. Abu Dhabi, United Arab Emirates.
    Braun H, von Davier M. The use of test scores from large-scale assessments: psychometric and statistical considerations. National Council on Measurement in Education (NCME) Annual Meeting; April 2018. New York, NY.
    Camara WJ, Ho A, Burstin H, Haist SA, Kadriye E. How is health measurement like educational measurement? Toward cross-disciplinary standards. American Educational Research Association (AERA) Annual Meeting; April 2018. New York, NY.
    Clauser A, Jurich D. Providing actionable feedback on a high-stakes licensure examination. National Council on Measurement in Education (NCME) Annual Meeting; April 2018. New York, NY.
    Clauser A, Park Y. Examining the impact of profile band feedback for failing examinees. American Educational Research Association (AERA) Annual Meeting; April 2018. New York, NY.
    Clauser B, Margolis M, Sireci S, Camara W, van der Linden W. Using timing information to create better tests. Association of Test Publishers (ATP) Innovations in Testing 2018; February 2018. San Antonio, TX.
    Cuddy M, Pittenger A, O'Brien B, Hanson J, McDonough S, Soh M, Paniagua M. Assessment issues in interprofessional collaboration in the healthcare professions. American Educational Research Association (AERA) Annual Meeting; April 2018. New York, NY.
    Dekhtyar M, Ross LP, D'Angelo J, Guernsey J, Hawkins RE. Validity of the Health Systems Science Examination: relationship between examinee performance, instruction, and curriculum design. American Educational Research Association (AERA) Annual Meeting; April 2018. New York, NY.
    Dubas U, Harik P, Cuddy M, Murray C, Artman C. Developing scoring guidelines for performance assessments using expert judgments: an exploratory study. National Council on Measurement in Education (NCME) Annual Meeting; April 2018. New York, NY.
    Feinberg R. The reality of subscore reporting: balancing measurement and policy perspectives. National Council on Measurement in Education (NCME) Annual Meeting; April 2018. New York, NY.
    Feinberg R, Baldwin P. Investigating validity and precision from shortening a test in response to speededness. National Council on Measurement in Education (NCME) Annual Meeting; April 2018. New York, NY.
    Feinberg R, Jurich D. Using rapid responses to evaluate test speededness. National Council on Measurement in Education (NCME) Annual Meeting; April 2018. New York, NY.
    Grabovsky I, Clauser A. Estimating stringency parameters for standardized patients in a large-scale performance assessment. American Educational Research Association (AERA) Annual Meeting; April 2018. New York, NY.
    Grabovsky I, Leventhal B, Wainer H. Cut-scores that minimize ultimate classification error in test batteries. National Council on Measurement in Education (NCME) Annual Meeting; April 2018. New York, NY.
    He Q, Shin HJ, Lennon ML, Chen H, von Davier M. Producing a reliable collaborative problem solving scale in PISA. National Council on Measurement in Education (NCME) Annual Meeting; April 2018. New York, NY.
    Henderek J, Rubright J. Examining standard practice: how published reports on test characteristics align with professional guidelines. American Educational Research Association (AERA) Annual Meeting; April 2018. New York, NY.
    Huh N, Xie Q, Liu C, Huang C. Exploring effectiveness of sequential monitoring procedures in detecting suspicious examinees in CAT. National Council on Measurement in Education (NCME) Annual Meeting; April 2018. New York, NY.
    Jodoin M. Key elements to effective score reporting. Simposio Internacional de Qualificacao Medica; April 2018. Sao Paulo, Brazil.
    Jurich DP. Advice for a career in measurement and research methodology. American Educational Research Association (AERA) Annual Meeting; April 2018. New York, NY.
    King A, Boulet J, Jodoin M, Philbert I, Anderson M, De Champlain A. Fundamentals of Assessment in Medical Education (FAME) course. Ottawa Conference; March 2018. Abu Dhabi, United Arab Emirates.
    Liao D, von Davier M, Rubright J. Exploring alternative scoring methods for large-scale computer-based case simulations. National Council on Measurement in Education (NCME) Annual Meeting; April 2018. New York, NY.
    Margolis MJ, Clauser BE, Schwartz A, Carraccio C, Hicks PJ. Using item statistics to revise an observational assessment instrument. American Educational Research Association (AERA) Annual Meeting; April 2018. New York, NY.
    Mee J, Clauser BE, Baldwin P, Margolis M, Winward M. An experimental evaluation of structured feedback in Angoff standard setting. National Council on Measurement in Education (NCME) Annual Meeting; April 2018. New York, NY.
    Mee J, Margolis M, Winward M, Clauser B, McEllhenney S. Results of the USMLE Standard Setting Survey. Ottawa Conference; March 2018. Abu Dhabi, United Arab Emirates.
    Morriss B, Dreusicke M, King A, Manning L, Wolever R. Using health coaching skills to enhance patient engagement and autonomy. International Congress on Integrative Medicine and Health; May 2018. Baltimore, MD.
    Ouyang W, Harik P, Torok S, Clauser B. Investigation of answer change in the Step 2 Clinical Knowledge (CK) Examination. American Educational Research Association (AERA) Annual Meeting; April 2018. New York, NY.
    Pak S, Welch C, Dunbar S. Comparison of short-length testlet-based CAT and MST under Rasch testlet models. National Council on Measurement in Education (NCME) Annual Meeting; April 2018. New York, NY.
    Paniagua MA, Daniels C, Hatch R, et al. Update on the NBME Shelf Exam and opportunity for input. Society of Teachers of Family Medicine: Conference on Medical Student Education; February 2018. Austin, TX.
    Park YS, Morales A, Ross LP, Paniagua MA. Reporting subscore profiles using diagnostic classification models in health professions education. American Educational Research Association (AERA) Annual Meeting; April 2018. New York, NY.
    Raymond MR. Low-scoring examinees have more variable score profiles: more than just error? National Council on Measurement in Education (NCME) Annual Meeting; April 2018. New York, NY.
    Raymond MR. Maintaining quality assessments in the face of change. National Council on Measurement in Education (NCME) Annual Meeting; April 2018. New York, NY.
    Raymond MR. Tackling practical issues in small sample scaling and equating. National Council on Measurement in Education (NCME) Annual Meeting; April 2018. New York, NY.
    Raymond MR, Clauser A. Specifying the content of credentialing examinations. National Council on Measurement in Education (NCME) Annual Meeting; April 2018. New York, NY.
    Ross L, Dekhtyar M, D’Angelo J, Guernsey J, Hawkins R. Validity of the health systems science examination: relationship between examinee performance, instruction, and curriculum design. American Educational Research Association (AERA) Annual Meeting; April 2018. New York, NY.
    Runyon C, Grabovsky I. CutScore: a shiny app for the cut-score operating function. National Council on Measurement in Education (NCME) Annual Meeting; April 2018. New York, NY.
    Runyon C, Haist S, Swygert K, McSweeney D, Carracappa S. The impact of modifying item drug class information on item performance and statistics on a medical licensure exam. American Educational Research Association (AERA) Annual Meeting; April 2018. New York, NY.
    Shin HJ, von Davier M, Kentaro Y. Investigating rater effects in international large-scale assessments. National Council on Measurement in Education (NCME) Annual Meeting; April 2018. New York, NY.
    Subhiyah RG. Definición de estándares para exámenes de licencia (Defining standards for licensing examinations). First International Forum on the Application of Large-Scale Examinations for Professional Enforcement Purposes in the Health Area and its Impact on Public Policy; May 2018. Quito, Ecuador.
    Ulitzsch E, Pohl S, von Davier M. Using nonresponse times to account for omitted items in competence tests. National Council on Measurement in Education (NCME) Annual Meeting; April 2018. New York, NY.
    von Davier M, Cho Y, Pan T. Discontinuation rules in testing: new results on ignorability, local dependency, and bias. National Council on Measurement in Education (NCME) Annual Meeting; April 2018. New York, NY.
    von Davier M, Ping H. Are all data created equal? Comparability of test vs. background data across countries. Workshop on Mathematical Issues in Psychometrics, Fudan University; June 2018. Shanghai, China.
    von Davier M, Rutkowski L, Rutkowski D. Advances in measurement and research methodology of large scale international assessments. American Educational Research Association (AERA) Annual Meeting; April 2018. New York, NY.
    Wolever RQ, Lawson KL, Jordan M, Subhiyah R, Schultz C, Moore M. Updates and outcomes in a pioneering field for healthcare transformation. 2018 International Congress for Integrative Medicine & Health; May 2018. Baltimore, MD.
    Adams DJ, Feinberg RA, Baldwin P. Examining the impact of time limits on classification for the USMLE. Timing Impact on Measurement in Education (TIME) Conference; October 2017. Philadelphia, PA.
    Arena H. Reducing electric and gas consumption, cooling, and heating, in a conventional office building fitted out with a chilled beam system. Honeywell Users Group Conference; June 2017. Phoenix, AZ.
    Baldwin P, Clauser BE, Margolis MJ, Mee J, Winward M. An experimental study of the internal consistency of judgments made in bookmark standard setting. National Council on Measurement in Education Annual Meeting; April 2017. San Antonio, TX.
    Brittan D, Kushner R, McAllister K. Obesity medicine: building credibility and seeking certification in a stigmatized field. American Board of Medical Specialties Conference; September 2017. Chicago, IL.
    Clauser A, Foelber K. An application of multivariate generalizability theory to examine composite score reliability. National Council on Measurement in Education Annual Meeting; April 2017. San Antonio, TX.
    Clauser A, Foelber K. Selecting a weighting scheme for a composite score: theory and application. American Educational Research Association Annual Meeting; April 2017. San Antonio, TX.
    Clauser A, Henderek J. Time management during a communication-centered standardized patient encounter. Timing Impact on Measurement in Education (TIME) Conference; October 2017. Philadelphia, PA.
    Clauser A, Subhiyah R, Martin DF, Guernsey J. A fresh perspective: examination blueprint development. American Board of Medical Specialties Conference; September 2017. Chicago, IL.
    Clauser B. A history of test theory. National Council on Measurement in Education Annual Meeting; April 2017. San Antonio, TX.
    Clauser B, Fromme B, Hicks PJ, Margolis MJ. A novel mobile milestones-based assessment system: development, implementation, and initial outcomes. Accreditation Council for Graduate Medical Education Annual Conference; March 2017. Orlando, FL.
    Clauser B, Margolis M, von Davier M. Timing issues in simulations, games, and other performance assessments. Timing Impact on Measurement in Education (TIME) Conference; October 2017. Philadelphia, PA.
    Feinberg R, Jurich D. Deriving rapid response thresholds for investigating test speededness. National Council on Measurement in Education Annual Meeting; April 2017. San Antonio, TX.
    Feinberg R, Jurich D, Foster L. Examining the impact of accessing references on a maintenance of certification examination. American Educational Research Association Annual Meeting; April 2017. San Antonio, TX.
    Foster L, Feinberg R, Jurich D. Effects on pacing as a result of accessing references on a maintenance of certification examination. Timing Impact on Measurement in Education (TIME) Conference; October 2017. Philadelphia, PA.
    Grabovsky I, Harik P. Impact of time constraints on performance of various item types. Timing Impact on Measurement in Education (TIME) Conference; October 2017. Philadelphia, PA.
    Haist S, Lindsley J, Bracken-Vasquez C, Cowan T, Fulton T. Use of a reference metabolic map in assessment: updates from the NBME metabolic map task force and next steps. International Association of Medical Science Educators Annual Meeting; June 2017. Burlington, VT.
    Haist S, Navarro A, Klapholz H. Joining forces to improve the gap in caring for the military-connected. Medicine X; April 2017. Palo Alto, CA.
    Haist S, Rubright J, Indik J. ACC in-training examination predicts outcomes on the ABIM certification examination. American College of Cardiology Scientific Session and Expo; March 2017. Washington, DC.
    Harik P. Timing and examinee pacing on a test of physician licensure: experimental findings. Timing Impact on Measurement in Education (TIME) Conference; October 2017. Philadelphia, PA.
    Harik P, Clauser BE, Grabovsky I, Bucak D, Jodoin M, Walsh W, Haist S. Assessing effects of time constraints on examinee performance on a licensing examination. National Council on Measurement in Education Annual Meeting; April 2017. San Antonio, TX.
    Hawley J, Gackstetter G, Raymond M, Case H, Mee J. Veterinary profession practice analysis. American Association of Veterinary State Boards Annual Meeting; September 2017. San Antonio, TX.
    He Q, Shin HJ, Lennon ML, Chen H, von Davier M. Producing a reliable collaborative problem-solving scale in PISA 2015. 82nd Annual Meeting of the Psychometric Society; July 2017. Zurich, Switzerland.
    Hicks P. Key Stakeholders Meeting - Pediatrics Milestones Assessment Collaborative Update. Association of Pediatric Program Directors Annual Meeting; April 2017. Anaheim, CA.
    Hicks P. Key Stakeholders Meeting - Pediatrics Milestones Assessment Collaborative Update. Association of Pediatric Program Directors Annual Meeting; September 2017. Arlington, VA.
    Hicks PJ, Margolis MJ, Carraccio C, Clauser BE, Winward M, Schwartz A, PMAC Module 1 Study Group. Pediatrics milestones assessment collaborative: development and implementation of an authentic workplace-based assessment system. Pediatric Academic Societies Annual Meeting; May 2017. San Francisco, CA.
    Hicks PJ, Margolis MJ, Carraccio C, Clauser BE, Winward M, Schwartz A, PMAC Module 1 Study Group. Pediatrics milestones assessment collaborative: development and implementation of an authentic workplace-based assessment system. American Board of Medical Specialties Annual Meeting; September 2017. Chicago, IL.
    Indik JH, Duhigg LM, McDonald F, Lipner RS, Rubright JD, Haist SA, Botkin NF, Kuvin JT. ACC in-training examination predicts outcomes on the ABIM certification examination. American College of Cardiology Annual Scientific Session; March 2017. Washington, DC.
    Jang H, Pak S. Meta-analysis: examining the role of race/ethnicity and gender in career choice. American Educational Research Association Annual Meeting; April 2017. San Antonio, TX.
    Jiang Z, Raymond M. Using multivariate generalizability theory to evaluate subscore utility for different subgroups of examinees. American Educational Research Association Annual Meeting; April 2017. San Antonio, TX.
    Jiang Z, Raymond M. Investigating the use of multivariate generalizability theory for evaluating subscores. National Council on Measurement in Education Annual Meeting; April 2017. San Antonio, TX.
    Khorramdel L, Pokropek A, von Davier M. Measuring response styles in rating data using multi-process IRT models. National Council on Measurement in Education Annual Meeting; April 2017. San Antonio, TX.
    Khorramdel L, von Davier M, Pokropek A. Mixture and multi-process IRTree models for measuring response styles. 82nd Annual Meeting of the Psychometric Society; July 2017. Zurich, Switzerland.
    Khorramdel L, von Davier M, Pokropek A. The relationship between response times and latent response style classes in noncognitive measures of cross cultural surveys. Timing Impact on Measurement in Education (TIME) Conference; October 2017. Philadelphia, PA.
    Khorramdel L, von Davier M, Pokropek A. Differentiating between types of response styles and valid responses using mixture and multi-process IRT models. American Educational Research Association Annual Meeting; April 2017. San Antonio, TX.
    King A, Mazor K, Hoppe R, Kochersberger A, Yan J. Video-based communication assessment. International Conference on Residency Education; October 2017. Quebec City, Canada.
    Leventhal B, Grabovsky I, Wainer H. Test classification errors: who are we passing and who are we failing? American Educational Research Association Annual Meeting; April 2017. San Antonio, TX.
    Liu R, Rubright JD, Grabovsky I. Effect of item and examinee characteristics on score and response time on USMLE Step 3. American Educational Research Association Annual Meeting; April 2017. San Antonio, TX.
    Luciw-Dubas U, Cuddy M, Harik P, Murray C, Artman C. Revisiting measurement construct definitions in high-stakes assessments in the professions: necessary challenges and practical strategies. American Educational Research Association Annual Meeting; April 2017. San Antonio, TX.
    Margolis M, Hicks PJ, Schwartz A, Clauser BE, Carraccio C, Bruegel M. Development of a competency-based assessment system for physicians in training. MedBiquitous Annual Conference; June 2017. Baltimore, MD.
    Margolis MJ, Clauser BE. The impact of training on judge consistency for Angoff standard setting exercises. National Council on Measurement in Education Annual Meeting; April 2017. San Antonio, TX.
    Margolis MJ, Hicks PJ, Schwartz A, Carraccio C, Clauser BE. Development of a competency-based assessment system: a practical guide to procedural and validity considerations. American Educational Research Association Annual Meeting; April 2017. San Antonio, TX.
    Morrison C, Ross L, Baker G, Maranki M, Fletcher B. Implementing a new score scale for the clinical science subject examinations: technical and practical considerations. American Educational Research Association Annual Meeting; April 2017. San Antonio, TX.
    Pak S, Qian H. Applying Rasch testlet models to CAT with varied testlet characteristics. National Council on Measurement in Education Annual Meeting; April 2017. San Antonio, TX.
    Paniagua M. Re-examining exams: NBME effort on wellness (RENEW). ACGME Symposium on Physician Well-Being; November 2017. Chicago, IL.
    Paniagua M. Burnout and wellness: 100 days of rain essay reflection. National Academy of Medicine Global Forum; April 2017. Washington, DC.
    Paniagua M, Arnold B, Buckholz G. American Academy of Hospice and Palliative Medicine (AAHPM) review of USMLE examination series. AAHPM Annual Meeting; February 2017. Phoenix, AZ.
    Paniagua M, Morales A, Ross L, Park Y. Novel application of a diagnostic classification model (DCM) for subscore generation in NBME subject exams: a pilot study. AMEE – An International Association for Medical Education; August 2017. Helsinki, Finland.
    Pohl S, von Davier M. Using response times to deal with missing responses due to time limits. 82nd Annual Meeting of the Psychometric Society; July 2017. Zurich, Switzerland.
    Raymond MR. Competency modeling, job analysis, and test design for credentialing tests. National Council on Measurement in Education Annual Meeting; April 2017. San Antonio, TX.
    Ross L, Wald D, Miller ES, Askew K, Franzen D, Lawson L, Fletcher E. Developing grading guidelines for the NBME Emergency Medicine Advanced Clinical Examination. 2017 Academic Assembly of the Council of Emergency Medicine Residency Directors; April 2017. Fort Lauderdale, FL.
    Ross LP. Measurement issues in scoring, equating, and standard setting. Association of American Medical Colleges Annual Meeting; November 2017. Boston, MA.
    Ross LP, Morrison CA. Construct irrelevant variance: examining differential speededness in Clinical Science Subject Exams. American Educational Research Association Annual Meeting; April 2017. San Antonio, TX.
    Ross LP, Morrison CA. Construct irrelevant variance: examining test speededness for NBME clinical science subject exams. Timing Impact on Measurement in Education (TIME) Conference; October 2017. Philadelphia, PA.
    Ross LP, Morrison CA, Routhenstein A. Construct irrelevant variance: examining differential speededness in clinical science subject exams. Timing Impact on Measurement in Education (TIME) Conference; October 2017. Philadelphia, PA.
    Rubright JD. Perspectives on graduate student internships. Northeastern Educational Research Association Annual Meeting; October 2017. Rocky Hill, CT.
    Schmidt W. Patterns from the future: exploration of advanced technology on user experience. STLUX Conference; September 2017. St. Louis, MO.
    Shin H, von Davier M. Understanding time usage patterns and their associations with proficiencies in international large-scale assessments. Timing Impact on Measurement in Education (TIME) Conference; October 2017. Philadelphia, PA.
    Swygert K, Burke M, Grosso L. Validity in the context of certification examinations: challenges, successes, and more challenges. Association of Test Publishers Annual Meeting; March 2017. Scottsdale, AZ.
    Swygert K, Paniagua M, Liu R, Barone M. Response process validation of video communication items for a large-scale medical licensure exam. American Educational Research Association Annual Meeting; April 2017. San Antonio, TX.
    Ulitzsch E, Pohl S, von Davier M. A dynamic response time model for speeded tests. 82nd Annual Meeting of the Psychometric Society; July 2017. Zurich, Switzerland.
    Ulitzsch E, Pohl S, von Davier M. Using nonresponse times to investigate omitted responses. Timing Impact on Measurement in Education (TIME) Conference; October 2017. Philadelphia, PA.
    von Davier M. Methodological advances in PISA scale linking. National Council on Measurement in Education Annual Meeting; April 2017. San Antonio, TX.
    von Davier M. What is comparability and why is it important? 82nd Annual Meeting of the Psychometric Society; July 2017. Zurich, Switzerland.
    von Davier M. Research around innovative domains in large scale survey assessments. National Taiwan Normal University; September 2017. Taipei, Taiwan.
    von Davier M. Comparability of IRT scales in international assessment. University of Maryland Educational Measurement and Statistics Department Lecture Series; November 2017. College Park, MD.
    von Davier M. PISA linking and comparability in international assessments. South American Development Bank Workshop; March 2017. Washington, DC.
    von Davier M, Cho Y, Pan T. New results on ignorability of missing data due to stopping rules in ability testing. 5th Workshop on Statistical Issues in Psychometrics, Columbia University; November 2017. New York, NY.
    von Davier M, Cho Y, Pan T. New results on bias, ignorability, and violations of local dependency when using discontinue rules in intelligence testing. International Association for Computerized Adaptive Testing Conference; August 2017. Niigata, Japan.
    Baldwin SG, Baldwin PA, Keller LA. The impact of sample size on coverage properties of posterior distributions when using fully Bayesian IRT model. National Council on Measurement in Education Annual Meeting; May 2016. Denver, CO.
    Clauser A, Rick F. What score report features promote accurate remediation? Insights from cognitive interviews. National Council on Measurement in Education Annual Meeting; April 2016. Washington, DC.
    Clauser B, Margolis M, Baldwin S. Automated scoring: an introduction for certification & licensure testing professionals. Association of Test Publishers Annual Meeting; March 2016. Orlando, FL.
    Estrada C, Morris J, Kraemer R, Castiglioni A, Clark A, Haist SA. Make your work count once, twice, thrice, twice: Scholarship opportunities for clinician-educators. Society of General Internal Medicine 39th Annual Meeting; May 2016. Fort Lauderdale, FL.
    Feinberg RA, Jurich DP. Guidelines for interpreting and reporting subscores. Northeastern Educational Research Association; March 2016. Rocky Hill, CT.
    Feinberg RA, Raymond MR. Accuracy of the person-level index for conditional subscore reporting. National Council on Measurement in Education Annual Meeting; April 2016. Washington, DC.
    Furman GE. Interprofessional teamwork: how do we assess these competencies? Taos Institute Relational Practices Conference; November 2016. Cleveland, OH.
    Furman GE, LeBlanc K. Evaluation of clinical skills with standardized patients workshop. 20th Panamerican Conference of Medical Education; June 2016. Cancun, Mexico.
    Gessaroli ME. The validity of augmented subscores when used for different purposes. National Council on Measurement in Education Annual Meeting; April 2016. Washington, DC.
    Katsufrakis P, Chaudry H, Dillon G, Melnick D. The value of benchmarking data from a medical licensing examination. International Association of Medical Regulatory Authorities Annual Meeting; September 2016. Melbourne, Australia.
    LeBlanc K, Dillon G, Johnson D. The validity of medical licensing examinations. International Association of Medical Regulatory Authorities Annual Meeting; September 2016. Melbourne, Australia.
    Leventhal B, Rubright JD. Why do value-added ratios differ under different scoring approaches? National Council on Measurement in Education Annual Meeting; April 2016. Washington, DC.
    Ling Y, Swygert KA, Raymond MR. Investigating score gains for repeat examinees on USMLE Step 2 Clinical Knowledge (CK). Association of American Medical Colleges Annual Meeting; November 2016. Seattle, WA.
    Luciw-Dubas U, Harik P, Murray C, Artman C. Comparison of automated scoring algorithms developed for a patient management assessment by independent groups of practicing physicians. American Educational Research Association Annual Meeting; April 2016. Washington, DC.
    Mills C. Technology as a game changer. Association of Test Publishers Annual Meeting; March 2016. Orlando, FL.
    Mills C, Hambleton R, Zumo B. Disruption in assessment: retrospective and prospective views. International Test Commission Conference; July 2016. Vancouver, Canada.
    Morrison C, Sample L, Ross L, Butler A, Smith C. Relationship between performance on NBME® Clinical Science Mastery Series Self-Assessments and Clinical Science Subject Examinations. American Educational Research Association Annual Meeting; April 2016. Washington, DC.
    Morrison C, Subhiyah R, Gallagher B, Anderson M, Mendoza E. Setting Local Standards on the International Foundations of Medicine® Clinical Science Examination Based on Purpose and Context. Association for Medical Education in Europe; August 2016. Barcelona, Spain.
    Paniagua M. Unfolding case improv: employment of the art of improvisation to engage medical learners. Association for Medical Education in Europe; August 2016. Barcelona, Spain.
    Paniagua M. Expanding assessment of competencies in USMLE. Central Group on Educational Affairs of the Association of American Medical Colleges; April 2016. Ann Arbor, MI.
    Paniagua M. Expanding assessment of competencies in USMLE. Northeastern Group on Educational Affairs of the Association of American Medical Colleges; April 2016. Providence, RI.
    Paniagua M, Werner W. Assessment of interprofessional teamwork competencies: a role in accreditation systems? Institute of Medicine Global Forum; April 2016. Washington, DC.
    Raymond MR, Boulet JR, Touchie C. From course objectives to entrustable professional activities: Developing test blueprints for assessments that matter. Association for Medical Education in Europe; August 2016. Barcelona, Spain.
    Raymond MR, Case H, Billings M, Beaumont C. Competency modeling and job analysis for certification and licensure tests. Association of Test Publishers Annual Meeting; March 2016. Orlando, FL.
    Rubright JD. Both local item dependencies and cut-point location impact examinee classifications. National Council on Measurement in Education Annual Meeting; April 2016. Washington, DC.
    Rubright JD. Current thinking in validity theory. Drug Information Association Annual Meeting; June 2016. Philadelphia, PA.
    Schultz M, Rubright JD, Tessewma A. Issues to consider when examining differential item functioning in essays scored via an automated engine. National Council on Measurement in Education Annual Meeting; April 2016. Washington, DC.
    Subhiyah RG. Understanding the evaluation process: a look into clinician knowledge assessment. Urgent Care Association of America; September 2016. Nashville, TN.
    Swygert KA. Current topics in test reliability and validity. Uniformed Services University of Health Sciences; February 2016. Bethesda, MD.
    Swygert KA. Current topics in test reliability and validity. Grand Rounds at the Uniformed Services University of Health Sciences; February 2016. Bethesda, MD.
    Wainer H. How to extract sunbeams from cucumbers. American Educational Research Association Annual Meeting; April 2016. Washington, DC.
    Wainer H. How to display data badly. American Educational Research Association Annual Meeting; April 2016. Washington, DC.
    Wainer H. Four easy pieces. American Educational Research Association Annual Meeting; April 2016. Washington, DC.
    Wainer H, Feinberg R. How worthless subscores are causing excessively long tests. National Council on Measurement in Education Annual Meeting; April 2016. Washington, DC.
    Wilkerson L, Doyle L, Anderson EL, McKinley DW, Cuddy MM, Albanese MA. Division I Fireside Chat: translating scholarship for the local and global public forum. American Educational Research Association Annual Meeting; April 2016. Washington, DC.
    Aagaard E, Haist SA, Windish DT. TEACH 201 precourse. Society of General Internal Medicine; April 2015. Toronto, Canada.
    Anderson MB. Best practices in knowledge and skills assessment of health professionals. Medical Education Reform Conference; November 2015. Astana, Kazakhstan.
    Anderson MB. Essential skills in medical education assessment course. Association for Medical Education in Europe; September 2015. Glasgow, Scotland.
    Anderson MB. Formative and summative assessment: striking the balance. International Conference on Medical Education; October 2015. Istanbul, Turkey.
    Anderson MB. Fundamentals of educational leadership skills to lead change in healthcare. International Conference on Medical Education; October 2015. Istanbul, Turkey.
    Anderson MB. The professional evaluation of health coach. Smart Healthy City; September 2015. Hangzhou, People's Republic of China.
    Anderson MB, Crow S, Perkowski L, Simpson D. Advancing the scholarship of medical education. Association of American Medical Colleges Annual Meeting; November 2015. Baltimore, MD.
    Anderson MB, Glicken A. Catalyzing change: successful strategies for wicked problems. Association for Medical Education in Europe; September 2015. Glasgow, Scotland.
    Baldwin P. Toward an optimal proficiency estimator. National Council on Measurement in Education Annual Meeting; April 2015. Chicago, IL.
    Baldwin S, Baldwin P, Harik P, Clauser BE. Partitioning continuous item score scales when modeling with IRT. National Council on Measurement in Education Annual Meeting; March 2015. New York, NY.
    Baldwin SG, Harik P, Clauser BE, Winward M, Baldwin P. Automated scoring of patient notes in a medical licensure examination. National Council on Measurement in Education Annual Meeting; April 2015. Chicago, IL.
    Butler A, Raymond MR. Developing a high-quality item pool to support integrative basic science exams. Annual Meeting of the International Association of Medical Science Educators; June 2015. San Diego, CA.
    Chaffinch C, Rebbecchi T. Surviving the next 100 years: what NPD success looks like for a well-established NFP organization. Back End of Innovation; October 2015. San Jose, CA.
    Cuddy MM, Michalec B, Bell AV. “You kind of have to be a little bit of a mother hen.” An examination of gender socialization in medicine. Eastern Sociological Society; February 2015. New York, NY.
    Cuddy MM, Michalec B, Swanson DB. Restricted opportunities: an exploration of electronic health record use by women in medical school. Pacific Sociological Association; April 2015. Long Beach, CA.
    Cuddy MM, Ouyang W, Phung J, Swanson DB. Effects of prior clerkship experience on end-of-clerkship performance in pediatrics and psychiatry. American Educational Research Association Annual Meeting; April 2015. Chicago, IL.
    Feinberg RA, Soto AC. Can item keyword feedback help remediate knowledge gaps? American Educational Research Association Annual Meeting; April 2015. Chicago, IL.
    Feinberg RA, Wainer H. When can we improve subscores by making them shorter? The case against subscores with overlapping items. National Council on Measurement in Education Annual Meeting; April 2015. Chicago, IL.
    Fitz M, Ross LP. NBME shelf exam: what do we (and our students) really know: past, present, and future. Alliance for Academic Internal Medicine; October 2015. Atlanta, GA.
    Gessaroli ME. Want to report subscores? Some things to think about. Innovations in Testing; March 2015. Palm Springs, CA.
    Kochersberger AO, King AM, Mazor KM, Hoppe RB. Using examinees’ observable behaviors versus raters’ subjective evaluations to assess communication skills. International Conference on Communication in Healthcare; October 2015. New Orleans, LA.
    Komatsu R, Padilha R, Anderson MB, Subhiyah R. ADEM Plus: performance assessment of medical students in Brazil. Association for Medical Education in Europe; September 2015. Glasgow, Scotland.
    Lane S, Raymond MR, Haladyna TM, Wise L. Handbook of test development: advances in item development. National Council on Measurement in Education Annual Meeting; April 2015. Chicago, IL.
    Margolis MJ, Clauser BE, Mee J, Clauser JC. Effect of content knowledge on the precision of Angoff judgment. National Council on Measurement in Education Annual Meeting; April 2015. Chicago, IL.
    Morrison CA, Phebus J, Anderson B, Brown-Hunter N. An investigation of pacing on the International Foundations of Medicine (IFOM) Clinical Science Examination for examinees testing in multiple languages. Association for Medical Education in Europe; September 2015. Glasgow, Scotland.
    Morrison CA, Ross LP, Vasinda J, Smith C, Butler AR. Relationship between performance on the National Board of Medical Examiners Comprehensive Clinical Medicine Self-Assessment and United States Medical Licensing Examination Step 3 for US and international medical school graduates. American Educational Research Association Annual Meeting; April 2015. Chicago, IL.
    Raymond MR. Identifying sources of construct-irrelevant variance in performance testing. Association of Test Publishers Annual Meeting; April 2015. Palm Springs, CA.
    Raymond MR, Cavallin N. The development of knowledge requirement scales in the health professions. National Council on Measurement in Education Annual Meeting; April 2015. Chicago, IL.
    Raymond MR, Feinberg RA. Subscores aren’t for everyone: alternate strategies for evaluating subscore utility. National Council on Measurement in Education Annual Meeting; April 2015. Chicago, IL.
    Subhiyah R, Smith C, Dupras D, McDonald E, Jurich D, Gallegher A. Will changing the testing medium affect performance? Association for Medical Education in Europe; September 2015. Glasgow, Scotland.
    Swygert KA. Psychometric principles for medical educators. Vice Dean's Series on Teaching Excellence; June 2015. Johns Hopkins School of Medicine, Baltimore, MD.
    Swygert KA, Raymond MR, Vasinda J, Baker A, Lord J, Feinberg RA. The use of multiple strategies for developing subscores for a large-scale medical licensure exam. Northeastern Educational Research Association Annual Meeting; October 2015. Trumbull, CT.
    Wainer H. Arguing for more experimentation in education: four studies that need doing. American Educational Research Association Annual Meeting; April 2015. Chicago, IL.
    Wainer H. Discussion of David Thissen’s item response theory; serendipity; and bad questions. National Council on Measurement in Education Annual Meeting; April 2015. Chicago, IL.
    Anderson MB. Essential skills in medical education assessment course. Association for Medical Education in Europe; August 2014. Milan, Italy.
    Anderson MB, Glicken A. Catalyzing change: successful strategies for engagement. Association for Medical Education in Europe; August 2014. Milan, Italy.
    Baldwin P. Using likelihood ratio tests to evaluate the uniqueness of subscores. National Council on Measurement in Education Annual Meeting; April 2014. Philadelphia, PA.
    Butler A. Blueprinting an integrated examination. Association of Medical School Microbiology and Immunology Chairs; May 2014. Myrtle Beach, SC.
    Cuddy MM, Ouyang W, Swanson DB. Do prior clerkship experiences affect performance on the end-of-clerkship examination in family medicine? American Educational Research Association Annual Meeting; April 2014. Philadelphia, PA.
    Cuddy MM, Wallach PM, Holtzman KZ, Swanson DB. Use of electronic health records by medical students. American Educational Research Association; April 2014. Philadelphia, PA.
    De Champlain AF, Eva K, Anderson MB, Holmboe E, Southgate L, Bowmer I. Bridging the gap: how medical education and measurement science can better collaborate to meet growing and broadening assessment needs. Ottawa Conference: Transforming Healthcare through Excellence in Assessment and Evaluation; April 2014. Ottawa, Canada.
    Feinberg RA, Raymond MR, Haist SA. If at first you don’t succeed, retest on a different form. American Educational Research Association Annual Meeting; April 2014. Philadelphia, PA.
    Feinberg RA, Rubright JD. How to fake it: considerations for implementing simulations in psychometrics. National Council on Measurement in Education Annual Meeting; April 2014. Philadelphia, PA.
    Feinberg RA, Russell J. All score points are not created equal: anchor set composition for licensure examinations. American Educational Research Association; April 2014. Philadelphia, PA.
    Furman GE. Clinical skills assessment with standardized patients. International Congress of Medical Education; June 2014. Puerto Vallarta, Mexico.
    Grabovsky I, Jodoin MG, Morrison CA, Phebus J. Investigating the effect of motivation on the results of a high stakes medical examination. American Educational Research Association Annual Meeting; April 2014. Philadelphia, PA.
    King AM, Mazor KM, Kochersberger A, Hoppe RB, Furman GE. Targets for improvement: medical educators’ views on trainees’ communication skills. International Conference on Communication in Healthcare; September 2014. Amsterdam, Netherlands.
    LeBlanc KE. Assessing clinical competency with standardized patients: an invitation for ongoing dialogue. International Association of Medical Regulatory Authorities; September 2014. London, England.
    Mee J, Clauser BE, Margolis MJ, Winward M. The impact of correct answer indication in an Angoff exercise. National Council on Measurement in Education Annual Meeting; April 2014. Philadelphia, PA.
    Mee J, Harik P, Cuddy M, McEllhenney S, Dillon G. The use of literature searching, literature analysis and pharmaceutical advertisements in medical education. AAMC Research in Medical Education (RIME) Conference; November 2014. Chicago, IL.
    Melnick DE. High stakes external assessment: how do we keep the baby while throwing out the bath water? Association of American Medical Colleges; November 2014. Chicago, IL.
    Morrison CA. What do you do when the unexpected happens: perspectives from the licensing of physicians. American Educational Research Association Annual Meeting; April 2014. Philadelphia, PA.
    Morrison CA, Phebus J, Anderson MB, Woodward S. The relationship between performance on the International Foundations of Medicine (IFOM) Clinical Science Examination and the United States Medical Licensing Examination (USMLE) Step 2 Clinical Knowledge examination for international and US medical students and graduates. Association for Medical Education in Europe; September 2014. Milan, Italy.
    Raymond MR, Fabrey LJ, Azari ED, Oppler SH. Maintaining test form comparability in an era of constant change. Association of Test Publishers; March 2014. Scottsdale, AZ.
    Raymond MR, Ling L, Grabovsky I. The impact of test item length on test performance for international and second language medical students. Association of American Medical Colleges Annual Meeting; November 2014. Chicago, IL.
    Soto AC. Using student test scores to evaluate teacher performance: validity and reliability evidence. American Educational Research Association Annual Meeting; April 2014. Philadelphia, PA.
    Stoffel H, Raymond MR, Bucak S, Haist SA. Editorial changes and item performance: implications for recalibration and pretesting. American Educational Research Association Annual Meeting; April 2014. Philadelphia, PA.
    Swygert KA, Raymond MR. Practice effects in a performance assessment of physician clinical skills. National Council on Measurement in Education Annual Meeting; April 2014. Philadelphia, PA.
    Swygert KA, Rhynes J, Clyman SG. Exploring the role of assessment in practice and regulation. Federation of State Medical Boards (FSMB) Annual Meeting; April 2014. Denver, CO.
    Wainer H. How to detect cheating badly. Annual Joint Statistical Meetings; August 2014. Boston, MA.
    Wainer H. Uneducated guesses: three examples of how mistreating missing data yields misguided educational policy. Association of Test Publishers Annual Meeting; March 2014. Phoenix, AZ.
    Wenghofer EF, Henzel TR, Miller S, Norcross WA, Boal P. The value of a general medical science examination in a comprehensive competence assessment for practicing physicians. Canadian Conference on Medical Education; April 2014. Ottawa, Canada.
    Aagaard E, Basaviah P, Chheda S, Haist SA, Karani R, Knight C, Rosenblum M, Spencer A, Windish D. TEACH precourse. Society of General Internal Medicine; April 2013. Denver, CO.
    Anderson MB. Essential skills in medical education assessment course. Association for Medical Education in Europe; August 2013. Prague, Czech Republic.
    Anderson MB. Medical education: changes and chances. Association for Medical Education in the Eastern Mediterranean Region; April 2013. Riyadh, Saudi Arabia.
    Anderson MB. Identifying successful strategies for change in medical education. Association for Medical Education in the Eastern Mediterranean Region; April 2013. Riyadh, Saudi Arabia.
    Anderson MB, Centeno A. Evaluation of professionalism: foundations of evaluation in medical education. Panamerican Conference of Medical Education; July 2013. Quito, Ecuador.
    Anderson MB, Hossam H, Costa M. Using "appreciative inquiry" to address change anxieties. Association for Medical Education in Europe; August 2013. Prague, Czech Republic.
    Angelucci K, Holtzman KZ, Hussie K, Swanson DB. Practical considerations in incorporating multimedia into your exams: it’s more involved than just shooting some video. Association of Test Publishers Annual Meeting; February 2013. Fort Lauderdale, FL.
    Baker A, Raymond MR, Haist SA, Boulet J. A guide to the use of national healthcare utilization databases for health professions education. American Educational Research Association Annual Meeting; April 2013. San Francisco, CA.
    Baker A, Raymond MR, Haist SA, Boulet J. Use of national healthcare utilization databases for medical education. Association of American Medical Colleges Annual Meeting; November 2013. Philadelphia, PA.
    Baldwin P. Weighting components of a composite score using naïve expert judgments about subtest importance. National Council on Measurement in Education Annual Meeting; April 2013. San Francisco, CA.
    Boursicot K, Swanson DB, Johnson K. Effects of changing from checklist to rating scale scoring for OSCEs. Association for Medical Education in Europe; August 2013. Prague, Czech Republic.
    Clauser BE. Computer-based case simulations for assessment in medical licensing. Invitational Research Symposium on Science Assessment; September 2013. Washington, DC.
    Cuddy MM, Drake RL, Pawlina W, Swanson DB. A multilevel analysis of gross anatomy instructional characteristics and performance on a national licensing examination in medicine. American Educational Research Association Annual Meeting; April 2013. San Francisco, CA.
    Feinberg RA, Clauser BE. Item relevance in the modified Angoff standard setting exercise. Research in Medical Education; November 2013. Philadelphia, PA.
    Gessaroli ME. Defining and comparing different augmented subscores. National Council on Measurement in Education Annual Meeting; April 2013. San Francisco, CA.
    Grosso L, Decker M, Goodwin AL, Carson J, Lipner R, Azari ED. Challenges faced across the professions by programs for maintenance of credentials. American Educational Research Association Annual Meeting; April 2013. San Francisco, CA.
    Harik P, Clauser BE, Murray C, Artman C, Veneziano A, Margolis MJ. Comparison of automated scores derived from independent groups of content experts. National Council on Measurement in Education Annual Meeting; April 2013. San Francisco, CA.
    Kahraman N, Harik P, Cuddy MM, Clauser BE. Information available in item review patterns when evaluating test speededness: a USMLE Step 2 Clinical Knowledge examination example. National Council on Measurement in Education Annual Meeting; April 2013. San Francisco, CA.
    Levine RE, Carchedi L, Roman B, Townsend M, Cluver J, Frank J, Butler A, Haidet P, Swanson D, Thompson B. A comparison of team cohesiveness and performance in team-based learning: teams versus conventional student groups. Southern Group on Educational Affairs; April 2013. Savannah, GA.
    Luciw-Dubas U, Harik P, Henzel TR, Clauser BE. Using multiple choice questions to assess medical professionalism: can it be done? Association of American Medical Colleges; November 2013. Philadelphia, PA.
    Luciw-Dubas U, Harik P, Henzel TR, Clauser BE. Can multiple choice questions be used to assess medical professionalism: an empirical study. American Educational Research Association Annual Meeting; April 2013. San Francisco, CA.
    Mee J, Fletcher EA, Taylor C, Butler A. Computer simulation use in clerkships. Association of American Medical Colleges Annual Meeting; November 2013. Philadelphia, PA.
    Morrison CA, Ross LP, Sample L, Butler A. Relationship between performance on the NBME Comprehensive Clinical Sciences Self-Assessment and USMLE Step 2 Clinical Knowledge. American Educational Research Association Annual Meeting; April 2013. San Francisco, CA.
    Nungester RJ, Abrams T, Padaki M, Plake B, Wagner D, Browne M. Our changing industry: plan today for the assessment industry of tomorrow. Association of Test Publishers Annual Meeting; February 2013. Fort Lauderdale, FL.
    Raymond MR, Haist SA, Mee J, Dillon G. Charting the diversity of internal medicine: results of a nationwide practice analysis. Society of General Internal Medicine Annual Meeting; May 2013. Denver, CO.
    Spencer A, Haist SA, Rosenblum M. Writing goals and objectives for teaching. Society of General Internal Medicine; April 2013. Denver, CO.
    Swanson DB. Purposes and types of evaluation: design of evaluation systems for undergraduate medical education. Panamerican Conference; July 2013. Quito, Ecuador.
    Swanson DB. Development and use of progress testing. Panamerican Conference on Medical Education; August 2013. Quito, Ecuador.
    Swanson DB. Basic psychometrics. Essential skills in medical education assessment course. Association for Medical Education in Europe Annual Meeting; August 2013. Prague, Czech Republic.
    Swanson DB. A framework for thinking about the reliability and validity of commonly used assessment methods. Association for Medical Education in Europe; August 2013. Prague, Czech Republic.
    Swanson DB. A framework for thinking about the reliability and validity of commonly used assessment methods. Ottawa Conference on the Assessment of Competence in Medicine and the Healthcare Professions; March 2013. Kuala Lumpur, Malaysia.
    Swanson DB, Holtzman KZ, Wilkes M. Integrated cases: promises, pitfalls, and progress in the development of a “new” simulation format to assess hard-to-measure competencies. Association for Medical Education in Europe; August 2013. Prague, Czech Republic.
    Swanson DB, Johnson K, Oliveira D, Hayes K, Boursicot K. Estimating the reproducibility of OSCE scores when exams involve multiple circuits. Association for Medical Education in Europe; August 2013. Prague, Czech Republic.
    Swygert K. Generalizability theory in the clinical skills environment. Association for Medical Education in Europe; August 2013. Prague, Czech Republic.
    Swygert KA, Chavez A, Raymond MR. Score gains for repeat international medical graduates on the United States Medical Licensing Examination Step 2 CS. Association for Medical Education in Europe; August 2013. Prague, Czech Republic.
    Wainer H. Two ideas we need to teach the media (and everyone else). Joint Statistical Meetings; August 2013. Montreal, Canada.
    Williams WT, Spector Y, Swygert KA. Writing a successful Stemmler Fund application. Association of American Medical Colleges Annual Meeting; November 2013. Philadelphia, PA.
    Anderson MB. Curriculum change. Qatar Medical Education Conference; January 2012. Doha, Qatar.
    Anderson MB. Curriculum symposium. Saudi Arabia International Medical Education Conference; April 2012. Riyadh, Saudi Arabia.
    Anderson MB. Essential skills in medical education assessment course. Association for Medical Education in Europe; August 2012. Lyon, France.
    Anderson MB. Leadership in medical education. Qatar Medical Education Conference; January 2012. Doha, Qatar.
    Anderson MB. Leadership in medical education. Saudi Arabia International Medical Education Conference; April 2012. Riyadh, Saudi Arabia.
    Anderson MB, Koeppen B, Katz P, Rebolli A. New schools: developing relationships. Association of American Medical Colleges Annual Meeting; November 2012. San Francisco, CA.
    Angelucci K, Holtzman KZ, Hussie K, Swanson DB. Practical considerations in incorporating multimedia into your exams: it’s more involved than just shooting some video. Association of Test Publishers Annual Meeting; February 2012. Palm Springs, CA.
    Arora N, Street R, Mazor K, King AM. Designing assessment systems to monitor the quality of patient-clinician communication in healthcare organizations: methodological considerations. International Conference on Communication in Healthcare; September 2012. St. Andrews, Scotland.
    Azari E. Protecting the integrity of the USMLE program: candidate communication and agreements. American Board of Medical Specialties Annual Meeting; October 2012. Chicago, IL.
    Azari E. Protecting the integrity of the USMLE program: problem behavior. What happens next? American Board of Medical Specialties; October 2012. Chicago, IL.
    Baldwin SG, Harik P, Swygert KA, Clauser BE, Rebbecchi TA. Assessing the psychometric impact of enhancements to the documentation component of the USMLE Step 2 CS. National Council on Measurement in Education; April 2012. Vancouver, Canada.
    Beck MD, Hsieh MC, Clauser BE, Foley BP, McClarty KL, Davis-Becker SL, Wyse AE, Perie M. Standard setting session. American Educational Research Association Annual Meeting; April 2012. Vancouver, Canada.
    Boursicot K, Fuller R, Norcini JJ, Smee S, Furman GE. Performance assessment. Ottawa Conference on the Assessment of Competence in Medicine and the Healthcare Professions; March 2012. Kuala Lumpur, Malaysia.
    Boursicot K, Smee S, Swanson DB, Patterson J. 10 years of OSCE security issues: is there a problem? Ottawa Conference on the Assessment of Competence in Medicine and the Healthcare Professions; March 2012. Kuala Lumpur, Malaysia.
    Boursicot K, Swanson DB, Patterson J, Smee S. Re-use of OSCE stations over several years: does student performance improve? Association for Medical Education in Europe; August 2012. Lyon, France.
    Boursicot K, Swanson DB, Patterson JD. Do OSCE station pass marks set by the Borderline Groups and Angoff methods remain stable over time? Association for Medical Education in Europe; August 2012. Lyon, France.
    Brown CB, Kahraman N, Sanger JM. An evaluation of standardized patient performance over time. American Educational Research Association; April 2012. Vancouver, Canada.
    Bucak S, Haist SA, Luecht R. Automated test assembly: from self-assessment to high-stakes credentialing. Association of Test Publishers Annual Meeting; February 2012. Palm Springs, CA.
    Chow J, Swanson DB, Johnson K, Morrison C, Low A. Progress test or not: that is the question. Association for Medical Education in Europe; August 2012. Lyon, France.
    Clauser BE, Clauser JC. An examination of the replicability of Angoff standard setting results within a generalizability theory framework. National Council on Measurement in Education Annual Meeting; April 2012. Vancouver, Canada.
    Clyman SG, Rose K. Milestones pilot-update. Association of Pediatric Program Directors; October 2012. Arlington, VA.
    Collares C, Machado J, Vendramini C, Mennin S, van der Vleuten C, Swanson DB. Psychometric impacts of technical item writing flaws in progress testing. Association for Medical Education in Europe; August 2012. Lyon, France.
    Cook R, Baldwin SG, Clauser BE. An NLP-based approach to automated scoring of the USMLE, Step 2 CSE patient note. National Council on Measurement in Education Annual Meeting; April 2012. Vancouver, Canada.
    Cuddy MM, Drake RL, Pawlina W, Swanson DB. Impact of differences in gross anatomy instruction on performance on the United States Medical Licensing Examination (USMLE). Association of American Medical Colleges Annual Meeting; November 2012. San Francisco, CA.
    Dillon GF, Katsufrakis PJ, Melnick DE. Medical regulation and licensure - keys to valid assessment. International Association of Medical Regulatory Authorities; October 2012. Ottawa, Ontario, Canada.
    Estrada C, Castiglioni A, Aagaard E, Haist SA, Kraemer R, Morris J, Willett L. Make your work count twice: scholarship opportunities for clinician-educators. Society of General Internal Medicine; May 2012. Orlando, FL.
    Feinberg RA, Wainer H, Nandakumar R, Jodoin MG. Subscores that add value in a licensure context: a simulation study. National Council on Measurement in Education Annual Meeting; April 2012. Vancouver, Canada.
    Galbraith RM. A new emphasis on formative approaches to assessment. International Meeting on Simulation in Healthcare; January 2012. San Diego, CA.
    Galbraith RM. What is good assessment: written and performance assessment. Saudi Arabia International Medical Education Conference; April 2012. Riyadh, Saudi Arabia.
    Galbraith RM. The winds of change: more measurement, less tests. Ottawa Conference on the Assessment of Competence in Medicine and the Healthcare Professions; March 2012. Kuala Lumpur, Malaysia.
    Gessaroli ME. Scoring and reporting issues with CSEs and workplace assessments. Fundamentals of Assessment in Medical Education (FAME). Ottawa Conference on the Assessment of Competence in Medicine and the Healthcare Professions; March 2012. Kuala Lumpur, Malaysia.
    Gorin JS, Leighton JP, Marcoulides GA, Clauser BE. Thematic orientations and successful publication strategies for selected measurement journals. American Educational Research Association Annual Meeting; April 2012. Vancouver, Canada.
    Grabovsky I, Jodoin MG, Phebus J. Assessing the impact of preparedness and motivation on the results of a high-stakes medical examination. Association for Medical Education in Europe; August 2012. Lyon, France.
    Haist SA. Best practices of assessment: it is best to start at the beginning. Southern Group on Educational Affairs; April 2012. Lexington, KY.
    Holtzman KZ. Writing better multiple-choice questions for your exams: essential skills in medical education assessment. Association for Medical Education in Europe; August 2012. Lyon, France.
    Holtzman KZ, Swanson DB. Developing high-quality single-best-answer MCQs to assess application of knowledge using patient vignettes. Association for Medical Education in Europe; August 2012. Lyon, France.
    Holtzman KZ, Swanson DB. Writing MCQs in challenging areas. Association for Medical Education in Europe; August 2012. Lyon, France.
    Holtzman KZ, Swanson DB, Ouyang W, Dillon GF, Boulet J. International variation in performance by clinical discipline and task on the USMLE Step 2 Clinical Knowledge (CK). Association for Medical Education in Europe; August 2012. Lyon, France.
    Jobe AC. Improving the quality of healthcare: the role of accreditation and assessment. 10th Anniversary Symposium, Tokyo Medical and Dental University; July 2012. Tokyo, Japan.
    Jobe AC. Written assessment. Association for Medical Education in Europe; August 2012. Lyon, France.
    Jodoin MG. Considerations for operating and maintaining a computer-based test. Association of Test Publishers Annual Meeting; February 2012. Palm Springs, CA.
    Jodoin MG. Fundamentals of differential item functioning. Teach Your Children Well: A Conference Honoring Ronald K Hambleton; November 2012. Amherst, MA.
    Kahraman N, Brown C, Sanger JM. A closer look at the psychometric characteristics of tasks in a standardized patient examination: examining patient-rater and clinical content related effects on task parameters. National Council on Measurement in Education Annual Meeting; April 2012. Vancouver, Canada.
    King AM, Boulet J, Gessaroli M, Swanson DB. Fundamentals of assessment in medical education (FAME) course. Ottawa Conference on the Assessment of Competence in Medicine and the Healthcare Professions; March 2012. Kuala Lumpur, Malaysia.
    Levine R, Borges N, Butler A, Swanson DB, Thompson B. Students’ experience taking a high-stakes team test. Generalists in Medical Education; November 2012. San Francisco, CA.
    Levine R, Borges N, Haidet P, Butler A, Swanson DB, Thompson B. Student experiences taking a high-stakes team test. Southern Group on Educational Affairs; April 2012. Lexington, KY.
    Margolis MJ, Clauser BE, Mee J, Winward M, Dillon GF. Agreement between judges’ expectations for USMLE performance and Angoff standard setting results. Association of American Medical Colleges Annual Meeting; November 2012. San Francisco, CA.
    Margolis MJ, Harik P, Murray CT, Veneziano A, Clauser BE. Lessons learned in the process of ten years’ experience scoring computer-delivered constructed-response items. American Educational Research Association Annual Meeting; April 2012. Vancouver, Canada.
    Mee J, Raymond MR, Haist SA, Dillon GF, Katsufrakis PJ, Young A, Johnson D. The clinical activities of recently licensed physicians in rural, suburban and urban settings. Federation of State Medical Boards; April 2012. Fort Worth, TX.
    Melnick DE. Keynote address. Third International Congress on Medical Education; June 2012. Puerto Vallarta, Mexico.
    Melnick DE. Mobility, Erasmus Exchange tuning. Third International Congress on Medical Education; June 2012. Puerto Vallarta, Mexico.
    Melnick DE. Prospects on training physicians of the future. Third International Congress on Medical Education; June 2012. Puerto Vallarta, Mexico.
    Raymond MR, Swygert KA, Kahraman N. Conditional SEMs for examinees who repeat performance assessments. National Council on Measurement in Education Annual Meeting; April 2012. Vancouver, Canada.
    Rebbecchi TA, Boulet J, Kirchoff M. Systems-based practice core competency assessment using simulation-based exercises. International Meeting on Simulation in Healthcare; January 2012. San Diego, CA.
    Rebbecchi TA, Richmond M. Rater training and providing meaningful feedback for the Pediatric Milestones Project. Association of Pediatric Program Directors; October 2012. Arlington, VA.
    Schuwirth L, Swanson DB, Onishi H. Research in assessment: consensus statements of the Ottawa Conference. Ottawa Conference on the Assessment of Competence in Medicine and the Healthcare Professions; March 2012. Kuala Lumpur, Malaysia.
    Short K, Hussie K, Angelucci K. Beyond the item bank: one organization's experience with initiating a system for accessing multimedia for item writing and client reviews. Association of Test Publishers Annual Meeting; February 2012. Palm Springs, CA.
    Swanson DB. Basic psychometrics. Essential skills in medical education assessment course. Association for Medical Education in Europe; August 2012. Lyon, France.
    Swanson DB, Feinberg RA, Swygert KA, Dillon GF, Holtzman KZ, Raymond MR, Haist SA. Differences in USMLE Step 3 performance by setting and specialty. Research in Medical Education; November 2012. San Francisco, CA.
    Swanson DB, Holtzman KZ, Wilkes M, Norman GR. Integrated cases: a “new” method for assessing clinical competence. Association of American Medical Colleges Annual Meeting; November 2012. San Francisco, CA.
    Swanson DB, Ouyang W, Holtzman KZ, Johnson M, Haist SA. Comparison of performance in biostatistics and epidemiology across USMLE Steps 1, 2, and 3. Research in Medical Education; November 2012. San Francisco, CA.
    Swygert K, Jobe A. Measurement of clinical skills: advanced topics. Association for Medical Education in Europe; September 2012. Lyon, France.
    Swygert KA, Baldwin SG, Rebbecchi TA, Scott CL, Furman GE, Sanger JM. The impact of changes to the written communication construct on examinee performance and pacing: findings from the USMLE Step 2 Clinical Skills 2011 pilot examinations. National Council on Measurement in Education Annual Meeting; April 2012. Vancouver, Canada.
    Wainer H. Detection of aberrant responses. Conference on Statistical Detection of Potential Test Fraud; May 2012. Lawrence, KS.
    Wainer H. How to detect cheating badly. Keynote address. Conference on Statistical Detection of Potential Test Fraud; May 2012. Lawrence, KS.
    Wainer H. On the role of replication in the advance of science: the survival of the fittist. Statistical Methods in Diagnostic Assessments; April 2012. New York, NY.
    Wallach PM, Cuddy MM, Swanson DB, Holtzman KZ. Inpatient and outpatient settings as components of core clerkships: results of a survey of US students attending LCME-accredited schools. Association of American Medical Colleges; November 2012. San Francisco, CA.
    Wallach PM, Cuddy MM, Swanson DB, Holtzman KZ. Use of electronic and paper health records during core clerkships: results of a survey of US students attending LCME-accredited schools. Association of American Medical Colleges; November 2012. San Francisco, CA.
    Azari E. NBME—Protecting the Public—Assessing Health Professionals Worldwide. Prometric International Summit; May 2011. Newmarket-on-Fergus, Ireland.
    Babcock B, Raymond MR, Yoes M. Big things come in small packages: psychometrics for low-volume testing programs. Association of Test Publishers Annual Meeting; February 2011. Phoenix, AZ.
    Baldwin P. On the mean-sigma estimators and bias. National Council on Measurement in Education Annual Meeting; April 2011. New Orleans, LA.
    Baldwin P, Baldwin S. F-type testlets and the effects of feedback and case-specificity. Association of American Medical Colleges Annual Meeting; November 2011. Denver, CO.
    Butler A, Cannarozzi M, Scoles PV, Wilson-Delfosse A. Developing examinations for integrated courses. International Association of Medical Science Educators; June 2011. St. Petersburg, FL.
    Canavan C, Schaff P, Trial J. Giving feedback to trainees on professional behavior. Accreditation Council for Graduate Medical Education; March 2011. Nashville, TN.
    Cuddy MM, Swygert KA, Jobe A, Swanson DB. A multilevel analysis of examinee gender, standardized patient gender, and United States Medical Licensing Examination Step 2 Clinical Skills Communication and Interpersonal Skills scores. Association of American Medical Colleges Annual Meeting; November 2011. Denver, CO.
    Feinberg R, Swygert KA, Haist SA, Dillon GF, Murray CT. Effect of postgraduate training on the USMLE Step 3 examination computer-based case simulation (CCS) component. American Educational Research Association Annual Meeting; April 2011. New Orleans, LA.
    Furman GE, Jobe AC, Rebbecchi TA. High-stakes assessment of clinical skills—improving goals and methods. Association of American Medical Colleges Annual Meeting; November 2011. Denver, CO.
    Galbraith RM. Advancing competency-based learning and assessment: the eFolio connector. Association of American Medical Colleges Annual Meeting; March 2011. Washington, DC.
    Holtzman KZ, Marcus R, Herkowitz H, O’Keefe R, Swanson DB, D’Angelo J, Morrison C, Hurwitz S. Introduction of multimedia on the ABOS Part I certifying examination: a controlled trial of the impact on item characteristics. American Academy of Orthopaedic Surgeons; February 2011. San Diego, CA.
    Kahraman N, Harik P, Cuddy MM, Clauser BE. Modeling response times and examinees’ pacing behavior using latent growth curve models. National Council on Measurement in Education Annual Meeting; April 2011. New Orleans, LA.
    King AM. Standardized patients: the first and second half centuries. Association of Standardized Patient Educators; June 2011. Nashville, TN.
    Levine RE, Carchedi L, Roman B, Townsend M, Butler A, Haidet P, Swanson DB, Thompson B. Taking a team shelf exam: do “better” teams do better? Association of Directors of Medical Student Education in Psychiatry; June 2011. Savannah, GA.
    Nungester RJ. Lessons learned from introducing a CAT program. International Association for Computerized Adaptive Testing; October 2011. Pacific Grove, CA.
    Nungester RJ, Clauser BE, Harik P. Automated scoring of simulations in medical licensure. Council on Licensure, Enforcement and Regulation Conference; September 2011. Pittsburgh, PA.
    Raymond MR. Commentary on the draft revised Standards for Educational and Psychological Testing. National Council on Measurement in Education Annual Meeting; April 2011. New Orleans, LA.
    Raymond MR, Grabovsky I. Conditional standard errors for performance ratings from least-squares regression. National Council on Measurement in Education Annual Meeting; April 2011. New Orleans, LA.
    Raymond MR, Kahraman N, Swanson DB, Balog KP. Score gains on performance tests for repeat examinees: an evaluation of construct and criterion-related evidence. American Educational Research Association Annual Meeting; April 2011. New Orleans, LA.
    Ross LP, Butler A. What new and innovative services are being developed for assessing clinical reasoning? Society of Teachers of Family Medicine; January 2011. Houston, TX.
    Swanson DB. Assessing understanding of patient safety on cognitive exams: examples from USMLE. American Board of Medical Specialties Annual Meeting; October 2011. Chicago, IL.
    Swanson DB. Basic psychometrics. Essential skills in medical education assessment course. Association for Medical Education in Europe; August 2011. Vienna, Austria.
    Swanson DB. Research in assessment: lessons learned in development of clinical simulations. Conference on Research in Medical Education: State of the Art; October 2011. Mexico City, Mexico.
    Swanson DB, Holtzman KZ, Bucak SD, Sawhill AJ, Morrison C, Hurwitz S, DeRosa DP, Marsh JL. Utility of AAOS OITE scores in predicting ABOS Part I outcomes. American Academy of Orthopaedic Surgeons; February 2011. San Diego, CA.
    Swanson DB, Holtzman KZ, Grabovsky I, Phebus J, Angelucci K. Country-to-country variation in item difficulty on the 2010 English-language version of the International Foundations of Medicine (IFOM) clinical science examination. Association for Medical Education in Europe; August 2011. Vienna, Austria.
    Swygert K. Measurement of clinical skills: advanced topics. Association for Medical Education in Europe; September 2011. Vienna, Austria.
    Swygert KA, Muller ES, Van Zanten M, Haist SA, Jobe AC. Gender differences in performance on the Step 2 Clinical Skills data gathering (DG) and patient note (PN) components. American Educational Research Association Annual Meeting; April 2011. New Orleans, LA.
    Wainer H. Uneducated guesses: three examples of how a little data can reduce a lot of errors in educational policy. Joint Statistical Meetings; August 2011. Miami, FL.
    Haist SA, Swanson DB, Bucak SD, Sawhill AJ, Holtzman KZ, Cuddy MM, Dillon GF. Do medical students taking USMLE Step 2 CK later in the academic year do better or worse—it depends on how you look at it. Ottawa Conference on the Assessment of Competence in Medicine and the Healthcare Professions; May 2010. Miami, FL.
    Azari E, Carson JD, Dunham M, Farmer C, Loew R. Test accommodations under the ADA—dazed and confused in 2010? Council on Licensure, Enforcement and Regulation; September 2010. Nashville, TN.
    Baldwin PA. Developing a common metric in item response theory when parameter posterior distributions are known. National Council on Measurement in Education Annual Meeting; May 2010. Denver, CO.
    Baldwin PA, Keller R, Cook R, Baldwin SG. Developing common metric in IRT when no link exists. National Council on Measurement in Education Annual Meeting; May 2010. Denver, CO.
    Butler A. Overview of NBME assessments and evaluation tools for international medical schools and students. Innovations in Medical Education, Medical Simulation and E-Learning; October 2010. Poznan, Poland.
    Clauser BE, Harik P, Luciw-Dubas UA, Murray C, O’Donovan S, Feinberg R. Quality control for the automated scoring of computer-based case simulations in the United States Medical Licensing Examination. National Council on Measurement in Education; April 2010. Denver, CO.
    Clyman SG. Connecting education and performance improvement to meaningful use. Office of the National Coordinator for Health Information Technology; October 2010. Washington, DC.
    Cromley JG, Wills TW, Stephens CR, Dumas D, Herring MH, Luciw-Dubas UA, Snyder-Hogan LE, Burton D, Mendelsohn T. A content analysis of images in biology and geoscience textbooks. National Association for Research in Science Teaching; March 2010. Philadelphia, PA.
    Errichetti A, Furman GE, Santen S, Shatzer J. Sailing the unknown seas: perils and pearls of standardized patient assessment. Ottawa Conference on the Assessment of Competence in Medicine and the Healthcare Professions; May 2010. Miami, FL.
    Furman GE, Lehrman A. Leadership skills for training, management and beyond. Association of Standardized Patient Educators; June 2010. Baltimore, MD.
    Galbraith RM. Measurement and assessment of healthcare professionals. National Institute for Quality Improvement and Education; February 2010. Philadelphia, PA.
    Haist SA, Swanson DB, Bucak SD, Sawhill AJ, Holtzman KZ, Cuddy MM, Dillon GF. Do students taking Step 2 CK later do better or worse? It depends on how you look at it. Ottawa Conference on the Assessment of Competence in Medicine and the Healthcare Professions; May 2010. Miami, FL.
    Haist SA, Swanson DB, Holtzman KZ, Grande J. The scientific foundations of medicine: going beyond the first two years of medical school. International Association of Medical Science Educators; July 2010. New Orleans, LA.
    Harik P, Clauser BE, Baldwin PA. Comparison of alternative scoring methods for a computerized performance assessment of clinical judgment. National Council on Measurement in Education Annual Meeting; May 2010. Denver, CO.
    Holtzman KZ, Sousa N, Costa M, Swanson DB, Grabovsky I, Phebus J, Angelucci K, Pannizzo L, Jodoin M, Scoles PV. Portuguese/English translation effects in the International Foundations of Medicine. Association for Medical Education in Europe; September 2010. Glasgow, Scotland.
    Holtzman KZ, Swanson DB. Developing high-quality multiple-choice questions to assess application of knowledge using patient vignettes. Association for Medical Education in Europe; September 2010. Glasgow, Scotland.
    Holtzman KZ, Swanson DB. Developing high-quality multiple-choice tests to assess application of knowledge using patient vignettes. Ottawa Conference on the Assessment of Competence in Medicine and the Healthcare Professions; May 2010. Miami, FL.
    Holtzman KZ, Swanson DB. Writing case-based MCQs for basic science courses. International Association of Medical Science Educators; July 2010. New Orleans, LA.
    Chow JWM, Swanson DB, Holtzman KZ, Nelson MV, Langer MM. Progress on progress testing at St. George’s University of London: initial experience of the UK Multi-School Progress Testing Project. Association for Medical Education in Europe; September 2010. Glasgow, Scotland.
    Kahraman N. Within-item multidimensionality in unidimensional tests. National Council on Measurement in Education Annual Meeting; May 2010. Denver, CO.
    Kahraman N, Cuddy MM, Harik P, Clauser BE. Growth curve modeling for response times. Psychometric Society; July 2010. Athens, GA.
    Katsufrakis PJ. Designing a model physician reentry system. AMA Physician Reentry Conference; May 2010. Chicago, IL.
    Katsufrakis PJ. Licensing and maintenance of skills: where do quality and safety fit in? AMSA Patient Safety and Quality Leadership Institute; January 2010. Philadelphia, PA.
    Katsufrakis PJ. Validity concerns and professionalism assessment. Innovations in Medical Education; March 2010. Pasadena, CA.
    King AM, Boulet JR. Fundamentals of Assessment in Medical Education (FAME). Ottawa Conference on the Assessment of Competence in Medicine and the Healthcare Professions; May 2010. Miami, FL.
    King AM, Boulet JR. Fundamentals of Assessment in Medical Education (FAME). Association of American Medical Colleges Annual Meeting; November 2010. Washington, DC.
    King AM, Margolis MJ, Pohl H, Wagner DP. Developing test specifications for valid assessments. Ottawa Conference on the Assessment of Competence in Medicine and the Healthcare Professions; May 2010. Miami, FL.
    Mathes B, Holtzman KZ, Swanson DB, Ling Y, Johnson M. Use of dermatologic images on Step 1 and Step 2 Clinical Knowledge (CK): a controlled trial of the impact on item characteristics. Association of American Medical Colleges Annual Meeting; November 2010. Washington, DC.
    Patterson JA, Revest PA, Swanson DB, Holtzman KZ, Nelson MV, Langer MM. Progress on progress testing: Barts and the London’s initial experience of the UK Multi-School Progress Testing Project. Association for Medical Education in Europe; September 2010. Glasgow, Scotland.
    Raymond MR, Harik P, Clauser BE. The impact of statistically adjusting for rater effects on conditional standard errors for performance ratings. National Council on Measurement in Education Annual Meeting; May 2010. Denver, CO.
    Raymond MR, Luciw-Dubas UA. The second time around: an observation on retest effects for oral examinations. American Educational Research Association Annual Meeting; April 2010. Denver, CO.
    Richmond M. Multi-source feedback: how can we use it to improve? E*Value User Conference; October 2010. Minneapolis, MN.
    Rudd R, Kirsch I, Roter D, Pisano S, King AM. Health literacy: measuring the other side of the coin. Health Literacy Annual Research Conference; October 2010. Bethesda, MD.
    Swanson DB. Essential Skills in Medical Education Assessment Course. Association for Medical Education in Europe; September 2010. Glasgow, Scotland.
    Swanson DB. A framework for thinking about the reliability and validity of commonly used assessment methods. Fundamentals of Assessment in Medical Education (FAME). Ottawa Conference on the Assessment of Competence in Medicine and the Healthcare Professions; May 2010. Miami, FL.
    Swanson DB. A framework for thinking about the reliability and validity of commonly used assessment methods. Fundamentals of Assessment in Medical Education (FAME). Association of American Medical Colleges; November 2010. Washington, DC.
    Swanson DB. Is collaboration the future for progress testing? Ottawa Conference on the Assessment of Competence in Medicine and the Healthcare Professions; May 2010. Miami, FL.
    Swanson DB. Use of generalizability theory in designing and analyzing performance-based tests. Ottawa Conference on the Assessment of Competence in Medicine and the Healthcare Professions; May 2010. Miami, FL.
    Swanson DB, Holtzman KZ. Assessing the hard stuff with innovative items. Association of Test Publishers Annual Meeting; February 2010. Orlando, FL.
    Swygert KA, Balog K, Furman GE, Van Zanten M. Performance of examinees on the USMLE® Step 2 Clinical Skills (CS) Communication and Interpersonal Skills (CIS) subscales. American Educational Research Association Annual Meeting; April 2010. Denver, CO.
    Swygert KA, Jobe AC. Measurement of clinical skills: advanced topics. Association for Medical Education in Europe; September 2010. Glasgow, Scotland.
    Swygert KA, Muller E, Cuddy MM, Swanson DB, Scott C, Van Zanten M. The relationships between examinee use of time and global ratings on the USMLE Step 2 CS Examination. Association for Medical Education in Europe; August 2010. Edinburgh, Scotland.
    Swygert KA, Muller ES, Scott CL, Swanson DB. The relationship between USMLE® Step 2 Clinical Skills (CS) patient note ratings and time spent on the note—do examinees who spend more time write better notes? Research in Medical Education; November 2010. Washington, DC.
    Swygert KA, Muller ES, Swanson DB, Scott CL, Van Zanten M. The relationships between examinee use of time and global ratings on the USMLE® Step 2 CS examination. Association for Medical Education in Europe; September 2010. Glasgow, Scotland.
    Wainer H. On the crucial role of empathy in the design of score reports. ETS Invitational Conference on Score Reporting; November 2010. Princeton, NJ.
    Azari ED, Brinckerhoff L, Carson JD, Farmer C, Mellichamp F. Test accommodations under the 2008 amendments to the ADA. Council on Licensure, Enforcement and Regulation; September 2009. Denver, CO.
    Baldwin SG, Rebbecchi TA. Assessing written communication skills: effect of training and scoring tools on reliability and validity. Research in Medical Education; November 2009. Boston, MA.
    Butler A. Overview of the flexible blueprinting system and customized assessments. Group for Research in Pathology Education (GRIPE); January 2009. Columbia, SC.
    Butler A, Swanson DB, Holtzman KZ, Langer MM, Nelson MV. Blending progress tests with clerkship finals: web-based test batteries. Association for Medical Education in Europe; August 2009. Malaga, Spain.
    Clauser BE, Margolis MJ, Mee JM. An experimental study of the use of performance data by judges in an Angoff standard-setting exercise. Psychometric Society; July 2009. Cambridge, England.
    Cromley JG, Snyder LE, Luciw-Dubas U, Dai T. A large-scale test of the DIME model of reading comprehension with domain-specific text. 2009 Annual Meeting of the Society for the Scientific Study of Reading; June 2009. Boston, MA.
    De Champlain AF. Scoring and reporting issues with Clinical Science Examinations and workplace assessments. Association of American Medical Colleges Annual Meeting; November 2009. Boston, MA.
    Dillon GF, Swanson DB. Catching the wave: prevention education and the USMLE. Association for Prevention Teaching and Research; February 2009. Los Angeles, CA.
    Farmer C, Hosterman J, Walter LC. Case studies: interagency perspectives from the front lines pre- and post-ADA. Testing Agencies Disability Forum; November 2009. Princeton, NJ.
    Furman GE, Scott CL. Transitioning from trainer to administrator/director: “empowering your inner leader”. Association of Standardized Patient Educators; June 2009. Las Vegas, NV.
    Furman GE, Smee S, Scott CL, Wojcik J. Improving your role playing and feedback skills: lessons from the trenches of high-stakes licensure examinations. Association of Standardized Patient Educators; June 2009. Las Vegas, NV.
    King AM, Boulet JR. Fundamentals of Assessment in Medical Education (FAME). Association of American Medical Colleges Annual Meeting; November 2009. Boston, MA.
    King AM, Boulet JR. Fundamentals of Assessment in Medical Education (FAME). Association for Medical Education in Europe Annual Meeting; September 2009. Malaga, Spain.
    King AM, Boulet JR. Fundamentals of Assessment in Medical Education (FAME). Accreditation Council for Continuing Medical Education; September 2009. Chicago, IL.
    King AM, Boulet JR, Errichetti TA, Pohl H. Developing case material for performance-based assessments. Association of Standardized Patient Educators; June 2009. Las Vegas, NV.
    King AM, Hoppe R. Team communication skills: novice to expert learners. International Conference on Communication in Healthcare; October 2009. Miami Beach, FL.
    King AM, Margolis MJ, Pohl H, Wagner D. Test construction. Association for Medical Education in Europe; August 2009. Malaga, Spain.
    Langer MM. A reexamination of Lord’s Wald test for differential item functioning detection using item response theory and modern error estimation. Psychometric Society; July 2009. Cambridge, England.
    Langer MM. Test equating: designs, methods, and applications to progress testing. Association for Medical Education in Europe; August 2009. Malaga, Spain.
    Langer MM, Swygert KA, Boulet JR. Using survival data analysis to predict success on the USMLE Step 2 Clinical Skills (CS) examination. Research in Medical Education; November 2009. Boston, MA.
    Margolis MJ. Skills assessment: advances and challenges. American Educational Research Association Annual Meeting; April 2009. San Diego, CA.
    Mee JM, Mazor KM, Holtman MC. Who knows what I do? Medical students' and residents' views of training. Research in Medical Education; November 2009. Boston, MA.
    Nungester RJ, Clauser BE, Mills C, Fabrey L, Case SM, Brennan R, Plake B, Sireci S. Responding to errors in high-stakes assessments. American Educational Research Association Annual Meeting; April 2009. San Diego, CA.
    Nungester RJ, McNulty M, Buchman K, Riley R, Anderson L. Advanced test construction and delivery techniques: enhancing security and integrity. Virtual patient simulations. National Organization for Competency Assurance; November 2009. Phoenix, AZ.
    Raymond MR, Clauser BE, Furman GE. Reducing measurement error in ratings of communication and interpersonal skills through statistical adjustment. Association for Medical Education in Europe; August 2009. Malaga, Spain.
    Rebbecchi TA, Boulet JR. Assessing the written communication skills of medical school graduates. Association for Medical Education in Europe; August 2009. Malaga, Spain.
    Scoles PV. Exam format. American Association of Osteopathic Examiners; January 2009. New Orleans, LA.
    Scott CL, Furman GE, Jobe AC. Teaching or assessment? Adapting standardized patient cases for either use. Association for Medical Education in Europe; August 2009. Malaga, Spain.
    Swanson DB. A framework for thinking about the reliability and validity of commonly used assessment methods. Association for Medical Education in Europe; August 2009. Malaga, Spain.
    Swanson DB, Holtzman KZ. Writing case-based MCQs for basic science courses. International Association of Medical Science Educators; June 2009. Leiden, Netherlands.
    Swanson DB, Holtzman KZ, Butler A, Langer MM, Nelson MV. Collaboration across the pond: the multi-school progress testing project. Association for Medical Education in Europe; August 2009. Malaga, Spain.
    Swanson DB, Holtzman KZ, Johnson M, Ling Y, Ouyang W, Bucak SD, Haist SA. Retention of basic science information by senior medical students. International Association of Medical Science Educators; June 2009. Leiden, Netherlands.
    Swygert KA. Measurement of clinical skills: rules, tips, guidelines, and pitfalls. Association for Medical Education in Europe; August 2009. Malaga, Spain.
    Swygert KA. The usefulness of Hierarchical Linear Modeling (HLM) in the assessment of how examinees use time on innovative high-stakes standardized examinations. University of Maryland, Department of Human Development and Quantitative Methodology; March 2009. College Park, MD.
    Swygert KA, Balog K, Furman GE, van Zanten M. Performance of examinees on the USMLE Step 2 Clinical Skills (CS) Communication and Interpersonal Skills (CIS) subscales. Research in Medical Education; November 2009. Boston, MA.
    Swygert KA, Balog K, Jobe A. The impact of repeat information on examinee performance for a large-scale standardized patient examination. National Council on Measurement in Education; April 2009. San Diego, CA.
    Swygert KA, Swanson DB, Scott CL, Muller ES. An assessment of encounter timing in a high-stakes standardized patient-based examination. American Educational Research Association Annual Meeting; April 2009. San Diego, CA.
    van Zanten M, Boulet JR, Swygert KA. Evaluating the spoken English proficiency of international medical graduates in a performance-based clinical skills examination. American Educational Research Association Annual Meeting; April 2009. San Diego, CA.
    Wainer H. Pictures at an exhibition: the role of visual displays in an evidence-based science. The Samuel J. Messick Career Achievement Award Invited Lecture. American Psychological Association; August 2009. Toronto, Canada.
    Blackmore D, Danoff D, Jobe AC, Lee R. Reshaping standardized high-stakes national examinations at the request of the special needs examinee: current practices. Ozzawa 2008 – Assessment for Life: 13th Ottawa International Conference on Clinical Competence; March 2008. Melbourne, Australia.
    Butler A, Holtzman KZ, Ross LP, Swanson DB. NBME assessments for medical schools. Association of American Medical Colleges Annual Meeting; November 2008. San Antonio, TX.
    Clauser BE, Harik P, Margolis MJ, Mee JM, Rebbecchi T, Swygert KA. The generalizability of documentation scores from the USMLE Step 2 Clinical Skills Examination. Ozzawa 2008 – Assessment for Life: 13th Ottawa International Conference on Clinical Competence; March 2008. Melbourne, Australia.
    Cromley JG, Snyder LE, Luciw UA. Testing the fit of the DIME model of reading comprehension with biology text. American Educational Research Association Annual Meeting; March 2008. New York, NY.
    Cuddy MM, De Champlain AF. The impact of standardized patient and examinee ethnicity on high-stakes clinical skills examination performance. Association for Medical Education in Europe; September 2008. Prague, Czech Republic.
    Cuddy MM, De Champlain AF. Do standardized patient and examinee ethnicity impact performance on the USMLE Step 2 Clinical Skills Examination? Implications for large-scale, high-stakes performance-based assessments of clinical skills. Association for Medical Education in Europe; August 2008. Prague, Czech Republic.
    Cuddy MM, Swanson DB, Clauser BE. A multilevel analysis of examinee gender and USMLE Step 1 performance. Association of American Medical Colleges Annual Meeting; 2008. San Antonio, TX.
    De Champlain AF, Gessaroli ME. Fitting item response theory (IRT) models and interpreting output with your data set (Part 2). Association for Medical Education in Europe; August 2008. Prague, Czech Republic.
    De Champlain AF, Nungester RJ. An overview of common item response theory (IRT) models and related concepts (Part 1). Association for Medical Education in Europe; August 2008. Prague, Czech Republic.
    De Champlain AF, Nungester RJ. An introduction to item response theory: overview of common models and applications to medical education assessment issues. Ozzawa 2008 – Assessment for Life: 13th Ottawa International Conference on Clinical Competence; March 2008. Melbourne, Australia.
    Dillon GF, Melnick DE. The value and price of performance assessment. International Conference on Medical Regulation; October 2008. Cape Town, South Africa.
    Furman GE. The use of standardized patients in a national licensure examination. International Conference on Medical Licensing and OSCE; February 2008. Kaohsiung, Taiwan.
    Furman GE. Case development. International Conference on Medical Licensing and OSCE; February 2008. Kaohsiung, Taiwan.
    Furman GE, Brown C, Scott CL, Swygert KA, Jobe AC. Assessment for life: what traits contribute to making a good Standardized Patient? Results of a personality inventory for SPs working in a high-stakes examination. Ozzawa 2008 – Assessment for Life: 13th Ottawa International Conference on Clinical Competence; March 2008. Melbourne, Australia.
    Galbraith RM. Assessment at work. Northeastern Group on Educational Affairs; April 2008. Burlington, VT.
    Galbraith RM. Approaches to program evaluation. Association for Medical Education in Europe; August 2008. Prague, Czech Republic.
    Gessaroli ME. Implications of finding multidimensional information in unidimensional data. National Council on Measurement in Education Annual Meeting; March 2008. New York, NY.
    Gessaroli ME, De Champlain AF. The anatomy of a test score: partitioning information using factor analysis. Ozzawa 2008 – Assessment for Life: 13th Ottawa International Conference on Clinical Competence; March 2008. Melbourne, Australia.
    Grabovsky I, Subhiyah R, Swygert KA, Balog K. Comparing the effectiveness of two models for equating a large-scale standardized performance assessment. American Educational Research Association Annual Meeting; March 2008. New York, NY.
    Hawkins RE. Assessment for reentry to clinical practice. Physician Reentry into the Workforce Conference; September 2008. Elk Grove, IL.
    Hawkins RE, Ciccone AL, Mee JM, Tallia AF, Hamm B. Doctors’ perceptions regarding their colleagues’ clinical competence. Ozzawa 2008 – Assessment for Life: 13th Ottawa International Conference on Clinical Competence; March 2008. Melbourne, Australia.
    Hawkins RE, Ciccone AL, Tallia AF, Hamm B. Doctors’ prescribing practices and communication with pharmacists: opportunities for professional development. Ozzawa 2008 – Assessment for Life: 13th Ottawa International Conference on Clinical Competence; March 2008. Melbourne, Australia.
    Holtman MC, Anderson MB. Collaborative interprofessional professionalism: a work in progress. Ozzawa 2008 – Assessment for Life: 13th Ottawa International Conference on Clinical Competence; March 2008. Melbourne, Australia.
    Holtman MC, Farrell ML, Canavan CT, Gottesfeld N, Mee JM, Mazor KM, Margolis MJ, Clauser BE, Hawkins RE. Developing a research agenda for multisource feedback to assess professional behaviors. Ozzawa 2008 – Assessment for Life: 13th Ottawa International Conference on Clinical Competence; March 2008. Melbourne, Australia.
    Holtzman KZ, Swanson DB. Developing high-quality multiple-choice questions to assess application of knowledge using patient vignettes. Association for Medical Education in Europe; August 2008. Prague, Czech Republic.
    Holtzman KZ, Swanson DB. Developing high-quality single-best-answer questions using patient vignettes to assess application of basic science knowledge. Ozzawa 2008 – Assessment for Life: 13th Ottawa International Conference on Clinical Competence; March 2008. Melbourne, Australia.
    Holtzman KZ, Swanson DB, Albee K. Option selection by content experts vs statistics: how much does it matter? Ozzawa 2008 – Assessment for Life: 13th Ottawa International Conference on Clinical Competence; March 2008. Melbourne, Australia.
    Jobe AC. Case development. International Conference on Medical Licensing and OSCE; February 2008. Kaohsiung, Taiwan.
    Jobe AC. Student assessment: knowledge, skills, behaviors and tools to assess — implications for curriculum. New and Developing Medical Schools Invitational Conference; July 2008. St. Louis, MO.
    Kahraman N. An examination of the impact of weighting on components of a high-stakes performance examination. National Council on Measurement in Education Annual Meeting; March 2008. New York, NY.
    Kahraman N, Clauser BE, Margolis MJ. An examination of the impact of subtask weighting on components of a high-stakes performance examination. National Council on Measurement in Education Annual Meeting; March 2008. New York, NY.
    Kahraman N, De Champlain AF, Gessaroli ME. Assessing the underlying structure of communication and data gathering skills on a sample of USMLE Step 2 Clinical Skills cases using confirmatory factor analysis. Psychometric Society; June 2008. Durham, NH.
    Katsufrakis PJ. The assessment of professional behaviors program: a developmental assessment program of the National Board of Medical Examiners. AAMC Northeast Group on Student Affairs; April 2008. Baltimore, MD.
    Katsufrakis PJ. Teaching professionalism using effective, valid, and reliable educational interventions. Society of Teachers of Family Medicine; May 2008. Baltimore, MD.
    Katsufrakis PJ. Improving raters' assessments of learners: using the NBME's Assessment of Professional Behaviors (APB) survey instrument and APB rater training workshop as a model to explore issues in rater training. Association of American Medical Colleges Annual Meeting; November 2008. San Antonio, TX.
    Katsufrakis PJ. Assessing professional behavior in residency. Association for the Behavioral Sciences and Medical Education; October 2008. San Diego, CA.
    Katsufrakis PJ. The assessment of professional behaviors program: a developmental assessment program of the National Board of Medical Examiners. AAMC Central and Southern Groups on Student Affairs; April 2008. Destin, FL.
    Katsufrakis PJ, Hawkins RE. Rater training to support multisource feedback in graduate medical education. Association for Medical Education in Europe; August 2008. Prague, Czech Republic.
    Katsufrakis PJ, Hawkins RE, Holmboe ES, Grabovsky I. Improving raters' assessment of learners. Association of American Medical Colleges; November 2008. San Antonio, TX.
    King AM, Boulet JR. Fundamentals of Assessment in Medical Education (FAME). Association of American Medical Colleges Annual Meeting; November 2008. San Antonio, TX.
    King AM, Boulet JR. Fundamentals of Assessment in Medical Education (FAME). Association for Medical Education in Europe; August 2008. Prague, Czech Republic.
    King AM, Hoppe R. Developing standardized patient tests. Ozzawa 2008 – Assessment for Life: 13th Ottawa International Conference on Clinical Competence; March 2008. Melbourne, Australia.
    King AM, Pohl H, Gammon W. Developing material for performance based assessments. Association of Standardized Patient Educators; June 2008. San Antonio, TX.
    Luciw UA, Cromley JG, Snyder LE. Reading comprehension and component processes in biology: comparisons across monolingual and bilingual English speakers. American Educational Research Association Annual Meeting; March 2008. New York, NY.
    Margolis MJ, Clauser BE. Validity theory for assessment in medical education. Ozzawa 2008 – Assessment for Life: 13th Ottawa International Conference on Clinical Competence; March 2008. Melbourne, Australia.
    Margolis MJ, Clauser BE, Harik P, Murray C. An investigation of the impact of patient characteristics on USMLE Step 3 Examination performance. Ozzawa 2008 – Assessment for Life: 13th Ottawa International Conference on Clinical Competence; March 2008. Melbourne, Australia.
    Mee JM, Baldwin S, Margolis MJ, Clauser BE. An experimental analysis of the use of examinee performance data in an Angoff standard-setting exercise. National Council on Measurement in Education Annual Meeting; March 2008. New York, NY.
    Melnick D. Professionalism in the 21st century. Columbia University Daniel Morris Symposium; September 2008. New York, NY.
    Melnick DE. Evaluating graduates of foreign medical schools (GoFMS) for licensure in the U.S. International Conference on Medical Regulation; October 2008. Cape Town, South Africa.
    Melnick DE, Pannizzo L, Scoles PV. A healthcare workforce assessment organization for Europe. Association for Medical Education in Europe; August 2008. Prague, Czech Republic.
    Morrison C, Swanson DB, Dillon GF, Kiepert M, Sample L. Factors influencing response times on USMLE Steps 1, 2 Clinical Knowledge and 3. Ozzawa 2008 – Assessment for Life: 13th Ottawa International Conference on Clinical Competence; March 2008. Melbourne, Australia.
    Nungester RJ. An examination of the impact of weighting on components of a high-stakes performance examination. National Council on Measurement in Education Annual Meeting; March 2008. New York, NY.
    Nungester RJ, De Champlain AF, Gessaroli ME, Siddiquim ZA. An introduction to item response theory. Ozzawa 2008 – Assessment for Life: 13th Ottawa International Conference on Clinical Competence; March 2008. Melbourne, Australia.
    Philibert I, Galbraith RM. Program assessment to judge the effect of changes and to make improvements. Association for Medical Education in Europe; August 2008. Prague, Czech Republic.
    Ramineni C, Clauser BE, Harik P, Swanson DB. Contrast effects in standardized patient ratings of the USMLE Step 2 Clinical Skills examination. National Council on Measurement in Education; March 2008. New York, NY.
    Ross LP, Swanson DB, Orr NA. USMLE Step 1 and Step 2 as predictors of performance on the Dermatology Certifying Examination. Association of American Medical Colleges Annual Meeting; November 2008. San Antonio, TX.
    Sawhill AJ, Swanson DB, Morrison C, Holtzman KZ, Orr NA, DeRosa GP, Giordano C. Predicting performance on Part I of the American Board of Orthopaedic Surgery Certifying Examination using United States Medical Licensing Examination Step 1, Step 2 Clinical Knowledge (CK) and Step 3 scores. Association of American Medical Colleges Annual Meeting; November 2008. San Antonio, TX.
    Scoles PV. Changes to the USMLE – what is the Gateway exam? International Association of Medical Science Educators; July 2008. Salt Lake City, UT.
    Scoles PV. Prioritize elements for research or program/policy development-institutional culture. Initiative to Transform Medical Education; December 2008. Chicago, IL.
    Scoles PV. The Gateway exam - pros and cons. Association of Pathology Chairs; July 2008. Colorado Springs, CO.
    Scott CL, Furman GE, Jobe AC. Teaching or assessment? Adapting standardized patient cases for either use. Association for Medical Education in Europe; August 2008. Prague, Czech Republic.
    Snyder LE, Cromley JG, Luciw UA. Motivation for biology reading and for biology courses. American Educational Research Association Annual Meeting; April 2008. New York, NY.
    Southgate L, Hazlett C, Lee R, Butler A, Norcini JJ. Novel assessments and services for medical schools, students and professionals: a perspective from several organizations. Ozzawa 2008 – Assessment for Life: 13th Ottawa International Conference on Clinical Competence; March 2008. Melbourne, Australia.
    Swanson DB. A framework for thinking about the reliability and validity of commonly used assessment methods. Association of American Medical Colleges Annual Meeting; November 2008. San Antonio, TX.
    Swanson DB, Butler A, Holtzman KZ. Collaborative study of a multi-school progress testing program. Association for the Study of Medical Education; September 2008. Leicester, England.
    Swanson DB, Clauser BE. Use of generalizability theory in designing and analyzing performance-based tests. Association for Medical Education in Europe; August 2008. Prague, Czech Republic.
    Swanson DB, Clauser BE. Use of generalizability theory in designing and analyzing performance-based tests. Ozzawa 2008 – Assessment for Life: 13th Ottawa International Conference on Clinical Competence; March 2008. Melbourne, Australia.
    Swanson DB, De Champlain AF, Butler A. An overview of NBME clinical progress testing and basic science cumulative achievement testing pilot efforts. Ozzawa 2008 – Assessment for Life: 13th Ottawa International Conference on Clinical Competence; March 2008. Melbourne, Australia.
    Swanson DB, Holtzman KZ. Developing high-quality multiple-choice questions to assess application of knowledge using patient vignettes and multimedia. International Association of Medical Science Educators; July 2008. Salt Lake City, UT.
    Swanson DB, Holtzman KZ, Albee K, Clauser BE. Measurement characteristics of content-parallel single-best-answer and extended-matching questions in relation to number and source of options. Ozzawa 2008 – Assessment for Life: 13th Ottawa International Conference on Clinical Competence; March 2008. Melbourne, Australia.
    Swygert KA, De Champlain AF, King AM, Sanger JM, Scott CL. Predicting case difficulty with Standard Patient and case characteristics on the USMLE Step 2 Clinical Skills Examination. Ozzawa 2008 – Assessment for Life: 13th Ottawa International Conference on Clinical Competence; March 2008. Melbourne, Australia.
    Swygert KA, McKinley DW, Scott CL, Boulet JR, Swanson DB. How Step 2 Clinical Skills examinees use their time in the patient encounter. Ozzawa 2008 – Assessment for Life: 13th Ottawa International Conference on Clinical Competence; March 2008. Melbourne, Australia.
    Swygert KA, Rebbecchi T, Jobe AC. Statistical procedures for assessing the impact of training physician note raters. Ozzawa 2008 – Assessment for Life: 13th Ottawa International Conference on Clinical Competence; March 2008. Melbourne, Australia.
    Turner RD. Working with designers and contractors: the owner's perspective. Design on the Delaware Regional Conference; November 2008. Philadelphia, PA.
    Wainer H. Using the SAT as an educational indicator: a successful application. Looking Back: a Conference in Honor of Paul W. Holland. Educational Testing Service; September 2008. Princeton, NJ.
    Wainer H. Four conversations about three things. National Council on Measurement in Education Annual Meeting; March 2008. New York, NY.
    Wainer H. Fourteen conversations about three things. National Council on Measurement in Education Annual Meeting; March 2008. New York, NY.
    Wainer H. Schrodinger's cat, Rasch's P and the most dangerous equation. National Council on Measurement in Education; March 2008. New York, NY.

     
