
Stemmler Fund Prior Grant Information

This is the listing of all prior grant information.

2006–2007 Grantees

McMaster University, Hamilton, Ontario

Principal Investigator: Kelly Dore Banks, PhD (ABD)
Grant Amount / Duration: $145,870 / 2 years
Project Title: The evaluation of the reliability, validity, feasibility, and acceptability of a web-based instrument to measure professional qualities in medical school applicants

Rationale: Health professions’ admission committees across the world are faced with the difficult task of selecting, from among many eligible applicants, the select few who will be admitted to their training programs. The determinants of this admissions process are often a combination of cognitive measures, such as Grade Point Average (GPA) or standardized tests such as the Medical College Admission Test (MCAT), and non-cognitive measures, including interviews and essays. However, there has been limited success in the development of evaluation tools that will provide reliable and valid measures of an applicant’s non-cognitive qualities. The exception to this is the Multiple Mini-Interview (MMI), in essence an admissions OSCE. The MMI has been shown to predict intramural and licensing examination performance. However, like any OSCE, the MMI has practical limitations; the sheer volume of candidates for many institutions makes it necessary to develop a reliable and valid strategy for screening candidates’ non-cognitive attributes in a more efficient fashion. To this end, a new measure using video scenarios and written or audio responses was developed, and a pilot study was completed. In 2006, 110 applicants to McMaster’s medical school completed this Computer-based Multiple Sample Evaluation of Non-cognitive Skills (CMSENS). Of those applicants, 78 completed the CMSENS by verbally recording their responses in an audio file while 32 typed their responses. The overall test generalizability was .86 for the audio CMSENS, and .72 for the written. The written CMSENS also demonstrated predictive validity, correlating with the MMI at .51. However, conclusions from this study are limited because of the small sample and the one-time nature of the findings.

Objectives:

  1. Assess the impact of varied test time and proctoring on the reliability and validity
  2. Determine the predictive validity of CMSENS
  3. Determine the reliability of the pilot results with a larger, more diverse sample

Methods: To achieve the first objective, applicants to McMaster’s medical school will be invited to participate in the CMSENS in winter 2008. These applicants will complete a CMSENS in which the length of response time and proctoring will be manipulated. In addition to reliability and validity analysis to determine the optimal testing format, participants invited to interview will have their scores compared to MMI performance. To assess objective 2, several methods will be followed. In-program performance results will be assessed for the approximately 25 members of the medical class of 2009 who participated in the CMSENS pilot project and gained admission. In addition, construct validity will be examined by recruiting students in their final year of medical school and second year residents to participate in a mock-CMSENS, thereby allowing (a) a comparison between scores assigned to the medical school applicants and those assigned to more senior trainees, and (b) for the final year students and residents, a comparison between scores assigned on the CMSENS and those received on the Canadian qualifying examinations, specifically Part II of the MCCQE (an OSCE), which evaluates both cognitive and non-cognitive characteristics of medical trainees. To satisfy the third objective, a sample of applicants applying in winter 2009 will be administered the CMSENS. A larger sample in this year will permit accurate assessment of reliability and facilitate comparison of CMSENS performance to MMI and in-program results.

Significance for Medical Education and Practice: This innovative assessment tool, if it is proven reliable and valid, has the potential to allow educators in the health professions to efficiently assess the non-cognitive qualities of the thousands of applicants to training programs, for whom reliable and valid assessment was previously impossible.

Southern Illinois University School of Medicine, Springfield, Illinois

Principal Investigator: Dr. Richard Rosher
Grant Amount / Duration: $150,000 / 2 years
Project Title: An Objective Measure to Assess Resident Competency in Systems-Based Practice

Project objectives: The ACGME has directed that all residents must meet six competencies. The sixth competency, Systems-Based Practice, has presented a challenge for assessment.

The objective of this project is to develop a standardized, objective, innovative method to measure the sixth competency: Systems-Based Practice.

An OSSIE (Objective Structured System Interaction Examination) will evaluate the resident’s ability to interact with the health care team, deal with aspects of the health care system, coordinate effective care across settings, and provide cost-effective care.

Rationale and primary methods to be employed: In today’s health care system, not only must physicians be competent in their knowledge and practice of medical care, but they must be leaders of teams composed of other health care providers. They must be able to assist their patients to navigate the health care system and ensure that continuity of care is promoted across health care settings. They must be cognizant of costs of various treatments.

Four required skills are identified by the ACGME that must be attained in order to be judged competent in Systems-Based Practice. The resident must:

  1. be able to understand the interaction of physician practices with the larger system, resources, and providers.
  2. have knowledge of practice and delivery systems.
  3. practice cost-effective care.
  4. advocate for patients within the health care system.

Three scenarios involving patients, families, and members of the health care team will be developed to test each of these four required skills for a total of twelve scenarios. The scenarios will be presented in the format of an OSCE. The new examination for residents, the OSSIE, will be given to PGY2 residents in the middle of their second year. It will be a formative examination that will evaluate the residents’ competence and enable tailoring of their third year to improve these competencies. Generalizability analysis will be used to determine inter-case reliability. Correlations between exam scenarios and ratings by observers in practice situations will determine the validity of the OSSIE.
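
As a rough illustration of what a generalizability analysis of inter-case reliability involves, the sketch below estimates a relative G coefficient for a simple crossed design in which every resident is scored once on every scenario. The function name `g_coefficient`, the single-facet (resident x case) design, and the example scores are assumptions made for illustration; the abstract does not describe the investigators' actual model or data.

```python
def g_coefficient(scores):
    """Relative G coefficient for a crossed person x case design.

    scores: list of rows, one row per resident, one score per case.
    Variance components come from a two-way ANOVA without replication.
    """
    n_p = len(scores)       # number of residents
    n_c = len(scores[0])    # number of cases (scenarios)
    grand = sum(sum(row) for row in scores) / (n_p * n_c)

    person_means = [sum(row) / n_c for row in scores]
    case_means = [sum(scores[p][c] for p in range(n_p)) / n_p
                  for c in range(n_c)]

    ss_person = n_c * sum((m - grand) ** 2 for m in person_means)
    ss_case = n_p * sum((m - grand) ** 2 for m in case_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_resid = ss_total - ss_person - ss_case

    ms_person = ss_person / (n_p - 1)
    ms_resid = ss_resid / ((n_p - 1) * (n_c - 1))

    # Person variance is the "true score" signal; the residual is
    # person-by-case interaction confounded with error.
    var_person = max((ms_person - ms_resid) / n_c, 0.0)
    denom = var_person + ms_resid / n_c
    return var_person / denom if denom > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical scores: 4 residents x 3 scenarios (not study data).
    example = [
        [6, 7, 5],
        [8, 9, 8],
        [4, 6, 5],
        [7, 7, 9],
    ]
    print(round(g_coefficient(example), 3))
```

For this one-facet design the relative G coefficient is algebraically the same quantity as Cronbach's alpha computed across cases; a design that also models raters or occasions would need additional variance components.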

How the proposed research will advance assessment in resident education: Currently, there are few measures of Systems-Based Practice to use in assessing residents. The proposed research will advance assessment in resident education by investigating an innovative, objective method of assessing the ACGME competency of Systems-Based Practice. This new method of assessment, the OSSIE, will be a modification of the traditional OSCE. A simulation of interaction with other members of the health care team will capture abilities needed for physicians in today’s complex health care system. By using the OSSIE, faculty will be able to provide constructive feedback to each resident to enable improvement in Systems-Based Practice.

University of Pennsylvania, Philadelphia, PA

Principal Investigator: David Asch, MD
Grant Amount / Duration: $149,820.55 / 1 year
Project Title: Clinical Outcome-Based Assessment of Medical Education: Concept and Evaluation

The overall goal of this research project is to demonstrate the feasibility and examine the usefulness of evaluating the quality of clinical training programs by assessing the clinical outcomes of the patients later cared for by the graduates of those training programs. The concept is premised on the view that although medical education serves a collection of intermediate goals, in the end the most important clinical goal is to improve the health of individuals and populations. We may mean many different things when we say that a medical school or a residency program is good, or that one medical school or residency program is better than another. However, stakeholders, including prospective trainees, health systems, and patients, could be justified in expecting at least one specific meaning: that graduates of good training programs in general take care of patients well, and that graduates of better training programs in general take care of patients better.

This project represents a study of “proof of concept,” using as our test case the analysis of maternal birth treatment and outcomes to inform the assessment of residency training in obstetrics and gynecology. We will use data from all hospital-based deliveries in New York and Florida between 1992 and 2006 to test the relationship between residency program, physician characteristics, and maternal outcomes. Our measures of performance will be:

(a) use of Caesarean section; (b) whether a woman who delivered vaginally had a 4th degree perineal laceration; (c) whether a woman experienced any adverse outcome, as defined by HealthGrades (2004); and (d) a complication measure developed by Epstein, Ketcham and Nicholson (2006) that assigns larger positive values to complications that result in long hospital stays for the mother. Using these data and measures, we will address the following specific questions:

1) How much variation in inter-physician performance is explained by residency program and year of residency graduation?

2) Can residency programs be categorized reliably according to the treatment patterns and patient outcomes of their graduates?

3) Is there systematic variation in residency program effects?

We believe these notions are consistent with the goals of the Stemmler Fund and the National Board of Medical Examiners more generally, because they incorporate new methods of assessment in which the influence of training programs on clinical outcomes is assessed independently of patient and physician characteristics. At the conclusion of this project, we expect to have a deeper and more specific understanding of the promise and limitations of evaluating medical training programs using clinical outcomes; we expect to have a series of manuscripts describing our conceptual view, analytic approach, and results; and we expect to have identified the next steps toward further development and evaluation of this assessment concept.

2005–2006 Grantees

University of California, San Francisco

Principal Investigator: Karen E. Hauer, MD
Grant Amount / Duration: $149,167 / 2 years
Project Title: Cultural Competence Using Shared Decision Making

Objectives
We propose to assess the reliability and validity of a shared decision making checklist as a tool for evaluating medical student cultural competence in standardized patient encounters. We will validate shared decision making checklist ratings by correlating scores with global assessments of cultural competence made by clinician experts in cultural competence. Additionally, we will compare ratings of cultural competence to ratings of general communication skills as measured by the Common Ground instrument in these standardized patient encounters to determine the extent to which cultural competence overlaps with communication skills proficiency.

Background
Failure to develop care plans that incorporate information about patients’ cultural backgrounds and values contributes to important disparities in health care. Medical schools and residency training programs are now required to teach and assess trainees’ cultural competence skills. However, a review of the literature indicates that measures of cultural competency suffer from significant deficits. Most studies rely on measurements of attitudes or skill self-assessments, and the minority of studies that do have skills-based outcomes assess only a subset of relevant competencies. Thus, while medical schools nationally are emphasizing the importance of patient-centered care and cultural competence, they lack the ability to measure the degree to which students are mastering these core concepts.

Methods
We will determine the reliability and validity of a shared decision making checklist as a measure of cultural competence. Using purposeful sampling to identify students of different genders, clinical skills competence, and races, we will select 200 videotaped encounters from 50 third year medical students’ interactions with four standardized patients. Trained coders will score the student-standardized patient encounters using a shared decision making checklist. Two faculty cultural competence experts who will be blinded to the study hypothesis, checklist content, and scores will perform global assessments of cultural competence based on review of the same videotaped encounters. The standardized patients’ ratings of general communication skills using the Common Ground instrument will be used to assess trainees’ communication skills. Reliability of the three ratings instruments will be calculated. We will assess validity by correlating the shared decision making scores to the global assessments of cultural competence. We will assess concurrent validity by correlating the shared decision making results with the communication skills scores. We will explore reliability by conducting several generalizability studies. This analysis will determine the number of raters and cases needed to obtain reliable cultural competence scores.
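
The final step described above, determining how many cases are needed for reliable scores, is what generalizability theory calls a decision (D) study. The sketch below shows the basic arithmetic under stated assumptions: a single facet (cases only, ignoring raters), variance components that have already been estimated, and hypothetical function names `projected_g` and `cases_needed`. The variance values in the example are invented for illustration and are not results from this project.

```python
def projected_g(var_person, var_residual, n_cases):
    """Projected relative G coefficient when scores are averaged over n_cases."""
    return var_person / (var_person + var_residual / n_cases)


def cases_needed(var_person, var_residual, target=0.80, max_cases=100):
    """Smallest number of cases whose projected G reaches the target.

    Returns None if the target is not reached within max_cases.
    """
    for n in range(1, max_cases + 1):
        if projected_g(var_person, var_residual, n) >= target:
            return n
    return None


if __name__ == "__main__":
    # Hypothetical variance components (not study results).
    var_person, var_residual = 1.0, 2.5
    for n in (2, 4, 8):
        print(n, round(projected_g(var_person, var_residual, n), 3))
    print("cases needed for 0.75:", cases_needed(var_person, var_residual, 0.75))
```

Averaging over more cases shrinks the error term in proportion to the number of cases, which is why the projected coefficient rises with diminishing returns. A two-facet version covering both raters and cases would divide each error component by the corresponding number of raters, cases, or their product.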

Implications for assessment
Our results will inform the assessment literature by evaluating the use of a shared decision making checklist for assessing cultural competence and determining the degree to which cultural competence correlates with communication skills. These results will facilitate evaluation of the efficacy of cultural competence curricula.

University of Missouri-Columbia School of Medicine

Principal Investigator: Kimberly G. Hoffman, PhD
Grant Amount / Duration: $150,000 / 2 years
Project Title: Use of Portfolios to Assess Medical Student Outcomes

The public in general and professional organizations in particular are increasingly demanding evidence of competence in medical practice and a physician’s ability to meet the demands of today’s society (IOM, 2001; 2003). Medical education has responded with a focus on educational outcomes (Whitcomb, 2004), case-based, authentic curricula (Friedman, 2001; Kincade, 2005) and experiences that support the development of physicians within a complex health care system (ACGME, 2005; AAMC Report V, 2001). The emerging definition of professional competence is difficult to evaluate using traditional assessment. The portfolio addresses the current limitations of assessment by integrating professional outcomes and placing them within an authentic learning context. Challenges in portfolio assessment include insufficient inter-rater reliabilities, questions of generalizability, a substantial faculty and learner time commitment, and balancing a prescriptive, standardized approach with individualization (Friedman et al., 2001; Case, 1994; Des Marchais et al., 1995; Challis, 1999; LeMahieu et al., 1993; Herman et al., 1995).

The University of Missouri has developed a set of key competencies for our graduates (MU2020 key characteristics) that are consistent with national and international discussions of professional competence. To our knowledge few medical schools have successfully engaged faculty in developing an approach for assessment of professional competencies. This proposed research draws on the prior work at MU to address two global questions: 1) How does the development of a set of descriptive anchors for each of the key characteristics influence the validity, reliability, reproducibility and trustworthiness of portfolio assessment? 2) How do student contributions to the portfolio influence faculty assessment of portfolios?

Descriptive anchors of exemplary performance for each of the professional outcomes will be derived from the literature, clinical faculty, medical students and patients. These anchors will be used to develop a portfolio assessment tool. A two-step judgmental review process will establish the content validity of the descriptive anchors. Inter- and intra-rater reproducibility will be established by using the assessment tool to evaluate the portfolios of third year medical students. Predictive validity will be determined by correlation of portfolio assessment with traditional measures of student success. The influence of student contributions to the portfolio assessment will be evaluated by determining the differences between individual faculty ratings of students’ portfolios rated with only required documentation and rated a second time with student contributions included. An external advisory board will provide guidance to the research team and will review the appropriateness of the intermediate research projects.

The outcome of this project will be a portfolio assessment tool to evaluate student outcomes. It will be a useful addition in the assessment of learners and promote an enhanced understanding of professional competence.

Columbia University - College of Physicians and Surgeons

Principal Investigator: Peter C. Wyer, MD
Grant Amount / Duration: $149,975 / 2 years
Project Title: Designing Cognitive Measures of Practice-Based Learning and Improvement as an Iterative Process Combining Rasch and Classical Measurement Methods

Currently, no psychometrically rigorous and developmentally informative instrument exists for assessing resident competencies in the cognitive domains encompassed by Practice-Based Learning and Improvement (PBLI), as defined by the ACGME. Using an iterative process model, we propose to develop and empirically validate four cognitive measures tapped by a comprehensive PBLI instrument that permits periodic formative and summative assessments of residents’ competency levels as they progress through their programs in different specialties. Based upon preliminary experience with a relevant pilot project, we will develop an item pool addressing the following PBLI sub-competency domains: 1) analyzing practice experience, 2) using information technology to manage information and locate evidence from scientific studies related to patients’ health problems, 3) applying knowledge of study designs and statistical methods to the appraisal of clinical studies and other information on diagnostic and therapeutic effectiveness, and 4) assimilating evidence from scientific studies related to patients’ health problems. We propose that these domains, which conform to the standard cognitive domains of evidence based medicine (EBM), and are frequently summarized as ‘ask, acquire, appraise, and apply’, are best-suited for measuring the cognitive aspects of PBLI.

Initially, we will generate a pool of 100-150 Written-Structured Response items tied to sub-competency domain specifications, using different item formats (such as multiple choice and true-false items). The item pool will represent the relevant, observable facets of each of the sub-competency domains. Parallel forms of the PBLI instrument will be generated next, aligned with common assessment specifications that stipulate a weighting distribution for items tied to different competency domains and cognitive levels (Phases 1-2, Year 1). We will implement a rigorous content- and empirical-validation plan by testing each parallel form of the PBLI on samples of resident volunteers in medicine, pediatrics, and emergency medicine at New York Presbyterian Hospital, as well as upon residents in accredited programs from these specialties outside of our institution. We will use Rasch modeling techniques combined with methods from classical measurement theory to examine validity and reliability of the PBLI measures through these empirical trials. We will supplement the PBLI data with a structured survey to identify programs conforming to ‘best practices’ criteria in the target domains of EBM. Convergent validity evidence can thus be gathered and evaluated, along with evidence of resident group differences on PBLI measures in programs that are more or less compliant with EBM practices and teaching.

We believe that the resulting PBLI instrument(s) will provide a unique and critically important vehicle that, combined with existing performance-based assessment modalities, will make possible a comprehensive approach to evaluation of residents’ competencies across a broad range of specialties. We believe that the PBLI instruments thus produced will fill a gap in the area of outcome assessments in residency programs, the absence of which currently limits the quality.

2004–2005 Grantees

Regular Program

Eastern Virginia Medical School

Principal Investigator: Thomas Hubbard, MD, JD, MPH
Grant Amount / Duration: $70,000 / 2 years
Project Title: The Augmented Standardized Patient: Using Augmented Reality for Assessment

Standardized patients (SPs) are widely used to teach and assess clinical skills. Normal SPs, however, are limited in their ability to display abnormal physical findings. Non-SP simulations could be used (e.g., listening to pre-recorded abnormal heart sounds on a computer), but that method excludes interaction with a live person, and thus is less realistic and probably a less accurate representation of students' skills in real settings. Augmented reality (AR) can expand what an SP can do. AR is a methodology that overlays artificial or virtual components (visual, aural, etc.) over the natural environment to provide the user with helpful information. The proposed augmented SPs (ASPs) will combine the assessment technologies of SPs and computer-driven simulations, allowing each to offset limitations of the other. This project augments the SP by permitting the learner to hear abnormal heart and lung sounds from an SP whose own sounds are actually normal.

We have developed a functioning prototype of the technology for this augmentation. The prototype allows the listener to hear pre-recorded heart and lung sounds when auscultating any of 26 locations on a mannequin. Prior to the award date, we will have moved the system from the mannequin to a variety of SPs of different body morphologies, with a learner hearing the selected sounds rather than those of the SP through a modified stethoscope.

The primary objective of the proposed project is to continue to make this system more realistic by minimizing the cues that AR is being used. We will improve the stethoscope's appearance and performance, make the sounds audible over a wider variety of locations on the ASP, and create a database of abnormal heart and lung sounds. These changes will move this new assessment technology from a laboratory prototype to a functional system for routine assessment of students' auscultation skills in any SP-based examination.

We will test the improved system with students in a required annual M4 OSCE. Product development needs will drive formative evaluation studies of students using the system in OSCE-like assessments throughout the project period. Through surveys and interviews we will gather students' views on certain aspects of the ASP. We will also examine the validity of using the ASP through analyses of students' performance in several situations (e.g., ASP with normal findings versus traditional SP with normal findings; ASP with normal findings versus "placebo" ASP which provides the SP's own sounds through a system similar in appearance to that of the ASP; and diagnosis of pathologies indicated by abnormal ASP findings).

This innovative approach to assessment using augmented standardized patients to assess heart/lung auscultation skills will expand the range of physical abnormalities that can be tested in SP-based assessments.

University of British Columbia

Principal Investigator: Rose Hatala, MD, MSc
Grant Amount / Duration: $34,880 / 1 year
Project Title: Integrating Simulation Technology into a National Specialty Examination in Internal Medicine

Rationale
As part of the assessment of clinical performance during the Canadian national specialty examination in internal medicine, candidates' physical examination skills are tested in a series of bedside stations. At each station, a candidate performs a focused physical examination on a standardized patient. Since 2003, we have integrated simulation technology into the physical examination stations in order to test candidates' ability to recognize common internal medicine physical abnormalities.

Objectives

  1. To establish the relationship between competence in physical examination as assessed using simulation technology compared to real patients.
  2. To assess whether physical exam technique is a separate competency from the recognition of abnormalities on physical examination.

Methods
Internists' physical examination skills and diagnostic accuracy on real patients and simulations will be assessed during a 10-station OSCE. The OSCE will consist of 5 stations using patients with real cardiac abnormalities and 5 stations using standardized patients lacking physical abnormalities combined with audio-video simulation of cardiac auscultatory abnormalities.

Contribution to Assessment
Our integration of simulation technology into a high-stakes assessment of clinical performance is a novel contribution to the field of assessment. In addition, we will examine the transfer of physical examination skills between simulations and real clinical performance, a relationship that has not been previously established. Our approach to integrating simulation technology into an examinee's patient assessment may be generalized to other testing formats and settings.

University of Illinois at Chicago College of Medicine

Principal Investigator: Rachel Yudkowsky, MD, MHPE
Grant Amount / Duration: $69,290 / 1.5 years
Project Title: Validation of a Hypothesis-Driven Physical Exam Assessment Procedure

In contrast to current checklist-based SP assessment procedures that focus primarily on assessing physical exam maneuvers or history taking, the proposed hypothesis-driven assessment procedure brings together all key elements of physical diagnosis, namely generating a limited set of diagnostic hypotheses, anticipating discriminating findings, performing maneuvers and appreciating the findings, and interpreting the findings by proposing a working diagnosis. The assessment task requires students to think in action, while gathering the data. The findings from the scientific literature that were used to build this assessment procedure, namely co-selection, prototypes, discriminating features, and transfer, provide a strong conceptual framework for the proposed procedure. Implementing this approach as an assessment procedure also automatically guides learning (knowing that students learn what they are assessed on). It promotes contextualized, integrated, and meaningful learning, and provides, as advocated by medical educators, a more parsimonious, selective approach to physical diagnosis, focusing on key, discriminating findings as well as an array of structural patterns (diagnostic sets) that can facilitate transfer when students go from pre-clinical to clinical settings and from patient to patient. The procedure is based on 18 complaints, 145 physical exam maneuvers, and 59 diagnostic alternatives, a sound foundation upon which students can build their physical diagnosis. The student and class profiles generated from this procedure provide a well-organized and detailed framework for providing feedback to students and educators, where various sources of strengths and weaknesses in physical diagnosis can be parceled out, such as distinguishing anticipation errors from execution or interpretation errors (an important asset in an era of reducing medical errors).
An example of a student profile following a case would include: "Good anticipation of clinical findings, some faulty physical exam maneuvers, and incorrect diagnosis." Finally, the assessment procedure and the various scores derived from the observations, such as anticipation scores, diagnostic interpretation scores, and overall physical exam scores (8 profiles), offer the possibility of better distinguishing among levels of expertise. The purpose of this proposed project is to begin to validate this hypothesis-driven assessment procedure for physical diagnosis of medical students and residents. Both a three-step and a four-step procedure will be studied, where the four-step procedure includes generating hypotheses while the three-step procedure does not. Six pilot testing and validation studies are proposed, each testing various aspects of construct validity and reliability:

C-I Pilot testing the materials and 3-step procedure with M3 students
C-II Content validation with a blue ribbon panel of clinical educators
C-III Estimating reliability, feasibility, and consequential validity of the 3-step procedure with M3 students
C-IV Estimating reliability and learning effects with early M4 students
C-V Pilot testing 4-step procedure and estimating reliability with PGY-1 & -2 residents
C-VI Estimating expert-novice differences.

Reliability will be assessed with G and D generalizability studies; feasibility using time on task and reliability data; consequential validity using a questionnaire; and instructional feedback from the assessment profiles generated using observational data. (A group of Japanese educators are testing the three-step procedure with pre-clinical students.) The main strengths of the proposed hypothesis-driven assessment procedure are its sound theoretical foundation, its relative procedural simplicity (including 3 and 4 steps), and its potential for informative and structured feedback to students and educators, and in distinguishing levels of expertise.

Invitational Program

Jefferson Medical College of Thomas Jefferson University

Principal Investigator: Mohammadreza Hojat, PhD
Grant Amount / Duration: $99,957 / 2 years
Project Title: General and Specific Subscales of the Jefferson Scale of Physician Lifelong Learning: Predictors and Outcomes

Lifelong learning is an essential element of professionalism. In response to the demand for an operational measure of lifelong learning, we developed the Jefferson Scale of Physician Lifelong Learning (JSPLL, 19 Likert-type items). By surveying 444 physicians from the Greater Philadelphia region we provided evidence in support of the psychometric properties of the JSPLL in a previous research study supported by the NBME Stemmler Fund. Nonetheless, the following three questions remain to be addressed by using a nationwide sample of physicians:

I. Is it feasible to generate a general (G) and a specific (S) component (subscale) of the JSPLL, each applicable to a different group of physicians? The two G and S components of the JSPLL will be identified based on the results of factor analysis and content analysis, so that the G component will be more applicable to physicians in patient care who are not involved in teaching and research activities (Group 1), whereas the S component will be applicable to academic physicians who are involved in teaching or research in addition to clinical responsibilities (Group 2). The feasibility of generating the G and S components will be addressed by comparing their psychometric properties and differential validity for physicians in Group 1 and Group 2.

II. What are the predictors of physician lifelong learning? We will examine the contribution of the following measures in predicting the JSPLL (components and total) scores: Academic performance prior to medical school (MCAT, undergraduate GPAs), during medical school (performance in the basic and clinical sciences, rating of clinical competence in core clerkships), scores of the medical licensing examinations (Steps 1, 2, and 3, of the USMLE, formerly Parts I, II and III of the NBME), and ratings of postgraduate clinical competence in three areas of "data gathering," "interpersonal skills," and "socioeconomic aspects of patient care."

III. What are the professional outcomes of physician lifelong learning? We will examine the associations between the JSPLL (components and total) scores and professional outcomes such as board certification, employment status, satisfaction with career, work setting, patient load, teaching, research, publications, and other practice variables. A survey will be mailed to a nationwide sample of 5,412 physicians who graduated from Jefferson Medical College between 1975 and 2000. Multivariate statistical analyses (MANOVA and regression) will be employed. The study will lead to a better understanding of the predictors and professional outcomes of lifelong learning, and a refined assessment instrument useful for the evaluation of lifelong learning among different groups of physicians.

2003–2004 Grantees

Call for Proposals

Duke University Medical Center

Principal Investigator: Melanie C. Wright, PhD
Grant Amount / Duration: $69,718.00 / 2 years
Project Title: Assessment and Prediction of Teamwork

Problems with communication and team coordination are frequently linked to adverse events in medicine. Researchers in the health care industry are increasingly aware of the importance of teamwork skills and advocate a wide variety of training programs related to team coordination. These efforts are prevalent in dynamic environments such as the emergency department and operating room and tend to be focused toward specialty and continuing education. While the assessment of medical students has covered areas such as interpersonal and communication skills, these assessment measures generally focus on the student's interaction with the patient and do not assess team skills in relation to working with other health care providers. Efforts to understand and assess team performance in other work environments have resulted in the identification of specific skills that are important to good teamwork and methods for assessing these skills.

We propose to evaluate assessment tools used in other team performance contexts for the measurement of medical student teamwork skills within a small group cooperative learning environment and in a simulated patient care environment. Specifically, we hope to answer the following questions: (1) Will assessment tools used in other team performance contexts adequately assess individual medical student team skills? (2) Can these skills be assessed in naturally occurring team learning environments? (3) Do the results of the teamwork skills assessments reflect actual team performance or outcome in scenarios using a human patient simulator?

Assessment measures to be evaluated include self rating of team skills, peer rating of team skills, observer ratings of team skills, and analysis of communication content. We will first refine these assessment methods for application in the medical education environment. We will design and conduct a seminar covering team coordination principles for first year medical students. Approximately 30 medical students will be video and audio taped over four small group problem based learning sessions. Raters will use a tool designed to count specific types of communications and behaviors to code and then rate each student's performance. The same students will also be assessed in two patient care scenarios using a human patient simulator. The scenarios will be designed to require the coordination of three students with defined roles. We will compare results of measures in the small group and simulated team exercise to determine degree of relationship. In addition, we will investigate the relationship between individual team skill assessment measures and objective measures of team performance in the simulated scenario to determine whether the skill assessment measures are predictive of actual care performance.

We suggest that early training in team coordination would allow for more time and experience in practicing these skills and may help influence more positive habits and attitudes toward working in a team environment. If such training is incorporated in medical schools, practical assessment methods will be required to determine the efficacy of the training. This project provides an initial assessment of several measures to determine both convergent and predictive validity for the assessment of team skills in medical students.

McMaster University

Principal Investigator: Kevin W. Eva, PhD
Grant Amount / Duration: $67,720.00 / 2 years
Project Title: Development and testing of an innovative admissions OSCE (The Multiple Mini-Interview) for assessing non-cognitive key competencies in medical school candidates

Rationale
While the medical profession continues to value non-cognitive variables such as interpersonal skills and professionalism, it is not clear that current evaluation tools, particularly those used during admissions protocols, are capable of reliably assessing ability in these domains. Hypothesizing that many of the problems with tools like the personal interview might be explained, at least in part, by context specificity afflicting the accuracy of assessments of non-cognitive abilities, we have developed a multiple sample approach to the measurement of these competencies and propose further study of this innovation.

The Multiple Mini-Interview (MMI) consists of short OSCE-style stations in which examinees are presented with scenarios that require them to discuss a health related issue (e.g., the use of placebos) with an interviewer, interact with a standardized confederate while an examiner observes the interpersonal skills displayed, or engage in a problem-solving exercise with another examinee. The tool has proven reliable on three separate administrations in which both graduate students and applicants to the undergraduate medical program at McMaster University participated.

Objectives

  1. To determine the predictive validity of the MMI
  2. To examine the impact of rater training and background on ratings assigned during the MMI
  3. To assess the potential outcome of a security breach after implementation of the MMI

Methods
The class of 2005 at McMaster University includes 48 students who participated in the first pilot study of the MMI prior to entry into medical school. To satisfy the first objective we propose to collect data pertaining to their performance within the medical program (prior to graduation these students will sit 8 multiple choice question examinations of medical knowledge, 4 clinical reasoning exercises, 3 OSCEs, and a series of tutorial/clinical skills evaluations) and licensure (a month subsequent to graduation these students will write Part I of the Medical Council of Canada's Licensing examination [LMCC Part I]). Regression analyses will be performed. In addition, we propose to mount a mock MMI for current medical residents to allow for a comparison between (a) scores assigned to medical school applicants and those assigned to more senior individuals nearing completion of their training, and (b) scores assigned within the MMI and those received on Part II of the LMCC. Part I is intended to assess primarily medical knowledge, while Part II consists of an OSCE that allows for a greater opportunity to assess non-cognitive characteristics of medical trainees. Finally, we propose to utilize a pair of experimental designs to determine the impact of both interviewer training and test security breaches on the psychometric properties of the MMI and the resulting assessments.

Significance for Medical Education and Practice
This innovative assessment tool, if it proves valid, is expected to improve the ability of medical programs and licensing bodies to assess the non-cognitive characteristics of both new applicants and medical professionals. In addition, we anticipate using this line of research to highlight the importance of multiple sampling approaches to assessment for overcoming the limitations context specificity places on evaluation exercises in general.

Columbia University - College of Physicians and Surgeons

Principal Investigator: Mark J. Graham, PhD
Grant Amount / Duration: $70,000.00 / 2 years
Project Title: Systems-based Practice: Development of a Measure to Assess Competency

This study proposes to use rigorous methodology to achieve the following: 1) to develop a well elaborated taxonomy of the specific knowledge, skills, attitudes, practices, behaviors and measurable outcomes associated with the ACGME competency of Systems-based Practice (Year 1); and 2) to develop and pilot test a global rating scale for assessing Systems-based Practice based on the taxonomy. This will address one of the major challenges in assessment facing residency training programs throughout the country. During the first year we will use a well researched methodology, the Nominal Group Process (Hall, 1983; Delbecq et al., 1975; Van de Ven et al., 1972), to arrive at a consensus opinion that characterizes all of the key elements and outcomes associated with Systems-based Practice. Specifically, we will obtain three key perspectives by running separate nominal groups with physicians, members of the healthcare team (e.g., nurses, translators, social workers, etc.), and key administrators. Results of nominal groups will be validated by running 2-4 separate groups until adequate consensus is reached. This will result in a comprehensive taxonomy of the sub-competencies comprising Systems-based Practice that can guide development of curriculum and valid assessments.

From these aggregated responses (blueprint) a global rating scale will be developed to measure Systems-based Practice competency across different medical domains. Scale items will be based on the taxonomy and developed by a team of experts. The instrument will be piloted in two major residency programs in the New York Presbyterian Hospital system. Reliability and inter-rater agreement will be assessed. The outcome of this study will be a multidimensional global rating scale for Systems-based Practice that can reliably measure the key aspects of this competency. The scale should be a cost-effective and feasible template, with items designed to span medical domains, in order to evaluate residents' core capabilities.

University of Michigan

Principal Investigator: Larry Gruppen, PhD
Grant Amount / Duration: $69,765.00 / 2 years
Project Title: Assessing clinical teaching with standardized students: a feasibility and validity study

Because excellent clinical teaching is important to the development of knowledge and skills in learners, the evaluation of clinical teaching is a central assessment activity at most medical schools. It has also become a component of many promotion and tenure decisions. Although teaching evaluations are necessary for these activities, there is considerable dissatisfaction with the utility of traditional student ratings of teaching. While innovative evaluation methods have been developed, few have gained wide use and many fail to incorporate the learner's perspective on teaching performance.

In an effort to augment the evaluation tools available for assessing clinical teaching, we propose to extend the Standardized Student methodology from its typical educational application to an evaluation application. Derived from the widely utilized Standardized Patient methodology, Standardized Students (SSs) are medical students trained to portray teaching problems for faculty. In educational applications, these problems are used to stimulate faculty development in teaching skills. We will train 30 SSs not only to portray teaching problems, but also to critically evaluate teaching performance in order to transform them into a pool of trained evaluators of clinical teaching.

The psychometric characteristics of Standardized Students as evaluators of clinical teaching will be examined in three studies. The first will assess the inter-rater reliability of SSs as they review and evaluate the teaching performance of videotaped faculty-student interactions. The results of this study will enable us to refine the technology and clarify the sources of variance in SS evaluations.

The second study examines the validity of SS evaluations of clinical teaching in the context of a proven faculty development intervention. Using a pre-post intervention design, we will use a set of six SSs for each faculty member to measure teaching performance before and after the intervention. Validity of the SS technology will be demonstrated by its ability to measure changes in teaching performance resulting from this intervention.

The third study also examines validity, but this time in the context of routine clinical teaching. The 30 SSs and their classmates will take their third-year, required clinical rotations and provide teaching evaluations on a specified set of target faculty who have the greatest responsibilities for teaching in these clerkships. The SSs will evaluate teaching according to the dimensions and criteria for which they have been trained, while their classmates will use the traditional student ratings of global teaching skills. These two sources of data will be compared to identify common dimensions of assessment and novel dimensions or characteristics measured by the SSs.

Through these studies, we will obtain both valuable experience in the logistics of using SSs as evaluators and critical information on the psychometric properties of these evaluations. Standardized Students may hold promise for more specific and useful information on the quality of clinical teaching in medical schools.

Invitational Grants

University of Missouri-Kansas City School of Medicine

Principal Investigator: Louise Arnold, PhD
Grant Amount / Duration: $70,264.00 / 1 year
Project Title: Towards Assessing Professional Behaviors of Medical Students Through Peer Observation: A Multi-institutional Study

Rationale
The assessment of professional behavior in medical students is one of the most challenging tasks facing medical educators today. Among the array of assessment methods being investigated, peer evaluation appears to be one of the most promising. Unfortunately, the social climate surrounding peer evaluation may affect its acceptability in the eyes of students and thereby depress the reliability and validity of their assessments. In addition, the typical approach to conceptualizing professional and unprofessional behavior as an expression of stable characteristics of learners -- such as honest or dishonest -- may heighten the reluctance of medical students to report unprofessional behavior. Thus, for medical students to find peer assessment acceptable, the concept and indicators of professionalism on which the assessment rests must be grounded in the peers' ideas about professionalism, the value conflicts they experience, and the situations in which they live as medical students.

Project Objective(s)
The primary objective of this project is to determine the context in which peer assessment can occur across medical schools and year levels. Specifically, do students see similar kinds of professional actions in their peers, do they take similar actions in response to these observations, and would they agree to participate in peer assessments of the same kind across schools and year levels? If not, what are the characteristics of schools or systems in which peer assessment is possible?

Methods
Using the form adapted from our initial grant, we will survey all students at eight medical schools across a range of geographic and institutional characteristics. In addition to using items from our initial survey, we will add items from a survey of institutional professionalism climate to identify characteristics of schools that might promote or prevent effective peer assessment. Surveys will be administered to each class at each school by electronic or paper means. Data will be collated and analyzed at a central location. The data will be analyzed first by using descriptive statistics and then inferential statistics to detect potential differences between institutions and among students from each year level. Factor analysis will be used to determine how various aspects of peer assessment systems across schools might be related.

Contribution to Assessment
From our preliminary work, we understand that students are willing to engage in peer assessments of professionalism provided the appropriate institutional support, anonymity, faculty oversight, timely evaluation, appropriate counseling or commendation of peers, and protection for the student evaluator are present. An important next step will be to explore the extent to which the results based on students' responses in two schools generalize to students in other institutions and thereby to deepen our understanding of the interactions between school climate and the use of peer assessments.

Wake Forest University School of Medicine

Principal Investigator: George Nowacek, PhD
Grant Amount / Duration: $99,550.00 / 2 years
Project Title: Expanding a model and assessment of professionalism in medical students.

Professionalism in medicine continues to be threatened by changes in the organizational and financial structure of medical practice brought about by managed care, outcomes-based medicine, and pressure for unionization. In the past several years, there have been three new definitions of medical professionalism that reflect this continuing concern: The ABIM Foundation, ACP-ASIM Foundation and European Federation of Internal Medicine: Physician Charter; the NBME Center for Innovation: Behaviors of Professionalism; and the ACGME General Competency: Professionalism. While these definitions have not changed the boundaries of medical professionalism by incorporating new attributes or expectations, they represent considerable professional time devoted to their preparation and, thereby, their level of concern.

In medical education, there also continues to be great interest in professionalism in medical students, particularly in the assessment issues. However, not much progress has been made in teaching and evaluating professionalism in medical schools as documented by a repeat survey in 2002 of medical schools when compared to results from a 1997 survey. A comprehensive review of the state of professionalism assessment also underscored the need for continued efforts to expand and validate existing measures.

The Wake Forest University School of Medicine received a 2-year NBME Stemmler Medical Education Research Fund award in June 2001. The goals for this award were to develop a model and the measures to assess professionalism in medical students. The model provides a structure for behavioral assessment of professionalism. The definition of professionalism by Swick, detailing a taxonomy of behaviors of professionalism appropriate for medical students, was used as the basis for developing the behavioral measures for the model. The construct of professionalism development has been the foundation of the assessment model and was refined during the initial project. Professionalism development is conceptualized as a single dimension, scaled from very low to very high, on which students can be placed at any point. The metric is calculated by combining performance values across the behavioral assessments and over the multiple attributes of professionalism.

The progress of our initial project included modifications of the model that revised four latent constructs of knowledge of professionalism, attitudes toward professionalism, observations of professionalism in the preclinical curriculum, and professionalism behaviors in the clinical setting. The report also detailed the development and pilot-testing of 13 measures for the latent constructs using the methodologies of knowledge testing, attitude assessment, faculty observations, peer assessment and standardized patient ratings. Two collaborating medical schools participated in the development and pilot testing of the measures.

The goals for the present study are, first, to complete the validation of the model with structural equation modeling. The analysis of longitudinal data from two student cohorts will be completed and the transportability of the measures will be established by participation of two collaborating medical schools. The second goal is to establish the validity of those few measures that provide reliable assessments of students' professionalism development that are grounded in student behaviors. A supplemental study will investigate the possibility of assessing an understanding and commitment to medical professionalism in the admissions interview.

2002–2003 Grantees

Call for Proposals

Jefferson Medical College of Thomas Jefferson University

Principal Investigator: Mohammadreza Hojat, PhD
Grant Amount / Duration: $34,865.00 / 1 year
Project Title: An Operational Tool for Assessing Physician's Lifelong Learning

Abstract
Lifelong learning is required in medicine to stay abreast of scientific advances and rapid developments in the medical sciences and biomedical technology. Despite the importance of physicians' lifelong learning, no psychometrically sound instrument has been developed to assess it. The purpose of this project is to develop an operational tool for assessing physicians' lifelong learning habits, activities and professional outcomes. In particular, we plan to address several psychometric aspects of a lifelong learning scale such as face and content validities, construct validity (underlying components of the lifelong learning scale), criterion-related validity (convergent and discriminant validities), the internal consistency aspect of reliability (Cronbach's coefficient alpha), stability of the scores over time (test-retest reliability), and relationship with outcomes associated with scores of the lifelong learning scale.
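
As an illustration of the internal consistency measure named above, the following is a minimal sketch of Cronbach's coefficient alpha computed with the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name and the response data are hypothetical and are not taken from the study; the abstract does not describe the software or procedure the investigators used.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's coefficient alpha for a respondents-by-items score table.

    `scores` is a list of rows, one per respondent, each row holding that
    respondent's score on every item (all rows must have the same length).
    """
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    if any(len(row) != k for row in scores):
        raise ValueError("every respondent must answer every item")

    # Sample variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*scores)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])

    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical Likert-type responses: 5 respondents, 3 items.
example_responses = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 5],
    [2, 3, 2],
    [4, 4, 3],
]

if __name__ == "__main__":
    print(round(cronbach_alpha(example_responses), 3))
```

Values closer to 1 indicate that the items vary together; the sketch uses sample variances throughout, so the ratio inside the formula is unaffected by the choice of denominator.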

Based on an extensive review of relevant literature and on the results of three pilot studies, we have developed a 19-item scale (Jefferson Scale of Physician Lifelong Learning) intended to measure physicians' lifelong learning. In the present project, we plan to expand our previous studies by further investigation of the psychometric properties and measurement characteristics of this scale administered to a large number of physicians (n > 400). Once the psychometrics of this scale are established, a percentile score distribution table will also be provided for comparative purposes. This research tool can be used to assess physicians' lifelong learning habits, activities and outcomes, and to assess group differences among physicians (e.g., by demographic characteristics, practice specialties, type of degrees [MD compared with MD-PhD, etc.]) on underlying factor scores or the total scores of the lifelong learning scale. The scale will allow us to assess outcomes of different educational programs on physicians' lifelong learning (e.g., problem-based learning versus conventional medical school curriculum). The scale can also serve to measure an important aspect of "professionalism" in medicine defined as lifelong learning.

At the completion of this project, we will have developed a multidimensional scale of lifelong learning supported by extensive psychometric evidence for the validity and reliability of the scale. This is consistent with the stated goal of the Stemmler Medical Education Research Fund of the National Board of Medical Examiners, by developing an assessment tool that will serve researchers in the evaluation of those preparing to or continuing to practice medicine, as well as in the assessment of medical school curriculum and residency training programs designed to improve lifelong learning skills and habits among medical students and residents.

University of Toronto

Principal Investigator: Shiphra Ginsburg, MD, MEd, FRCPC
Grant Amount / Duration: $69,805.74 / 2 years
Project Title: Translating Theory into Practice: Towards an Authentic Assessment of Professional Behavior and Reasoning

Abstract

Background
Teaching and evaluating professionalism has become a major focus in health professional education. Previous attempts at evaluation have failed to consider professionalism as a set of behaviors in context, and there is often insufficient exploration of the reasons why students enact certain behaviors over others. We have conducted and published a series of qualitative studies to address these issues, and to build a theory to explain students' perceptions of, and reasoning strategies in response to professional dilemmas. Our most recent work used standardized, video-taped scenarios of professional dilemmas in order to assess students' reasoning strategies in "real time", and found that students were frequently motivated to act by reference to Principles or Implications, some of which (e.g., implications for self) are disavowed in the formal curriculum. The study proposed here will answer several key questions relevant to translating this theoretical framework into an innovative method of assessment.

Objectives
(1) To create authentic, text-based scenarios that describe professional dilemmas from the student point of view; (2) To determine if students' reasoning strategies can be accurately revealed in an examination (vs. a research) context, and to assess for any effects of scenario format (video vs. text); and (3) To determine what attending staff physicians (ASPs) perceive as professional/unprofessional (or pass/fail) responses from students in a written exam setting, specifically focusing on what factors they weigh in assigning their grades.

Methods
This study uses a combination of qualitative and quantitative methods to address each of the above objectives. Step one will create the text-based scenarios from the videotapes. In step two, 60 medical students will be recruited and randomized into two groups: one group will view the videos, and one will receive text. Each student will answer, in writing, a series of questions related to the scenarios (e.g., "Describe in detail what the student should do next," "Why should the student do that?", etc.). Responses will be analyzed qualitatively and quantitatively and will be compared with responses obtained in a previous study (a "non-exam" setting). This will determine whether a written exam (as compared to a research) setting can allow insight into key, authentic aspects of student reasoning, and whether text-based scenarios provoke different responses than videos. In step three, 20 ASPs will be asked to grade students' responses from step two, and will each be interviewed regarding the factors they weighed in assigning their scores. Results will be analyzed qualitatively and quantitatively to determine the relative importance placed on the actions proposed by students, the reasoning strategies they described, or other factors.

Implications for Assessment
At the completion of this study, we will have determined whether an innovative written exam setting using standardized scenarios can allow insight into key, authentic aspects of students' reasoning. If responses appear crafted rather than authentic, we may be missing the most dominant influences on students' reasoning, and therefore will be unable to provide appropriate feedback. It is also anticipated that we will have enough information from the ASPs' grading decisions to develop a scoring template for future use. These preliminary data will also serve as a basis for designing future studies to address issues of reliability, validity, and feasibility, and the educational value of such an exam.

The University of Iowa

Principal Investigator: Geb Thomas, PhD
Grant Amount / Duration: $69,540.00 / 2 years
Project Title: Project Evaluating the Breast Examination Simulator as a Tool for Clinical Breast Examination Skill Assessment

Abstract
Clinical Breast Exams (CBEs) are an important tool for breast cancer screening. However, most health care professionals lack confidence in their clinical breast exam skills and many report that their training in this technique is inadequate. This project will refine an existing prototype dynamic silicon breast examination simulator, and test its effectiveness in assessing clinical breast examination skill with a group of clinical breast examination specialists in the Ontario Breast Examination Program.

The project will evaluate two hypotheses. The first is that performance with the Dynamic Breast Examination Simulator correlates with performance on clinical breast exams. The second hypothesis is that retesting with the Dynamic Breast Examination Simulator accurately measures performance improvement over time. Five project objectives will be achieved to test these hypotheses. 1) Refine the existing dynamic breast model. 2) Test half of the expert and novice clinical breast examiners with the dynamic breast model. 3) Correlate clinical data regarding the skill level of the breast examiners with their performance on the dynamic breast model. 4) Validate the predictive results of the assessment by testing the second half of the expert and novice clinical breast examiner group. 5) Retest the novice examiners from the first year's protocol and confirm that the retest accurately measures performance improvement over time.

The project will advance assessment in medical education by providing a useful tool for both training and assessing the dexterity and pattern recognition skills required for effective clinical breast exams. The device will also introduce a novel electromechanical design that may be adapted to other physical examination palpation skills.

University of Massachusetts Medical School

Principal Investigator: Michele P. Pugnaire, MD
Grant Amount / Duration: $70,000.00 / 2 years
Project Title: Using Standardized Patients to Assess Professionalism: A Comparative Analysis of Two Approaches


Invitational Grants

University of California San Francisco

Principal Investigator: Maxine A. Papadakis, MD
Grant Amount/ Duration: $100,000.00 / 2 years
Project Title: A Collaborative Study to Determine the Generalizability of Professionalism Deficiency during Medical School as a Predictor for Subsequent Disciplinary Action by a State Medical Board

UCSF received an NBME Stemmler Medical Education Research Fund award in June 2002. Our objectives in this pilot study were to determine whether there were variables in medical school performance that were predictive of subsequent disciplinary action by a state medical board. We also wished to determine what happened to medical students who displayed unprofessional behavior in medical school. Lastly, we hoped to validate our existing professionalism evaluation system by providing outcomes on those students identified as having deficiencies in professionalism under this system. To this end, we conducted a case-control study of all UCSF physician graduates disciplined by the Medical Board of California from 1990-2000 (n = 68). Control graduates (n = 196) were matched by medical school graduation year and specialty choice. We concluded that problematic behavior in medical school, but not the more traditional measures of medical school performance, is associated with subsequent disciplinary action by the state medical board. This finding adds validity to the assessment of professionalism in medical school as well as to UCSF's professionalism evaluation system. We now wish to determine the generalizability of our findings.

The hypothesis of this study is that physicians disciplined by a state medical board demonstrated unprofessional behavior while in medical school.

The objectives of this study are:

  1. To determine whether unprofessional behavior in medical school predicts disciplinary action by a state medical board in a national sample.
  2. To test the fit of the model derived from the UCSF pilot study at two other institutions.

We propose to perform a case-control study (n = 500), similar in design to the UCSF pilot study, in graduates of the University of Michigan Medical School and the Jefferson Medical College. An exceptional group of collaborators (Drs. David Stern and Susan Rattner, and her colleagues who work with the Jefferson Longitudinal Tracking System) has committed to this research effort. Data will be abstracted at the participating institutions as well as at UCSF, where the data analyses will occur.

The results of this generalizability study will add validity to the assessment of professionalism and to the practice that attainment of professionalism must occur for a student to graduate from medical school. Medical school promotions committees will have outcome data on which to make decisions related to student promotion and unprofessional behavior. Lastly, our results will provide data to medical school admissions committees about specific personal and professional characteristics that must be balanced against traditional markers of achievement in applicants.

2001–2002 Grantees

Call For Proposals

University of California, San Francisco

Principal Investigator: Maxine A. Papadakis
Grant Amount/ Duration: $69,889.00/ 1 year
Project Title: Case-Control Study of Professionalism Problems in Medical School as a Risk Factor for Physician Discipline

Abstract
Our previous work has focused on the evaluation of professionalism deficiencies in medical students. We have developed criteria for the evaluation of professionalism and have developed an approach to this domain. We now hypothesize that lack of professionalism in medical school is associated with related deficiencies in medical practice. To our knowledge, there are no studies that examine performance characteristics in medical students that predict discipline by medical boards, or how physicians disciplined by a state medical board performed while in medical school.

We have developed an exciting working relationship with the Medical Board of California that will permit us to test our hypothesis. Specifically, we ask to study medical student performance related to professionalism, as well as grades and test scores, and we will compare those to disciplinary actions by the state medical board. If we find that problems with professionalism in medical school correlate with subsequent disciplinary action by the medical board, it would support the work of educators to concentrate remediation efforts on those students and to understand the consequences of unsuccessful remediation. Medical school promotions committees would also have outcome data on which to make decisions related to student promotion and unprofessional behavior.

The hypothesis of this study is that physicians disciplined by a state medical board demonstrated unprofessional behavior while in medical school.

The objectives are to determine:

  1. Whether unprofessional behavior in medical school predicts disciplinary action by the Medical Board of California
  2. The variables in medical school performance predictive of disciplinary action by the Medical Board of California

We propose to perform a blinded, case-control study of all UCSF School of Medicine graduates who have been disciplined by the Medical Board of California since 1990 (n=70). Controls will be UCSF School of Medicine graduates matched to cases within one year of graduation and specialty (n=210). For objective #1, we will abstract all negative excerpts about students' professional attributes from course evaluations and the Dean's letter of application to residency. The negative excerpts (e.g. "resistant to constructive feedback", "needs reminders to fulfill ward responsibilities") will be assigned to one of five categories: 1) Good (no negative comments); 2) Trace (occasional minor negative comment); 3) Concern (problematic comments from a course); 4) Problem (problematic comments from two or more courses); and 5) Extreme ("society must be protected from this student"). For objective #2, we will examine the medical student variables of year and age at graduation, gender, undergraduate grade point average, MCAT scores, grades in required medical school courses, NBME Part 1 scores, and presence of academic probation.

The association between the predictor and outcome variable (board disciplinary action) will be analyzed by a Wilcoxon rank-sum test. The predictors of disciplinary action will be determined by logistic regression analyses. Results will be described as odds ratios for each independent variable. Once this study is completed, we plan to validate the predictor variables derived at UCSF at additional California medical schools.
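As a rough illustration of how an odds ratio summarizes a case-control comparison such as the one proposed here, the sketch below computes an unadjusted odds ratio and a Wald 95% confidence interval from a 2x2 table. The counts are hypothetical and are not study results, and the function names are invented for this example. A matched design like the one described would normally call for a regression model that accounts for the matching; the unmatched calculation is shown only because it makes the arithmetic visible.

```python
import math


def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Unadjusted odds ratio from a 2x2 case-control table."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


def wald_confidence_interval(a, b, c, d, z=1.96):
    """Wald interval on the log-odds scale; z=1.96 gives roughly 95% coverage."""
    log_or = math.log(odds_ratio(a, b, c, d))
    standard_error = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or - z * standard_error), math.exp(log_or + z * standard_error)


# Hypothetical counts (assumption, not data from the study):
# 70 cases, 20 with a professionalism concern; 210 controls, 20 with a concern.
estimate = odds_ratio(20, 50, 20, 190)
lower, upper = wald_confidence_interval(20, 50, 20, 190)
print(f"OR = {estimate:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

An odds ratio above 1 with an interval that excludes 1 would indicate that the predictor is more common among disciplined graduates than among controls.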

University of Montreal

Principal Investigator: Bernard Charlin, MD
Grant Amount/ Duration: $31,482.00/ 1 year
Project Title: The effect of variability of answers among criterion experts to detect expertise with a SC test

Abstract
The Script Concordance Test (SCT) is a new tool of clinical reasoning assessment. It has a rich context in a case-based format. Items are made from the questions and actions physicians actually ask and make in clinical practice. The test probes organization of knowledge through requests to interpret data presented in the context of authentic clinical tasks. Inferences are made from examinee scores about the degree of knowledge elaboration required to successfully address problems in the assessed domain. The tool uses an aggregate scoring method that reflects the response variability experts demonstrate when they reason in clinical situations. Scores on each item are derived from the answers given by a criterion group of experts. The meaning of the variation of criterion experts' answers to items as a way to detect clinical expertise is an important research issue. No research has been conducted to formally study what amount of variability optimizes the discriminative power of items and the discriminative power of the test as a whole.
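The aggregate scoring idea can be sketched as follows. This is a minimal illustration that assumes the commonly described convention in which an answer earns credit equal to the number of criterion experts who chose it, divided by the number who chose the most popular (modal) answer. The abstract does not spell out the formula, so the convention, the function names, and the panel data below are all assumptions made for illustration.

```python
from collections import Counter


def build_scoring_key(expert_answers):
    """Map each answer chosen by the panel to a credit between 0 and 1."""
    counts = Counter(expert_answers)
    modal_count = max(counts.values())
    # The modal answer earns full credit; other answers earn partial credit.
    return {answer: n / modal_count for answer, n in counts.items()}


def score_item(scoring_key, examinee_answer):
    """Answers that no expert selected earn no credit."""
    return scoring_key.get(examinee_answer, 0.0)


# Hypothetical panel of 10 experts rating one item on a -2..+2 scale.
panel = [1, 1, 1, 1, 1, 0, 0, 0, 2, -1]
key = build_scoring_key(panel)

print(score_item(key, 1))   # modal answer
print(score_item(key, 0))   # minority answer chosen by three experts
print(score_item(key, -2))  # answer no expert chose
```

Under this convention an item on which every expert agrees behaves like a single-best-answer question, while an item with widely spread expert answers gives partial credit to many responses, which is why the amount of expert variability could affect how well an item discriminates.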

The project will be conducted in the family medicine domain. It will involve four phases: (1) test construction and validation of items; (2) item selection and answer key generation; (3) production of a final test, according to item variability; (4) administration of the final test to two contrasting groups: clerkship students and experienced physicians.

We will study whether the variation in experts' answers used to establish the aggregate scoring key influences the ability of items (and of the test) to differentiate levels of expertise. A coefficient of expert variability will be computed for each item. The discriminative power of items will be operationalized as the effect size of the item. A scattergram of effect size (y axis) against variability coefficients (x axis) will expose the relationship between the two variables. We expect a curvilinear (inverted-U) relationship between item discrimination indexes and expert answer variability, with maximum discrimination for items in the middle range of expert answer variability.

The scores on the global test (sum of the 90 items) for both groups will be compared with a simple t test. To further study the effect of item variability on the discriminative power of the test, a two-way analysis of variance will be used, with group as a between-subject factor and variability of items (according to the four groups of variability) as a within-subject factor. A group effect is expected across all variability categories, and an interaction effect is also expected: discrimination between groups will depend on the variability category, with medium variability leading to higher discrimination than the extreme categories.

Expected outcomes of the project are:

  • To provide evidence about the adequacy of theories that underpin the test; notably the relationship between variability of answers in professional tasks and its importance to detect expertise in assessment situations.
  • To provide guidelines for item selection in script concordance test construction.

University of Kentucky Research Foundation

Principal Investigator: Charles H. Griffith III, PhD
Grant Amount/ Duration: $68,844.00/ 1.5 years
Project Title: Understanding the Association of Teaching and Learning: A National Study of Moderating Variables

Abstract

Project Objectives
The purpose of this project is to extend the breadth and depth of our previous work documenting an association between better clinical teaching and enhanced student learning. Our specific project objectives are: 1) To extend the generalizability of our findings that better clinical teaching is associated with enhanced student learning; 2) To extend the generalizability of our approach to identifying those high-quality teachers who are associated with better student outcomes; 3) To understand what contextual variables (e.g. clerkship structure, nature of teacher-student interaction, setting, etc) are most important for teaching to be associated with better student outcomes.

Rationale
For the most part, the fundamental outcome of teaching has been left unstudied: that is, does the quality of teaching actually influence student learning? For our recent NBME supported project, we reported the first documentation that medical students in their internal medicine clinical clerkship who work with one of the "best" clinical teachers score significantly higher on post-clerkship examinations and on USMLE II, controlling for prior student academic achievement. The major limitation of our previous work is that it represents teaching and learning from a single site and our local clerkship structure. Can learner outcomes be linked to individual teachers across institutions? And further, is our approach to classifying the "best" teachers generalizable to other sites? If we are to suggest applications and implications of our findings, we need to document the exportability and generalizability of our method of identifying "best" teachers. And even further, our studies have identified a measurable link between teaching and learning, but our methodology has not allowed us to identify crucial contextual and environmental factors which must be in place for teaching to prove influential. For example, will teaching prove to be associated with better learning outcomes in other clerkship structures, with other attending-student interactions?

Methods
The first phase of our study will be to extend the generalizability of our approach to identifying "best" clinical teachers across our 19 collaborating institutions. Our approach involves convening a consensus panel of residents who had formerly been students at these institutions, with these residents classifying clinical faculty at their institution as "best", "medium" or "low" in their clinical teaching ability. Our outcomes of interest will be students' scores on post-clerkship NBME examinations and on USMLE II. A large data set will be assembled on approximately 2600 students and 1300 faculty from these institutions, noting student characteristics, characteristics of their clinical teachers, the teaching category of the attendings they worked with, and institution-specific contextual variables (structure of clerkships, settings, etc.). Regression approaches will allow us to document the association of attending physician teaching "category" with the outcomes of their students, controlling for prior academic performance (USMLE I score) and other variables. In addition, we will be able to identify which contextual characteristics (e.g. clerkship structure) are associated with enhanced student performance.

Significance
Our study will provide support for using learner outcomes as a measure of teaching ability. For example, learner outcomes may be important additions to teaching portfolios and promotion and tenure dossiers for clinician-educators. Our methodology could be extended into the faculty development literature, with learner outcomes serving as a marker for the effectiveness of faculty development programs. More importantly, our findings will establish the generalizability of our previous work and document quantitatively the critical importance of the educational mission: in an era of increased cost accountability, student learning would be jeopardized if the educational mission is compromised by the current changes in academic medical centers.

University of Missouri-Kansas City School of Medicine

Principal Investigators: Louise E. Arnold, PhD; David T. Stern, MD, PhD
Grant Amount/ Duration: $69,979.00/ 1 year
Project Title: Towards Assessing Professional Behaviors of Medical Students through Peer Observation

Abstract

Rationale
Medical educators have made significant progress in the reliable and valid assessment of medical students' knowledge and clinical skills. However, knowledge and clinical skills alone do not a physician make. Professional behavior is the essence of physician-hood. Although there is agreement on the definition of professional values, the assessment of professional behaviors remains problematic. However, peers offer a unique perspective on fellow students' professional behaviors. They are routinely in positions to make observations that are often not accessible to faculty and resident supervisors. Preliminary studies of peer evaluation have shown their potential for reliable and valid assessment. Yet, some peers are reluctant to participate; and their reticence may compromise the reliability, validity, and utility of their assessments. On the other hand, if we can understand what behaviors peers observe in each other, how they judge these observations, and what actions they take in light of their judgments, then we can devise a system where peer observations can contribute to more reliable and valid evaluation of professionalism.

Objective
This year-long research project will lay the foundation for the incorporation of peer observations into the larger context of the assessment of professionalism. At the completion of this research, we will have designed one or more empirically-based systems for peer observation of medical students' professional behaviors. Elements of the system(s) will include: 1) examples of medical students' professional and non-professional behaviors that peers commonly observe, 2) students' perceived responsibilities for reporting the behaviors they observe, and 3) students' perspectives on the acceptable and unacceptable conditions for reporting their observations, including the uses that peer observations would serve.

Contribution to Assessment
Although there is empirical data to show that peer evaluations have some degree of reliability and validity, medical students' perspectives on peer assessment have been ignored in the development of peer evaluation instruments and the conditions under which the instruments have been administered. Without student input into the process of peer evaluation, the likelihood of increasing the reliability and validity of their assessment of medical students' professional behavior will be compromised. Thus, on the way to producing psychometrically acceptable methods and protocols for evaluating professional behavior, we pause to learn about medical students' views about peer observation and peer evaluation of professionalism.

Activities
Clerkship students at two medical schools will be invited to participate in 12 in-depth focus groups. The discussions will (and can, according to preliminary focus groups) elicit students' observations of peer professional behavior, their sense of responsibility for providing feedback, and the conditions under which they would feel such observations could be incorporated into acceptable peer feedback. Transcriptions of the focus groups will be subjected to qualitative analysis, using the principles of grounded theory. The reliability of coding will be assessed, and the opinions of student participants will be sought to verify the investigators' interpretations of the focus group material. Based on the analysis of these initial focus groups, one or more systems for observing, reporting, and using peer observations will be constructed in detail. Perceptions and opinions of these systems will be sought via a survey administered to all clerks at the two schools. Responses to the survey will be used to generate a final set of recommendations characterizing the most acceptable system(s) for peer observations of professional behaviors among students. A key next step -- ascertaining the reliability and the external validity of the system(s) -- would await future funding.

Stanford University

Principal Investigator: Parvati Dev, PhD
Grant Amount/ Duration: $70,000.00/ 2 years
Project Title: Objective Assessment of Physical Examination Skills Using Simulators: Biocomputational Methods in Analysis of Electronic Performance Data

Abstract
We have developed a method of instrumenting teaching-mannequins such that physical examination performance can be captured and measured during clinical simulations. In our initial evaluation of the E-Pelvis, a pelvic examination simulator, we found that meaningful measures of performance could be extracted from large volumes of electronic performance data generated during simulated clinical examinations. In addition, we have also found that capture and analysis of performance data collected