NBME logo
Date Updated: November 9, 2011

Health System Reform Policies Appendix

Measures of competence and performance

A wide range of tools is available for the assessment of competence. Since the early years of assessment, there has been a strong focus on assuring mastery of an appropriate body of knowledge; this is, after all, a definitional hallmark of any profession. Due to concerns with reproducibility of scores, essays and oral exams as elements of the medical licensing assessment program began to be replaced almost 60 years ago by multiple-choice exams, which now form the backbone of current competence assessment. A second focus has been on clinical skills, tested originally at the bedside of patients and more recently in a more standardized format utilizing simulated patients. This allows for testing fundamental skills relevant to history taking, physical examination, overall diagnostic assessment, and treatment plan formulation. It also provides the opportunity to test behaviors and skills relevant to communication, interpersonal interactions, professionalism, and cultural sensitivity. In addition, the ability to manage clinical situations has been assessed using various simulation devices and systems.

These approaches have the considerable advantages of standardization and of scores that are highly reliable. On the other hand, they all involve testing in artificial settings remote from the field of practice and, therefore, measure theoretical potential to perform in a real setting rather than actual performance in that setting.

Measurement of actual performance requires collection of workplace data from the clinician's practice that speak to such issues as diagnosis, clinical decision making and management, procedural skills, and behaviors appropriate for effective interaction with other healthcare providers and with patients and their families and caregivers. Relevant methodologies for assessment of performance include multi-source feedback, peer review, and a variety of measures of processes of care and outcomes that, in comparison with standardized approaches used for measuring competence, are still relatively poorly developed.

There are several challenges associated with performance measurement. The first is simply that of obtaining the relevant data set from an often-chaotic workplace that does not have ready mechanisms for data capture. Where data capture is possible, data are fragmented across many sites and systems. Mechanisms for aggregating the clinical data required for assessment of care processes and outcomes are not yet adequately developed to produce meaningful global measures. Healthcare has been slow to adopt electronic medical records (EMRs), and the systems in place are generally inflexible and poorly positioned to provide on-demand data sets tailored to the individual physician. Having an observer looking over each physician's shoulder on a regular basis is unrealistic, and it is highly inefficient for physicians to spend time collecting, recording, and aggregating the required primary data.

Another limitation of measurement of performance is that the aggregate experience with analysis of such data sets remains limited, and interpretation is often problematic. Examples include the relative importance of different process-of-care measures and confusion around attribution (i.e., responsibility for the outcome in question) of the measure to individual clinicians. Another issue is how to aggregate performance for a given physician across widely discrepant measures. The cost of building systems that would solve these issues is unknown, but likely to be large. Nevertheless, the focus of health system reform on the implementation of technology-based record systems holds the promise of supporting progress in the meaningful assessment of performance.

In the real setting, each clinician's practice is in some respect unique, even within the same specialty and subspecialty. It is consequently harder to standardize measures and provide appropriate benchmarks among physicians. One possibility is to provide repeated measurement feedback that compares the clinician's current performance with his or her own past performance. Another approach involves comparing each clinician's conformance with widely accepted performance norms, although these are currently limited in number and of uncertain relevance to clinical outcomes. It is also possible to batch together performance data for groups of physicians with substantially similar practice profiles. However, apart from the obvious challenges of aggregating data at a regional or national level across widely differing systems and practice environments, there is a probability of bias due to unrecognized differences in practice locations, in practice population characteristics and severity of illness, and in the health systems involved. In contrast, measurement of competence using simulations provides a relatively straightforward and readily controllable means of standardizing assessments. Direct comparison is thus possible with all others who complete the same assessment, whether at the same or different times and at the same or different locations.
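The two comparison strategies described above (against a clinician's own past performance, and against a group with similar practice profiles) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names, the choice of a simple mean baseline and a z-score, and the sample numbers are hypothetical and are not drawn from any NBME method or data set. The sketch also makes no adjustment for case mix or severity of illness, which is exactly the source of bias noted above.

```python
from statistics import mean, stdev


def change_from_baseline(history):
    """Latest measurement minus the mean of the clinician's earlier ones."""
    if len(history) < 2:
        raise ValueError("need at least one earlier and one current measurement")
    *earlier, latest = history
    return latest - mean(earlier)


def peer_z_score(value, peer_values):
    """Distance of one value from the peer-group mean, in sample standard deviations."""
    if len(peer_values) < 2:
        raise ValueError("need at least two peer values")
    spread = stdev(peer_values)
    if spread == 0:
        return 0.0  # all peers identical; no meaningful spread to scale by
    return (value - mean(peer_values)) / spread


# Hypothetical process-of-care adherence rates (fractions), for illustration only.
own_history = [0.70, 0.74, 0.72, 0.80]      # one clinician, oldest to newest
peer_group = [0.65, 0.75, 0.70, 0.80, 0.60]  # clinicians with similar practice profiles

delta = change_from_baseline(own_history)
z = peer_z_score(own_history[-1], peer_group)
print(f"change from own baseline: {delta:+.2f}")
print(f"z-score against peer group: {z:+.2f}")
```

A self-referenced change avoids the need for an external benchmark, while the peer comparison is only as trustworthy as the similarity of the practices being grouped.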

There is an emerging body of evidence that competence measures may correlate positively with performance [13], although the strength of this association is confounded by limitations in the current state of performance measurement and the few empirical studies addressing this issue. Where such positive correlation exists, straightforward tests may adequately predict performance or at least identify physicians for whom additional measures of actual performance are needed. It will be important to continue to improve the current measures of both competence and performance.

Most current systems of assessment in the health professions are associated with high-stakes decisions, including medical licensure, specialty certification, and clinical privileging (see section “Current system characteristics”). Physicians often devalue feedback from assessment or reject it outright, even where measures of performance in the workplace could potentially have great value as a baseline to drive improvement longitudinally. This suggests that performance measures must be shown to be accurate and tailored to the individual physician, and must provide feedback in a form that is “actionable,” provoking both learning and quality improvement. Eventually, routine monitoring of performance will provide an evidence base for continuous improvement and may reveal impending problems or deficiencies while they are still amenable to remediation, before they result in adverse patient outcomes or the need for disciplinary action.

13 Tamblyn R, Abrahamowicz M, Dauphinee WD, Hanley JA, Norcini J, Girard N, et al. Association between licensure examination scores and practice in primary care. JAMA 2002;288:3019-3026.

