As the public and state and federal agencies have become more sophisticated consumers of testing services, they have increasingly sought documented evidence of test validity. It is highly desirable for test sponsors to examine carefully the decisions that will be made based upon test results and to determine the types of data that will be required to substantiate the validity of their examination program.
In assisting clients in analyzing and documenting the validity of their examination program, our staff:
In the health professions, achievement testing is used for a variety of purposes:
Content validity is one of the most essential types of validity for all achievement testing. That is, the content areas and competencies covered must be consistent with the purpose of the test and must adequately sample the universe of knowledge of the discipline. It is not always possible to cover all of the topic areas relevant to professional education and practice. Therefore, it is crucial to have mechanisms in place to set priorities and to assure that the most important topic areas are sampled appropriately.
The NBME helps clients assure the content validity of their examination programs and prepares appropriate documentation for review by external agencies. The NBME encourages clients to review periodically the content validity of their examination and to conduct studies to update content specifications as needed.
Specifications regarding test content and evaluation objectives are used by item writers and test committees to develop the appropriate test content. These specifications are also useful for monitoring test content to assure consistent sampling from year to year.
The NBME assists client organizations in the following:
As professional education and practice evolve, there needs to be ongoing assurance that examination content is reviewed and updated appropriately. The NBME has a long history of researching and evaluating technologies for use in assessment. Interpretation of motion studies has become an important skill for many practicing physicians. Still and motion video case presentations can provide an innovative and valid assessment of the specific competencies involved in interpreting motion studies for certification or assessment examinations. Video case presentations can be used as alternatives to, or in conjunction with, paper-and-pencil multiple-choice questions. The NBME is also incorporating still and motion video in computer-based simulations.
Content specifications can be described in a variety of ways and often involve multiple dimensions. The NBME offers computerized, three-dimensional graphics which facilitate the review and analysis of content distribution. Our staff and clients have found these graphic capabilities to be particularly useful for--
Some testing methods afford more effective and efficient testing of certain content areas and professional competencies. We carefully analyze content specifications and evaluation objectives and make recommendations to the client regarding the most appropriate testing methods. Where several methods are available, the advantages and disadvantages of each are presented for the client's consideration.
Clients sometimes find that existing test item formats or testing methods do not provide fully satisfactory means of assessing certain evaluation objectives. Such situations often lead to the development of innovative formats. Our staff is sensitive to such opportunities for innovation and is prepared to help clients overcome deficits or limitations in available testing technology. Assistance is provided not only in the development of experimental formats but also in the design and implementation of field studies to assess their potential usefulness.
The NBME encourages test item writers and test committees to experiment with new and improved methods of testing content and evaluation objectives. We use experimental sections for field testing innovative formats while avoiding the use of untried items in calculating official scores of examinees.
This approach offers a cost-effective alternative to separate field testing and allows clients to pretest new items on the most appropriate examinee population. We provide guidance in helping clients determine the feasibility and appropriateness of the experimental sections for their own examination programs.
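The separation between scored and experimental items described above can be sketched in a few lines of Python. This is a minimal illustration only, not the NBME's scoring system: the `Item` class, the `experimental` flag, and the item identifiers are all hypothetical names introduced for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """Hypothetical test item: an identifier, a keyed answer, and a pretest flag."""
    item_id: str
    key: str
    experimental: bool = False


def score_examinee(items, responses):
    """Return (official_score, experimental_results).

    Experimental items are recorded for later analysis but never
    contribute to the official score.
    """
    official = 0
    experimental = {}
    for item in items:
        correct = responses.get(item.item_id) == item.key
        if item.experimental:
            experimental[item.item_id] = correct
        elif correct:
            official += 1
    return official, experimental


# Illustrative data: two scored items and one experimental item.
items = [
    Item("A1", "B"),
    Item("A2", "D"),
    Item("X1", "C", experimental=True),
]
responses = {"A1": "B", "A2": "A", "X1": "C"}
official, pretest = score_examinee(items, responses)
```

In this sketch the examinee answers the experimental item correctly, yet only the one correct scored item counts toward the official result; the experimental response is kept separately so that item statistics can still be gathered.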
Well-developed test content specifications are an extremely important component of a valid testing program. These specifications can be achieved effectively only if the individuals developing the test materials are skillful test item writers. We have found that providing test item writers with a systematic training program not only improves their productivity, but also enhances the quality of the test materials they develop.
Test item writers will interact with and learn from one another, and they also have an opportunity to gain valuable insights and advice from more experienced individuals. An integral part of the training process is the use of a written manual on item writing which is tailored to the specific needs of each client group.
We design and implement training sessions in accordance with the client's particular needs and scheduling preferences. Arrangements can be made to train test item writers in conjunction with other regularly scheduled activities, thus maximizing cost-effectiveness.
Test items that are received by the NBME for review are classified in accordance with the client's content specifications. Periodic comparisons are made between the items received and those required for production of the next examination. The client is advised immediately of any anticipated problems in meeting test content specifications so appropriate steps can be taken to assure the development of needed test materials.
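The periodic comparison between items received and items required can be illustrated with a small Python sketch. It assumes, purely for the example, that content specifications reduce to a required item count per category; the category names and the `content_shortfall` function are hypothetical.

```python
from collections import Counter


def content_shortfall(required, received_categories):
    """Return the number of items still needed in each under-supplied category.

    required: mapping of content category -> number of items needed
    received_categories: one category label per item received so far
    """
    received = Counter(received_categories)
    return {
        category: needed - received[category]
        for category, needed in required.items()
        if received[category] < needed
    }


# Illustrative specification and submissions (hypothetical categories).
required = {"cardiology": 3, "pharmacology": 2, "anatomy": 2}
received = ["cardiology", "anatomy", "anatomy", "cardiology", "pharmacology"]
shortfall = content_shortfall(required, received)
```

Here `shortfall` lists one more cardiology item and one more pharmacology item as outstanding, while anatomy is already covered. A report of this kind is what would let a client be advised early of gaps against the specifications.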
Test items sent to the NBME undergo careful review to assure that the content and format are appropriate, that the language is clear, and that the items contain no item-writing flaws. Further critical review by representatives of the client organization is undertaken in accordance with the client's policies and procedures.
The development of high quality test materials represents a significant investment of human and financial resources. Previously used test materials with known statistical characteristics are an especially valuable resource. Such material is essential for assuring the equivalence of examination content from year to year, the stability of pass-fail standards, and the cost-effectiveness of the test development process. For most clients, we maintain a computerized bank of test items and assist clients in developing an appropriate selection algorithm for the retrieval and reuse of previously administered test items.
The computerized test item libraries provide a secure bank of the client's test items to which only the client and authorized NBME staff have access. Each bank has its own classification system which corresponds to the program's content specifications. As new items are reviewed and approved for inclusion in the client's examination, classification codes are assigned. Items are then entered by NBME staff into the computer bank for subsequent retrieval, review, and reuse.
Once an item has appeared in the client's examination, the item statistics and the date on which the item was administered are appended to the test item in the computerized test item library. These data facilitate selective review of previously used test items for possible inclusion in future test administrations. Items can be selected according to content, date of last use, level of difficulty, and degree of discrimination. Printouts from the computerized libraries display not only the text of the item but also the history of its use and the accompanying item statistics.
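Selection by content, date of last use, difficulty, and discrimination might look like the following Python sketch. The record layout, the field names, and the threshold values are assumptions made for illustration; the actual structure of the item library is not described here. Difficulty is taken to be the proportion of examinees answering correctly, which is one common convention.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass
class BankItem:
    """Hypothetical item-library record; statistics are None until first use."""
    item_id: str
    content_code: str
    last_used: Optional[date] = None
    difficulty: Optional[float] = None       # assumed: proportion correct
    discrimination: Optional[float] = None   # assumed: higher is better


def select_for_reuse(
    bank: List[BankItem],
    content_code: str,
    used_before: date,
    difficulty_range: Tuple[float, float],
    min_discrimination: float,
) -> List[str]:
    """Return IDs of previously used items that meet every criterion."""
    low, high = difficulty_range
    chosen = []
    for item in bank:
        # Items that have never been administered have no statistics to filter on.
        if item.last_used is None or item.difficulty is None or item.discrimination is None:
            continue
        if (
            item.content_code == content_code
            and item.last_used < used_before
            and low <= item.difficulty <= high
            and item.discrimination >= min_discrimination
        ):
            chosen.append(item.item_id)
    return chosen


# Illustrative bank with made-up codes, dates, and statistics.
bank = [
    BankItem("Q1", "CARD", date(1990, 6, 1), 0.72, 0.31),
    BankItem("Q2", "CARD", date(1993, 6, 1), 0.65, 0.28),   # used too recently
    BankItem("Q3", "CARD", date(1989, 6, 1), 0.95, 0.10),   # too easy, weak discrimination
    BankItem("Q4", "PHARM", date(1989, 6, 1), 0.60, 0.35),  # different content area
    BankItem("Q5", "CARD"),                                 # new item, no history yet
]
eligible = select_for_reuse(bank, "CARD", date(1992, 1, 1), (0.30, 0.90), 0.20)
```

Only `Q1` satisfies all four criteria in this example. In practice the thresholds would come from the selection algorithm agreed with each client.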
Composing the examination directly on the computer from the item library not only improves quality control but also reduces the time and costs involved in the composition and proofreading phases of test production. In addition, a scoring key can be transferred automatically to the NBME computerized examination processing system which is used to score and analyze the examination.
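As a rough illustration of how a scoring key can follow directly from the composed examination, the Python sketch below builds a key from the order of items on a form and then uses it to compute one simple item statistic. The library layout, the function names, and the choice of proportion correct as the difficulty index are assumptions for the example, not a description of the NBME processing system.

```python
def build_scoring_key(form_item_ids, library):
    """Map each form position (1-based) to the keyed answer stored with the item."""
    return {
        position: library[item_id]
        for position, item_id in enumerate(form_item_ids, start=1)
    }


def proportion_correct(key, answer_sheets):
    """Share of examinees answering each position correctly."""
    total = len(answer_sheets)
    return {
        position: sum(sheet.get(position) == answer for sheet in answer_sheets) / total
        for position, answer in key.items()
    }


# Hypothetical library (item ID -> keyed answer) and a three-item form.
library = {"Q1": "B", "Q7": "D", "Q9": "A"}
key = build_scoring_key(["Q7", "Q1", "Q9"], library)

# Four illustrative answer sheets, indexed by form position.
sheets = [
    {1: "D", 2: "B", 3: "C"},
    {1: "D", 2: "A", 3: "A"},
    {1: "C", 2: "A", 3: "A"},
    {1: "D", 2: "B", 3: "A"},
]
difficulty = proportion_correct(key, sheets)
```

Because the key is derived from the same records used to compose the form, there is no separate transcription step in which a keying error could be introduced.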