
Medical Education
Graduate Medical Education

Medical Knowledge


Multiple-choice question (MCQ) examinations of medical knowledge are available from specialty boards, professional societies, and program director organizations. In many cases they are required as in-service examinations to track residents’ progress and set the stage for written board certification examinations. Although MCQs provide valid assessments of medical knowledge, oral examinations and presentations of clinical cases are considered better measures of clinical reasoning.

Reliability and Validity

Generally, MCQs are developed to achieve high internal consistency reliability for their subscales. Content validity is maximized by ensuring that the examination items sample the full range of core knowledge. Construct validity is demonstrated when higher-level residents achieve a greater percentage correct than lower-level residents. Although performance on one MCQ examination tends to predict performance on subsequent ones, evidence is mixed on whether MCQ performance correlates with other aspects of medical knowledge, such as clinical reasoning.
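
As an illustration of what internal consistency reliability means for items scored right or wrong, the sketch below computes the Kuder-Richardson Formula 20 (KR-20), one common index for dichotomous items. The text above does not say which index examination developers use, so the choice of KR-20, the function name kr20, and the use of population variance for total scores are all assumptions made for this example.

```python
from statistics import pvariance


def kr20(responses):
    """Kuder-Richardson Formula 20 for dichotomously scored items.

    responses: list of rows, one per examinee; each row is a list of
    0/1 item scores (1 = correct). Assumes every row has the same length.
    """
    n_people = len(responses)
    n_items = len(responses[0])
    if n_items < 2 or n_people < 2:
        raise ValueError("need at least two items and two examinees")

    # Variance of examinees' total scores (population variance: an assumption).
    totals = [sum(row) for row in responses]
    total_var = pvariance(totals)
    if total_var == 0:
        raise ValueError("total scores have no variance; KR-20 is undefined")

    # Sum over items of p * (1 - p), where p is the proportion answering correctly.
    pq_sum = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_people
        pq_sum += p * (1 - p)

    return (n_items / (n_items - 1)) * (1 - pq_sum / total_var)
```

Values closer to 1 indicate that the items in a subscale tend to rank examinees consistently. A real examination would involve far more items and examinees than a toy data set.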


  • Timing: Usually once per year
  • Who Performs: Generally a secure examination, administered by staff according to the guidelines of the in-service examination.
  • Format: Each item contains an introductory statement or ‘stem’ followed by four or five response options, only one of which is correct. The stem usually presents a patient case, clinical findings, or graphically displayed data. A typical half-day examination has 175 to 250 test questions.
  • Scoring Criteria and Training: Completed exams are generally returned to the organization that provides the test for scoring. Score reports can include raw % correct, scores standardized for PGY level, and subscores in key content areas.
  • Documentation: Achievement of the medical knowledge competency must be documented at least twice a year, at the semi-annual review meetings. MCQ performance can inform one of the meetings, and other knowledge assessments (global evaluations, assessments of clinical reasoning, progress toward reading goals from last year’s MCQ) can inform both meetings.
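
To make "scores standardized for PGY level" concrete, here is a minimal sketch that converts raw percentage-correct scores into z-scores within each PGY cohort. It is an assumption-laden illustration: testing organizations typically standardize against their own national norm groups and may use a different scale, and the function name standardize_by_pgy and the record layout are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev


def standardize_by_pgy(records):
    """Return {resident_id: z-score} computed within each PGY level.

    records: iterable of (resident_id, pgy_level, percent_correct) tuples.
    Residents in a level with fewer than two scores, or with no spread
    in scores, map to None because a z-score is undefined there.
    """
    records = list(records)

    # Group raw scores by PGY level.
    by_level = defaultdict(list)
    for _, pgy, pct in records:
        by_level[pgy].append(pct)

    # Local norms (mean, sample standard deviation) for each usable level.
    norms = {}
    for pgy, scores in by_level.items():
        if len(scores) >= 2 and stdev(scores) > 0:
            norms[pgy] = (mean(scores), stdev(scores))

    z_scores = {}
    for resident, pgy, pct in records:
        if pgy in norms:
            level_mean, level_sd = norms[pgy]
            z_scores[resident] = (pct - level_mean) / level_sd
        else:
            z_scores[resident] = None
    return z_scores
```

A z-score of 0 means the resident scored at the mean for their own PGY level, which separates individual standing from the expected rise in raw scores across training years.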

Uses of the Data

  • Comparing scores on in-training examinations with national statistics can identify the strengths and limitations of individual residents and help them improve.
  • Summative Decisions: MCQ performance falling short of a minimum passing threshold could delay or prevent a resident from advancing or graduating. Generally, however, such decisions should be based on overall assessments of medical knowledge including clinical reasoning.
  • Remediation Threshold: Programs should communicate what level of MCQ performance would require remediation. The threshold for remediation may be determined by a national or local standard for passing performance or by a score that portends difficulty passing the board certification examination. Generally, a specific program of study would be established to close gaps in knowledge, and progress would be assessed in the short term using written or oral examinations. [N.B. Because in-service examinations are often administered only once annually, programs may need to rely on other measures of progress.]
  • Comparing test results aggregated for residents in each year of a program can help identify residency training experiences that might be improved.
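
Two of the uses above, flagging residents against a remediation threshold and aggregating results by training year, can be sketched in a few lines. Everything named here is hypothetical: the 60% cutoff is a placeholder, not a recommended standard, and each program would substitute its own threshold and content areas.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical cutoff in percent correct; each program sets its own value.
REMEDIATION_THRESHOLD = 60.0


def flag_for_remediation(scores, threshold=REMEDIATION_THRESHOLD):
    """Return a sorted list of resident IDs scoring below the threshold.

    scores: dict mapping resident_id -> overall percent correct.
    """
    return sorted(rid for rid, pct in scores.items() if pct < threshold)


def mean_subscores_by_year(records):
    """Average content-area subscores for each PGY level.

    records: iterable of (pgy_level, content_area, percent_correct) tuples.
    Returns {(pgy_level, content_area): mean percent correct}.
    """
    grouped = defaultdict(list)
    for pgy, area, pct in records:
        grouped[(pgy, area)].append(pct)
    return {key: mean(values) for key, values in grouped.items()}
```

A content area whose mean stays low across successive PGY levels may point to a training experience worth reviewing, whereas a single flagged resident calls for an individual study plan. As noted above, a flag from one examination would normally be weighed alongside other assessments of medical knowledge.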