Case Study Analyses
A case study is a qualitative research method that involves an in-depth examination of a single instance or case that represents important issues. This technique provides a systematic way of looking at events, collecting data, analyzing information, and reporting the results. Case study analysis can be used to evaluate a program by monitoring program participants to determine whether they can apply content from the program to their analysis of the case. Cases can be selected to be (a) illustrative of common issues, problems, or features; (b) exploratory, used when confronted with uncertainty about program operations, goals, and results in an attempt to identify salient issues; (c) critical incident cases, which illustrate important or unique situations; (d) program implementation cases, which determine whether the implementation of a program or process is consistent with the original intent; or (e) program effects cases, which determine the impact of programs and provide inferences about reasons for success or failure. For more information, see Yin, R. (2002). Case Study Research: Design and Methods (3rd ed.). Applied Social Research Methods Series, Vol. 5. Thousand Oaks, CA: Sage Publications.

Checklists
Checklists document the basic component parts of a target behavior. A checklist can include essential elements, desirable (but not required) elements, and, where appropriate, specific undesirable elements. Checklists can be used to assess the target behavior of individuals, teams, or organizations. Often, checklist items are recorded as Yes or No, meaning that the component part occurred or did not occur. Sometimes, depending on the types of behaviors being measured, partial credit can be given, such as when a component part is attempted but not successfully completed. More subjective aspects of performance are frequently assessed using rating scales (see below). For guidance on developing checklists, see the Checklist Development Checklist at:
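As a concrete illustration of Yes/No scoring with partial credit, the following minimal sketch computes a weighted checklist score. The item names, weights, and credit values are invented for this example and are not drawn from any particular instrument.

```python
def score_checklist(items):
    """items: list of (weight, credit) pairs; credit is 0, 0.5, or 1."""
    earned = sum(weight * credit for weight, credit in items)
    possible = sum(weight for weight, _ in items)
    return earned / possible

# Hypothetical observed encounter: essential items weighted 1.0,
# desirable items weighted 0.5; credit 0.5 = attempted but incomplete.
observed = [
    (1.0, 1),    # essential: explains purpose of the study -> performed
    (1.0, 0.5),  # essential: describes risks -> attempted, incomplete
    (0.5, 0),    # desirable: offers a written summary -> not performed
]
print(score_checklist(observed))  # fraction of available credit earned
```

Recording each item as a (weight, credit) pair keeps the scoring rule explicit, so evaluators can see exactly how partial credit and desirable-but-optional items affect the total.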

Compliance Reviews
Reviews of compliance are generally implemented to determine whether individuals or institutions are adhering to standards or regulations for the conduct of research. Such a review may establish whether the organization has an adequate infrastructure for evaluating compliance, including qualified individuals to perform the necessary tasks, or it may examine the processes described in documents explaining institutional procedures regarding compliance. The review could also involve periodic summaries that report instances of noncompliance or incidents involving compliance. Multiple data sources can be used in conducting a compliance review. All research universities should have offices and websites explaining expectations and processes for compliance.

Critical Incident Reports
Critical Incident Reports can be used to collect information about processes or behaviors that have critical significance and meet systematically defined criteria. These observations are recorded as incidents, which are then used to solve practical problems and develop broad psychological principles. Information can be collected from written records or databases, as well as through personal interviews or questionnaires. The information gathered by any of these methods is used to identify the strengths and weaknesses of team or organizational performance during unusual or important procedures (critical incidents). Organizational or team performance is assessed through analysis of the information provided by records, interviews, and/or survey data. More information can be found at:

Essay Questions
Essays are best used for evaluating higher-order cognitive tasks such as synthesis, problem solving, application of principles, and personal reflection. Because answering and scoring essay questions can be time consuming and difficult, educators generally use the essay only when other assessment methods are not valid measures. Essays can call for long or short answers and can be structured or unstructured. In the structured essay, learners are restricted in the form or scope of the response; the unstructured essay question sets no boundaries for the response. The challenge of the essay is in the grading or scoring. Two common approaches are the analytic method, also known as the point score method, and global scoring. In the point score method, the score reflects specific points that can be earned based on content, quality of analysis, and support of statements, among other criteria. In the global scoring method, the evaluator provides a holistic score, typically after developing a model answer for comparison. For more information, see Linn, R. L., & Gronlund, N. E. (2000). Measurement and Assessment in Teaching. Upper Saddle River, NJ: Prentice Hall; Chapter 32: Essay Questions and Variations, in Amin, Z., & Eng, K. H. (2003). Basics in Medical Education. River Edge, NJ: World Scientific; or check the following website:

Focus Groups
A focus group is a form of qualitative research in which a group of people is asked about their attitudes or opinions toward a product or service, such as a training program, a web-based resource, or interactions with an IRB office. Questions are asked in an interactive group setting where participants are free to talk with other group members. Focus groups allow interviewers to study people in a more natural setting than a one-to-one interview. Focus groups have high apparent validity (since the idea is easy to understand, the results are believable), are low in cost, produce results relatively quickly, and can increase the sample size of a report by gathering input from several people at once. However, focus groups also have disadvantages: the researcher has less control over a group than over a one-on-one interview, so time can be lost on issues irrelevant to the topic, and the data are difficult to analyze because responses are made in reaction to the comments of other group members. For more information, see Marshall, C., & Rossman, G. B. (1999). Designing Qualitative Research (3rd ed.). London: Sage Publications.

Observation
Observational ratings or checklists provide a source of information based on actions. Rather than asking individuals how they might behave in a certain situation, observational data provide information on how individuals actually behave in a given situation. The situations to be observed and quantified can occur in real-life settings such as offices or classrooms, can be the result of role plays, or can be simulated encounters. Several different kinds of information collection strategies can be used to capture observation-based data (see Checklists, Role-Playing, and Rating Scales). For more information about observational data collection, consult Chapter 31: Observations of Behavior and Sociometry, in Kerlinger, F. N. (1992). Foundations of Behavioral Research (3rd ed.). Fort Worth, TX: Harcourt Brace College; or Chapter 17: Observations, in Cohen, L., Manion, L., & Morrison, K. (2000). Research Methods in Education. New York: Routledge.

Pre- and Post-Tests
Knowledge can be measured using tests. Frequently, the same test is administered at the beginning of a program and then repeated at the end to determine the extent to which participants changed as a result of the program. Comparing test scores with program content can also identify strengths and weaknesses of the training program, especially when test results are aggregated across multiple offerings of the program to identify where the program was or was not successful in promoting changes in knowledge. Tests are most often administered in multiple-choice and short-answer question formats. For more information, see Haladyna, T. M. (1994). Developing and Validating Multiple-Choice Test Items. Hillsdale, NJ: L. Erlbaum Associates; see also Chapter 31: Essay Questions and Variations, in Amin, Z., & Eng, K. H. (2003). Basics in Medical Education. River Edge, NJ: World Scientific.
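The pre/post comparison described above can be sketched as a simple gain computation. The participant identifiers and scores below are hypothetical.

```python
# Hypothetical pre- and post-test scores for three participants.
pre  = {"p1": 6, "p2": 5, "p3": 8}   # scores before the program
post = {"p1": 9, "p2": 7, "p3": 8}   # scores after the program

# Change per participant, and the average change across the cohort.
gains = {pid: post[pid] - pre[pid] for pid in pre}
mean_gain = sum(gains.values()) / len(gains)

print(gains)      # per-participant knowledge change
print(mean_gain)  # average change attributable to the program period
```

Aggregating the same per-item gains across multiple offerings, rather than per participant, is what lets an evaluator see which content areas the program consistently does or does not improve.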

Program Institutionalization
Program institutionalization can be an important marker revealing the extent to which a desired change becomes permanent. For example, if an institution requires students to take a course in the ethical conduct of research as part of a graduate curriculum, that requirement may be a marker of institutionalization. When universities require all research departments to have explicit procedures and policies for data ownership and storage, this could be a measurable outcome for program institutionalization. Program institutionalization is validated by evidence that important resources or components of a program are ongoing and are likely to continue. This is particularly true for grant-funded projects: there is often concern about whether the products of a project will continue once the grant has ended. Institutionalization can be viewed as requiring resources, infrastructure, and a recognized need or value for a change or innovation to become permanent. More on this is available in Ross, R. D. (1976). The institutionalization of academic innovations: Two models. Sociology of Education, 49, 146-155.

Rating Scales
Rating scales are an extension of checklists (see Checklists, above). Like checklists, the scales represent components of behaviors or processes of interest, but instead of each item being assessed as Yes/No or Present/Absent, the components are rated on more subjective qualities. Thus, when teaching a new skill such as obtaining informed consent from a research subject, a participant can demonstrate the skill and the evaluator can grade the performance on qualities such as rapport with the subject, clarity of the information presented, and adequacy of answers to the subject's questions. Each item could be rated on a five-point scale such as poor, fair, good, very good, excellent. Rating scales are often used in conjunction with checklists, with objective content assessed with the checklist and more subjective aspects assessed with the rating scale. Additional information about developing rating scales can be found at:
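A five-point scale like the one above can be applied by mapping rating labels to numbers. The components and ratings below are hypothetical, loosely following the informed-consent example.

```python
# Map the five verbal rating labels to numeric values.
SCALE = {"poor": 1, "fair": 2, "good": 3, "very good": 4, "excellent": 5}

# Hypothetical evaluator ratings of one observed consent encounter.
observed = {
    "rapport with subject": "very good",
    "clarity of information": "good",
    "adequacy of answers": "excellent",
}

# Convert labels to numbers and compute an overall performance score.
numeric = {component: SCALE[label] for component, label in observed.items()}
overall = sum(numeric.values()) / len(numeric)

print(numeric)
print(overall)
```

Keeping the label-to-number mapping in one place makes it easy to change the scale (for example, to a seven-point scale) without touching the scoring logic.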

Record Reviews and Audits
Record review can provide evidence about institutional decision-making, follow-through in the implementation of policies and procedures, and the appropriate use of facilities and resources. Documentation can include meeting minutes, attendance records and sign-up sheets, logs, and databases (such as information requests, applications received, classes offered, etc.). Essentially, any standard information collected as part of operations, quality control, or documentation of services can be used as part of a record review to determine outcomes related to changes in practices or in service utilization. Data collection is based on a coding form, developed using predefined criteria, for abstracting information from the records. The following reference focuses on medical records, but many of the principles can be applied in educational and other settings: Tugwell, P., & Dok, C. (1985). Medical record review (Chapter 8). In V. Neufeld & G. Norman (Eds.), Assessing Clinical Competence. New York, NY: Springer Publishing Company.

Role Playing
In role playing, participants adopt and act out roles as part of a simulation experience. Role playing is a form of rehearsal that gives the learner an opportunity to practice new behaviors. It can also be used to encourage individuals to confront unfamiliar or uncomfortable situations or issues as a means of creating awareness and/or changing attitudes. For example, a participant could be asked to interact in the role of a mentor having to problem-solve around the poor performance of a student. Role play can be used to provide practice with the content of an educational program, or for evaluation, by determining whether participants spontaneously demonstrate knowledge or skills provided during the program. For more information, see Chapter 21: Role-Playing, in Cohen, L., Manion, L., & Morrison, K. (2000). Research Methods in Education. New York: Routledge.

Structured Interviews
There are two general approaches to collecting information about individuals' beliefs, knowledge, experiences, and behaviors: the interview and the questionnaire. Both methods can be highly structured, which increases the reliability of the approach. Structured interviews have the advantage of allowing the interviewer to clarify and amplify responses to questions, resulting in richer data. Their disadvantage is the amount of time required to conduct the interviews. Interviews can be conducted in person or over the telephone. In many cases it will be necessary to train the interviewers to ensure that all interviews are consistent. A good resource for evaluators who are designing interviews is Hyman, H. (1975). Interviewing in Social Research. Chicago: University of Chicago Press.

Surveys
Surveys of program participants can be used to assess a wide array of program outcomes, such as program satisfaction, program quality (e.g., presenters, hand-outs, time of day, content, and/or presentation format), attitudes, and values. Most frequently, surveys ask participants to rate their satisfaction with a specific feature of the program using rating categories (e.g., poor, fair, good, very good, excellent) or ratings of agreement (e.g., strongly disagree, disagree, unsure, agree, strongly agree). Multiple ratings can be averaged to compute satisfaction indices for program components. Surveys distributed to participants can be returned either anonymously or with identifiers, and can be distributed on paper or using a variety of web-based methods. For more information, see Dillman, D. (2000). Mail and Internet Surveys: The Tailored Design Method (2nd ed.). New York: John Wiley and Sons.
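The averaging of multiple ratings into satisfaction indices can be sketched as follows; the component names and individual ratings are invented for illustration.

```python
# Hypothetical 1-5 satisfaction ratings (1 = poor ... 5 = excellent)
# collected from four participants for two program components.
ratings = {
    "presenters": [5, 4, 5, 3],
    "hand-outs":  [3, 4, 4, 4],
}

# Satisfaction index per component: the mean of its ratings.
indices = {component: sum(vals) / len(vals) for component, vals in ratings.items()}

print(indices)
```

Computing one index per component, rather than a single overall score, preserves the ability to compare program features (presenters versus hand-outs, for example) when deciding what to revise.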

Additional Resource
This site, maintained by the Center for Education Research and Evaluation of Columbia University Medical Center, provides an overview of methods commonly used in the assessment of learner outcomes.