
Written and designed by the staff of the Center for Teaching and Learning. Reproduce with permission only.
Student Evaluation of Teaching
September 1994
Student evaluations are the most commonly used method of assessing an instructor's effectiveness in the classroom. However, student ratings do have their limitations, especially when they represent the only method of teaching evaluation used in tenure or promotion decisions. Departments should gather evidence of teaching effectiveness from a variety of sources. These may include: peer evaluations, letters from students, syllabi and instructional support materials, and individual teaching portfolios. Student evaluations, if they are properly constructed, should be part of this mix because they offer an indispensable perspective on an instructor's effectiveness. Moreover, student feedback is an important tool for individual instructors who wish to improve their teaching.
There is a significant body of research that can help departments establish guidelines for constructing, administering and interpreting student evaluations.
One objection to student ratings is that they are not valid measures of teaching effectiveness: that students are not able to assess good teaching, and therefore evaluations represent nothing more than a popularity contest. However, the results of a number of studies indicate that students and faculty offer very similar responses when asked to rank aspects of teaching in terms of their relative importance. Both groups agree that the most important indicators of good teaching include teachers' preparation, organization, clarity and comprehensibility, as well as their ability to gauge class level and progress. Interestingly enough, students and faculty also agree that the instructor's personality is relatively unimportant. Furthermore, research has shown a significant correlation between instructors' self-ratings of their effectiveness and student evaluations. There are also significant correlations between student evaluations and the ratings of trained observers.
If student evaluations are valid, we would expect that students who learned more would rate their instructor as more effective. An analysis of 41 studies found that there was, in fact, a statistically significant, positive correlation between student ratings of teaching and student achievement. Survey items concerning overall course quality and overall instructor effectiveness showed the highest correlation with achievement. As one might expect, the correlation with items referring to specific teaching skills varied considerably.
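As a rough illustration of the kind of relationship described above, the following sketch computes a Pearson correlation between mean student ratings and mean achievement scores across several sections of a course. The function name `pearson_r` and all of the numbers are hypothetical, made up for illustration; they are not taken from the 41 studies, and those studies may well have used different methods.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: one entry per course section (illustrative only).
section_ratings = [3.2, 3.8, 4.1, 4.5, 3.5]          # mean overall rating, 1-5 scale
section_exam_means = [68.0, 74.0, 79.0, 85.0, 72.0]  # mean score on a common exam

r = pearson_r(section_ratings, section_exam_means)
print(f"correlation between ratings and achievement: {r:.2f}")
```

A value of r near +1 would mean that sections with higher ratings also tended to score higher on the exam; a value near 0 would mean no linear relationship. With only a handful of sections, any such figure should be read cautiously.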
Research also indicates that student evaluations are reliable: they yield consistent results. A study in which ratings were collected on the same instructors teaching the same course in four consecutive years showed a consistent pattern: the instructors received similar ratings year after year (i.e., evaluations from each class indicated similar strengths and weaknesses). Moreover, student evaluations seem to be stable over time: there is a high correlation between alumni ratings and those of current students, and studies that ask alumni to re-rate professors after graduation show a very high level of stability as well.
From what we now know, it seems that student evaluations are not affected appreciably by the personality of the instructor (popularity does not necessarily lead to higher overall scores), the gender of the instructor or the student, or the time of day a class is offered. The instructor's rank does seem to have some effect on the ratings: TAs tend to receive lower ratings than faculty, and first-year faculty receive lower scores than their more experienced colleagues. Ratings seem to go down as students spend more time at the University (freshmen tend to rate teachers higher than sophomores, etc.). Whether a course is taken as a requirement or an elective also seems to play a role. Overall, elective courses and courses in the major receive higher ratings than required ones. Finally, it seems that instructors in certain disciplines are rated, on average, higher than others.
To ensure that student ratings function well within an evaluation system, departments should follow standard procedures in the creation, administration and interpretation of rating forms.
The questions asked on student evaluation forms should correspond to the aspects of teaching that the department considers important. Various studies have attempted to categorize the most important characteristics of good teaching. See Figure 1 for a list of these characteristics, along with some sample evaluation items.
Figure 1. Sample evaluation items
Each class period was carefully planned.
Lectures were well organized and easy to follow.
The instructor encouraged students to ask questions.
The instructor treated students with respect.
The goals for each class were clear.
The instructor used examples to clarify difficult points/concepts.
Grading criteria were clearly explained.
The instructor offered useful feedback on the assignments/tests.
The instructor varied activities over the course of the semester.
The instructor's use of lecture versus discussion seemed appropriate.
The readings were relevant to the rest of the course.
Audiovisual materials/computers/films contributed significantly to my learning.
Overall, this course is a valuable learning experience.
Considering the scope and limitations of the subject and the course, the instructor is an effective teacher.
The nature of the rating scale will also play a role in how useful the data will be. Studies have found that reliability increases if each number in the scale is defined verbally. You may also wish to include room for students' comments, which can supply specific suggestions for building on strengths and improving on weaknesses. Space for such comments can be provided after each item on the evaluation and/or in specific open-ended questions. The following format has been used by several departments at UNC:
Circle the number that reflects your rating on the following scale: 1 = strongly disagree, 2 = disagree, 3 = neutral, 4 = agree, 5 = strongly agree.
                                                sd  d   n   a   sa
1. Grading criteria were carefully explained    1   2   3   4   5
COMMENTS: ______________________________
Once your department has reached consensus on which questions to include, the form should be pilot-tested for one semester to identify any problems and to provide an opportunity for some statistical testing of the form, including factor analysis and tests for internal reliability. Such testing requires statistical expertise, and CTL will be glad to help you coordinate the process.
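One common test of internal reliability is Cronbach's alpha, which asks whether the items on a form tend to rise and fall together across respondents. The sketch below is a minimal illustration, assuming pilot responses are stored as one row of 1-5 ratings per student; the function name `cronbach_alpha` and the pilot data are hypothetical, and this is not necessarily the procedure CTL uses.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of rows (one row of item ratings per student)."""
    if len(responses) < 2:
        raise ValueError("need responses from at least two students")
    k = len(responses[0])
    if k < 2:
        raise ValueError("need at least two items on the form")
    if any(len(row) != k for row in responses):
        raise ValueError("every student must answer every item")
    # Sample variance of each item (column) across students.
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of each student's total score.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("alpha is undefined when all total scores are identical")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical pilot data: five students, three items, 1-5 scale.
pilot_responses = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 5],
    [2, 3, 2],
    [4, 4, 3],
]

alpha = cronbach_alpha(pilot_responses)
print(f"Cronbach's alpha: {alpha:.2f}")
```

Values closer to 1 suggest that the items measure a common underlying dimension. A real pilot test would involve far more respondents, and factor analysis requires more specialized statistical software than this sketch.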
The way in which the evaluations are distributed and collected also plays a role in whether the data will be accurate.
The following guidelines should help ensure that data from student evaluations is used fairly within a department's evaluation system:
Studies indicate that student ratings do lead to improvement if teachers study them carefully. They also suggest that the degree of improvement can be much greater if the teacher shares the results with a colleague or a teaching consultant. CTL staff provide this kind of consultation to all faculty and TAs in Academic and Health Affairs.
One major drawback to end-of-the-semester evaluations is that students enrolled in the class do not benefit from the results. There are a number of ways for instructors to gain useful insights from students while there is still time to make adjustments in a given course.
The most immediate gauge of teaching effectiveness is how much students learned from an individual class session. One method to measure their learning is the "minute paper," where the instructor gives students a couple of minutes to answer a question in writing. Questions might include: "What was the main point of today's lecture?" "What one point is still unclear?" You would not need to respond to each student's answer; rather, you could clear up any misconceptions or answer specific questions that seem particularly important.
You might consider having a suggestion/question box for students to offer their thoughts, concerns or critiques at any point in the semester. A high-tech alternative would be to set up an account on e-mail as a type of electronic suggestion box. This technique is particularly useful for large classes, and you could discuss any relevant questions or comments with the whole group.
If you are interested in a broader perspective, you could try a very simple form of early feedback. Ask students to write down three things they like about the course and how it is being taught and three things they would like to see changed, along with specific suggestions for changing them. Once you have collected and categorized the responses and determined the major trends, you can report the results back to the class. While you are not obligated to adopt every suggested alteration, it is an exercise in frustration to ask for this type of feedback if you have very little latitude or authority to innovate.
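For a large class, the tallying step can be done with a few lines of code once each response has been read and assigned category labels by hand. The sketch below is one possible approach; the function name `major_trends`, the category labels and the 25 percent cutoff are all assumptions chosen for illustration.

```python
from collections import Counter


def major_trends(coded_responses, min_share=0.25):
    """Return (category, count) pairs raised by at least min_share of students.

    coded_responses holds one list of category labels per student; a label
    repeated by the same student is counted only once.
    """
    n_students = len(coded_responses)
    if n_students == 0:
        return []
    counts = Counter()
    for labels in coded_responses:
        counts.update(set(labels))  # count each category once per student
    # most_common() lists categories from most to least frequently mentioned.
    return [(cat, c) for cat, c in counts.most_common() if c / n_students >= min_share]


# Hypothetical hand-coded "things to change" from four students.
suggested_changes = [
    ["pace too fast", "more practice exams"],
    ["pace too fast"],
    ["too much reading"],
    ["pace too fast", "pace too fast"],
]

for category, count in major_trends(suggested_changes, min_share=0.5):
    print(f"{category}: mentioned by {count} of {len(suggested_changes)} students")
```

The cutoff simply separates widely shared concerns from isolated ones; an instructor may still want to read the less frequent comments for useful individual suggestions.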
You can also obtain student feedback by using a short rating form at any point in the semester. This fall, CTL will begin offering individualized rating forms to any instructor who wishes to gather information for improvement. You will be able to choose from a bank of over 250 items for inclusion in your evaluation instrument. After your class completes the evaluation, the forms will be computer scanned, and you can discuss the results with one of our consultants.
Student feedback can lead to teaching improvement if several criteria are met. First, the teacher must believe the feedback is valuable and valid. Second, the teacher should have some motivation to improve, either intrinsic or extrinsic. Third, the feedback should provide specific areas to work on and suggestions for improvement.
CTL's staff has worked with several departments (and many individual instructors) on teaching evaluation, and we would be glad to consult with others who wish to construct a student rating form or a complete evaluation system. The department (or individual instructor) would be responsible for making decisions about the final product and the specific methods to be used for collecting and interpreting data. CTL can help with issues such as establishing consensus, defining some general guidelines for constructing rating forms and providing assistance with other methods of evaluation.
