
Written and designed by the staff of the Center for Teaching and Learning. Reproduce with permission only.
Assigning grades is one of the most difficult tasks you will face in teaching. Teachers must combine a variety of disparate elements of student performance into a single course grade: verbal skills, ability to memorize, retention of factual information, ability to synthesize material, ability to make reasoned judgments about the material, etc. It is difficult to devise a grading method in which the final grade fairly reflects all aspects of a student's performance. Within certain limits, every teacher is allowed to develop his or her own grading system, and because standards are very personal and idiosyncratic, grades are not a currency that has a uniform value--an "A" from one teacher may be the equivalent of a "C" from another. Part of the problem with grading arises from the fallibility of the tests we use to measure student performance. Few teachers are confident that they can assess student achievement accurately and consistently, and the effectiveness of any grading system is highly dependent upon the accuracy of the tests on which it is based. Nonetheless, there are some guidelines that will help you devise a fair and reasonably accurate system of grading.
You should first investigate your department's policies on grading practices. Even if there is no written policy, there may be traditions and unwritten rules regarding grading, and your grading system will need to conform to these rules. If you are a TA grading for a professor, he or she should explain the policies and procedures to you in complete detail, and if any problems develop in the system you should let the professor know immediately (especially if students express confusion or dissatisfaction with the grading scheme).
It helps to make a distinction between grading and other forms of feedback. A grade is a certification of competence that should reflect, as accurately as possible, a student's performance in a course. If this goal is achieved, grades will have the same value from semester to semester and from year to year. When we include grading elements that are difficult to measure accurately (such as effort or participation) we reduce the strength of the relationship between grades and academic achievement. Furthermore, when we use grades for reward or punishment, give extra credit for additional work, or grade on attendance, we contaminate the meaning of grades and reinforce the students' belief that a course grade has less to do with academic performance than with fulfillment of arbitrary requirements.
We must give students feedback in many of these areas of behavior, but using the grading system for this assessment is inappropriate. Moreover, we often complain that students are excessively grade-oriented, but attaching a grade value to every aspect of student performance reinforces our students' preoccupation with grades. Avoid using grades as incentives for performance and seek out non-graded methods for motivating students. For example, verbal rewards in class, individual conferences, and written critiques can provide feedback without contaminating the grading system.
A good grading system must meet three criteria: (1) it should accurately reflect differences in student performance, (2) it should be clear to students so they can chart their own progress, and (3) it should be fair. Performance can be defined in either relative or absolute terms (comparing students with each other or measuring their achievement against a set scale), and each system has its defenders. Whichever grading scheme you use, students should be able to calculate (at least roughly) how they are doing in the course at any point in the semester. Some relative grading schemes make it impossible for students to estimate their final grades because the cutoff points in the final distribution are not determined until the end of the course. A complete description of the grading system should appear in the course syllabus, including the amount of credit for each assignment, how the final grades will be calculated, and the grade equivalents for the final scores. Also, students should perceive the grading system as fair and equitable, rewarding them proportionately for their achievements. From the standpoint of measurement, many different kinds of assignments spread over the entire semester provide a fairer estimate of student learning than one or two large tests or papers.
Relative (norm-referenced) grading systems are probably the most widespread in higher education. In relative grading, students are in competition with one another for a limited number of grades in each category, and a student's grade is based on his or her relative position in the class. By contrast, absolute (criterion-referenced) systems use an unchanging standard of performance against which student performance is measured, so a student's grade is related to his or her achievement of particular levels of knowledge, skills, and understanding. No grading system is foolproof, for the integrity of any system depends on the teacher's ability to devise valid and reliable measurements of student performance. Measurement error is the greatest hindrance to effective grading.
Relative grading is based on two assumptions: (1) one of the purposes of grading is to identify students who perform best against their peers and to weed out the unworthy, and (2) student performance more or less follows a normal distribution--the famous bell-shaped curve. Teachers who use relative grading point out that these systems correct for unanticipated problems (e.g. widespread absences due to a flu epidemic, tests that are too hard or too easy, or poor teaching) because the scale automatically moves up or down. Students like relative grading for the same reason.
The Curve
One of the most common relative grading systems is "grading on the curve." Use of the curve as a grading model is based on the discovery, earlier in the twentieth century, that IQ test scores over large populations approximate a normal distribution (Figure 8). Although it is true that the larger the class, the more likely that student performance will begin to look something like a normal curve, the assumption that performance is normally distributed is usually unjustified, even in large sections. In the first place, college students are a highly selected group, not representative of the general population with respect to background or intelligence. Second, we cannot be sure that our tests accurately measure student achievement--even standardized exams are suspect in this regard.
[Figure 8: the normal distribution curve]
Fortunately, few teachers adhere to a strict normal distribution, since it will fail a fixed percentage of the class and award "A's" to a fixed percentage, without reference to the overall level of performance. Forcing students into this scale tends to wreak havoc with their motivation. Consequently, many people use a "skewed curve" in which the distribution is shifted upward slightly, resulting in fewer grades below "C" and more in the "B" category. However, few teachers base their modified curves on statistical principles or cumulative performance data; they simply select a distribution that "looks right." Typically, the rationale for grade cutoff points is based on tradition rather than on analysis of student performance over time. The major problem with any curve is that one cannot be sure whether differences in performance are real or simply artifacts of the distribution--was the performance of the top 5 students who got "A's" substantially different from that of the 15 who received "B's"?
The Standard Deviation Method
Statistically speaking, the soundest relative grading system is the standard deviation method. In this system student grades are based on their distance from the mean score for the class rather than on an arbitrary scale. To calculate the standard deviation, the teacher creates a frequency distribution of the final scores and identifies the mean (average) score. Using the formula in Figure 9, the standard deviation is computed so that cutoff points for each grade level can be determined. Spreadsheets can be programmed to perform the math automatically.
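Figure 9. Formula for the standard deviation

$$\text{S.D.} = \sqrt{\frac{N\sum X^2 - \left(\sum X\right)^2}{N(N - 1)}}$$

where
$X$ = an individual final score
$\sum X^2$ = sum of all squared final scores
$\left(\sum X\right)^2$ = squared sum of all final scores
$N$ = number of final scores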
Cutoff points for "C" grades range from one-half the standard deviation below the mean to one-half above. Adding one standard deviation to the upper "C" cutoff will yield the "A-B" cutoff point, and subtracting one standard deviation from the lower "C" cutoff will provide the "D-F" cutoff point (see examples in Figure 10).
Figure 10. Standard deviation cutoff points

| | Class I | Class II |
|---|---|---|
| Mean of final scores | 79.2 | 60.76 |
| Standard deviation | 12.79 | 9.85 |
| Upper "C" cutoff | 85.6 | 65.69 |
| Lower "C" cutoff | 72.8 | 55.83 |
| "A/B" cutoff | 98.4 | 75.54 |
| "D/F" cutoff | 60.0 | 45.98 |
Although the standard-score method of computing grades is statistically superior to other relative grading methods, there are several cautions to keep in mind. As with other relative grading schemes, capable students in "high achievement" classes may be unfairly penalized and poor students in "low achievement" classes may unfairly benefit. The method requires some knowledge of statistics and the mathematical transformations involved so you are not working blind.
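The standard deviation calculation and the cutoff rules described above can be sketched in a few lines of Python. This is only an illustration, not a prescribed implementation: the function names are hypothetical, and the choice to treat a score exactly on a cutoff as belonging to the higher grade is an assumption that the text does not settle.

```python
import math


def sample_std_dev(scores):
    """Standard deviation from the computational formula:
    sqrt((N * sum(X^2) - (sum X)^2) / (N * (N - 1)))."""
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores")
    sum_x = sum(scores)
    sum_x2 = sum(x * x for x in scores)
    return math.sqrt((n * sum_x2 - sum_x ** 2) / (n * (n - 1)))


def sd_cutoffs(mean, sd):
    """Cutoff points: the "C" band spans half a standard deviation on
    either side of the mean; the A/B and D/F cutoffs lie one full
    standard deviation beyond the edges of the "C" band."""
    upper_c = mean + 0.5 * sd
    lower_c = mean - 0.5 * sd
    return {
        "A/B": upper_c + sd,
        "upper C": upper_c,
        "lower C": lower_c,
        "D/F": lower_c - sd,
    }


def letter_grade(score, cutoffs):
    """Assumption: a score exactly on a cutoff earns the higher grade."""
    if score >= cutoffs["A/B"]:
        return "A"
    if score >= cutoffs["upper C"]:
        return "B"
    if score >= cutoffs["lower C"]:
        return "C"
    if score >= cutoffs["D/F"]:
        return "D"
    return "F"


# Class I from Figure 10: mean 79.2, standard deviation 12.79
class_one = sd_cutoffs(79.2, 12.79)
for name, value in class_one.items():
    print(f"{name}: {value:.1f}")
```

Run with the Class I mean and standard deviation, the sketch reproduces the Class I cutoffs in Figure 10 to one decimal place. In practice you would compute the mean and standard deviation from your own list of final scores before calling `sd_cutoffs`.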
T Scores
Some teachers use a related method to transform raw test scores to standardized scores before averaging, but prefer to use another method for determining final grades. Averaging tests with different means and standard deviations is akin to adding apples and oranges, and raw scores cannot be weighted or averaged without introducing a bias. Transformation to standard scores adjusts for differences in means and standard deviations and thereby preserves the mathematical integrity of each score. "T scores," which have a mean of 50 and a standard deviation of 10, are often used for this purpose, and spreadsheets can be programmed to make the transformations automatically. Of course, you must explain to students how their scores are being transformed so they won't be confused about their averages.
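As a rough illustration of the transformation, the Python sketch below converts each test's raw scores to T scores (mean 50, standard deviation 10) and then averages them per student. The function names are hypothetical, and the use of the sample standard deviation (dividing by N - 1) is an assumption; a spreadsheet would do the same arithmetic.

```python
import statistics


def t_scores(raw_scores):
    """Convert one test's raw scores to T scores: 50 + 10 * z, where
    z is the score's distance from the test mean in standard deviations."""
    mean = statistics.mean(raw_scores)
    sd = statistics.stdev(raw_scores)  # assumption: sample SD (N - 1)
    if sd == 0:
        raise ValueError("all scores are identical; T scores are undefined")
    return [50 + 10 * (x - mean) / sd for x in raw_scores]


def average_t_scores(tests):
    """Average each student's T scores across several tests.
    `tests` is a list of score lists, one per test, in the same student order."""
    converted = [t_scores(test) for test in tests]
    return [statistics.mean(student) for student in zip(*converted)]


# Two tests with very different means and spreads (made-up numbers)
easy_test = [60, 70, 80]
hard_test = [20, 25, 45]
print(average_t_scores([easy_test, hard_test]))
```

Because both tests are placed on the same scale before averaging, neither one dominates the average simply because its raw scores happen to be larger or more spread out.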
The Gap Method
Another relative grading scheme is the "gap method," but it is difficult to defend on the basis of statistics or measurement theory. In this system, students' total course scores are arranged in ascending order and the teacher looks for naturally occurring gaps in the distribution of the scores. Unfortunately, the gaps may not reflect real achievement differences but simply chance occurrence, and they may not appear at reasonable points in the distribution. The primary advantage of the gap system is that there are fewer complaints about borderline grades, since students are unsophisticated about grading systems and will likely accept the gaps as proof of significant differences in performance.
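The mechanical step of the gap method, finding the widest breaks in the ordered scores, might look like the Python sketch below. The function name and the sample scores are made up for illustration, and the code only locates the gaps; whether a gap reflects a real difference in achievement remains a judgment the code cannot make.

```python
def largest_gaps(scores, k):
    """Return the k widest gaps between adjacent scores as
    (gap size, score below the gap, score above the gap) tuples."""
    ordered = sorted(scores)
    gaps = [
        (ordered[i + 1] - ordered[i], ordered[i], ordered[i + 1])
        for i in range(len(ordered) - 1)
    ]
    # Widest gaps first
    gaps.sort(key=lambda gap: gap[0], reverse=True)
    return gaps[:k]


# Hypothetical total course scores
totals = [95, 94, 88, 87, 86, 75, 74]
for size, below, above in largest_gaps(totals, 2):
    print(f"gap of {size} points between {below} and {above}")
```

With these made-up totals the two widest gaps fall between 75 and 86 and between 88 and 94, which a teacher using this method might take as grade boundaries.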
Absolute grading is based on the idea that grades should reflect mastery of specific knowledge and skills. The teacher sets the criteria for each grade, and all students who perform at a given level receive the same grade.
Percent of Total Points
The simplest absolute grading scheme is "percent of total points possible." The teacher decides on the total number of points that a student could earn in the course and sets arbitrary achievement levels based on the total. The cutoff for "A" grades might be 90%, for "B's," 80%, and so forth, and it is assumed that a student who earns 83% knows 83% of the material. If every student scores above 90%, they will all receive "A's." Although this method does provide clear performance targets for students, there are several problems associated with it. First, the rationale for the cutoff scores is usually murky and is often based on intuition rather than analysis. Second, the system is based on the assumption that the teacher can construct valid, reliable exams and assignments at consistent levels of difficulty. Third, some teachers apply the same performance scale to every evaluation component, a practice which does not take into account the variability of assignments or adjust for particularly difficult or easy assignments. Finally, some students may achieve a high number of points simply by doing well on many small, less important assignments.
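A percent-of-total-points scale is simple enough to express as a short lookup. In the Python sketch below, the 90% and 80% cutoffs come from the example above, while the 70% and 60% cutoffs for "C" and "D" are assumptions filling in the "and so forth"; the function name is hypothetical.

```python
def percent_grade(points_earned, points_possible, cutoffs=None):
    """Map a student's share of the total points to a letter grade.
    `cutoffs` lists (minimum percent, letter) pairs from highest to lowest."""
    if points_possible <= 0:
        raise ValueError("points_possible must be positive")
    if cutoffs is None:
        # 90 and 80 follow the example in the text; 70 and 60 are assumed
        cutoffs = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]
    percent = 100 * points_earned / points_possible
    for minimum, letter in cutoffs:
        if percent >= minimum:
            return letter
    return "F"


print(percent_grade(830, 1000))  # a student with 83% of the points
```

Publishing the cutoff list in the syllabus lets students run the same calculation themselves at any point in the semester.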
Objective-Based Grading
Objective-based grading is perhaps the most sophisticated kind of absolute grading because the method attempts to equate grades with different kinds of performance. In all the grading systems reviewed above, the teacher assumes that students who receive good final grades have learned the more important material and mastered more complex levels of thinking, but this assumption may not be valid. For example, students who do very well on objective exams and poorly on written assignments may earn a respectable final grade, but may not have mastered important intellectual skills that the teacher had in mind. The objective-based grading method takes into account both the amount of material students learn and the level of cognitive complexity they achieve.
To use objective-based grading, the teacher must first review the kinds of knowledge and skills that are implicit in the course and make them explicit as course objectives. (Refer to the section on course design for a more detailed treatment of course objectives.) You must identify two kinds of outcomes: minimal and developmental objectives. Minimal objectives are statements of essential course outcomes and basic skills; developmental objectives reflect higher-order cognitive processes such as critical thinking, decision-making, and complex problem solving. For example:
Minimum Essential Objectives
The student will be able to:
Developmental Objectives
The student will be able to:
It may be easier, at least initially, to measure achievement of minimal and developmental objectives using separate exams and assignments for each type of objective. This technique will simplify record-keeping and help you focus more sharply on the different kinds of tasks that are appropriate for assessing the two types of objectives. Test questions and exercises for minimal objectives are relatively easy to create because they assess basic knowledge and well-rehearsed skills. Measuring developmental outcomes is more difficult, for you must not only master the classification systems for complex thinking and reasoning skills but also must be able to devise assignments that measure these skills. There are a number of recent publications that you can consult for suggestions about testing thinking skills, and we have included several in the bibliography at the end of this section. Some writers suggest that novelty is one element common to higher-order learning tasks and therefore assignments that require students to apply their thinking skills in new ways or situations will test complex reasoning.
If your tests and exercises assess both kinds of objectives with reasonable accuracy, you can set performance standards and grade equivalents on a scale. In the example in Figure 11, to pass the course students must master 80% of the minimum essential objectives and 50% of the developmental objectives. Obviously, setting these cutoff points must be done carefully, taking into account the difficulty of the tests and assignments and student performance in previous classes or other sections of the same course. If using this kind of scale seems too difficult, you could use the "total points possible" system instead. By awarding more points for tests and assignments on higher-level objectives and fewer points for tasks on less important objectives, you would still reap some of the advantages of the objective-based method.
| Grade | Essential Objectives | Developmental Objectives |
|---|---|---|
| A | 90% or more | 85% or more |
| B | 90% or more | 75 to 84% |
| C | 80% or more | 60 to 74% |
| D | 80% or more | 50 to 59% |
| F | less than 80% | less than 50% |
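The scale in the table above can be read as a two-part lookup. The Python sketch below uses one possible interpretation, which is an assumption: a student receives the highest grade for which both the essential and the developmental minimums are met, and fails if no row is satisfied. The names are hypothetical.

```python
# (letter, minimum % of essential objectives, minimum % of developmental objectives)
GRADE_SCALE = [
    ("A", 90, 85),
    ("B", 90, 75),
    ("C", 80, 60),
    ("D", 80, 50),
]


def objective_grade(essential_pct, developmental_pct):
    """Return the highest grade whose two minimums are both met.
    Assumption: falling short on either minimum for "D" means an "F"."""
    for letter, min_essential, min_developmental in GRADE_SCALE:
        if essential_pct >= min_essential and developmental_pct >= min_developmental:
            return letter
    return "F"


# A student strong on developmental work but weaker on the essentials
print(objective_grade(85, 90))
```

Under this reading, a student who masters 85% of the essential objectives cannot earn more than a "C" no matter how well he or she does on the developmental objectives, because both "A" and "B" require 90% of the essentials.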
No single grading system will be appropriate for all courses at all times, and teachers must be sensitive to differences in students and subject matter when choosing a grading system. It takes time to develop realistic expectations about student performance, and the best teachers reexamine their grading assumptions to verify that their systems are valid. Finally, the accuracy of any grading system depends upon the validity and reliability of measures used to assess student performance, so improving the quality of exams and course assignments will improve the accuracy of the final grades.
Grade Appeals
Page 158 of the 1988-1990 issue of the Undergraduate Bulletin contains the following:
If students wish to protest a course grade, they must first attempt to resolve any disagreement with the course instructor. If they fail to reach a satisfactory resolution, they may appeal the grade in accordance with the following procedures. They may submit a written appeal with any relevant test papers, term papers, etc. to their academic dean not later than the last day of classes of the next succeeding semester. The dean will refer their appeals to the administrative board and the chairman of the department concerned. The department chairman will appoint a committee to consider the appeal and will make a recommendation to the administrative board based on the committee's findings. The decision of the administrative board in such cases is final.
Although most grade appeals are denied, the process is lengthy and usually traumatic for all concerned, so take measures to avoid the problem altogether. The best insurance is to develop specific grading criteria, describe them fully in your syllabus, and follow them scrupulously. Students should be able to calculate (at least roughly) how they are doing during the semester so they will not be surprised by the final grade. Make it clear that students should come to you if they have questions about their course grades. Some teachers schedule mid-semester meetings with students to discuss their progress, their grades, and problems they are experiencing in the course.

Last updated: January 30, 2001