Grading student exams fairly and effectively remains a challenge for many professors. Maintaining consistency among students on the same exam can be accomplished by using grading rubrics, grading the same question for all students at the same time, and giving similar questions each semester. However, there are still natural tendencies and preferences that affect how an individual professor grades. The objective of this research was to quantitatively assess how professor grading biases influenced exam scores in the same upper level course offered at multiple universities.
The course selected for analysis was an introduction to the design of reinforced concrete structures, a common course in many civil engineering curricula. Three professors at three different universities taught similar topics using their unique teaching styles and methods. During the semester, the same exam questions were posed to the students at each university. To understand how grading biases propagated throughout the exam questions, each of the professors re-graded the questions from all three universities at the conclusion of the course after the student identifiers were removed. The final scores from each professor were compared for 35 unique questions completed by up to 57 total students. Prior to re-grading the exams, the point value for each question was agreed upon, but there was no common grading rubric and no communication between professors about grading methodology.
Differences were measured among the scores assigned by each professor. Statistically significant differences in the scores for each question were assessed using Tukey’s method to compare individual means in the analysis of variance. In approximately 66% of the individual problems, the Tukey method revealed no statistically significant difference in the grades (p-values > 0.05). Half of the problems that had statistically different grades (p-values < 0.05) had point totals of five or less (on a 100-point exam), which meant they were short-answer or simple problems. While the differences in grading were minor on most individual problems, they accumulated across the exam for each grader. Summing each professor's average grade over all 35 problems indicated a difference in total grades that ranged from 2 to 7 percentage points. This study showed that, while grading bias does occur, the grades assigned by the three professors differed significantly on only a minority of questions.
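As a rough illustration of the per-question comparison described above, the following minimal sketch computes the one-way analysis of variance F statistic for a single question graded by three professors. The scores, the grader labels, and the function name `one_way_anova_f` are hypothetical and are not taken from the study. The pairwise Tukey p-values themselves require the studentized range distribution, which is not computed here; a statistics package such as SciPy (`scipy.stats.tukey_hsd`) could be used for that step.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA over the groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical scores on one 5-point question: the same six students,
# graded independently by three professors (illustrative values only).
scores = {
    "Professor A": [4, 5, 3, 4, 5, 4],
    "Professor B": [3, 4, 3, 3, 4, 3],
    "Professor C": [4, 4, 3, 4, 5, 4],
}

f_stat, df_b, df_w = one_way_anova_f(list(scores.values()))
grader_means = {name: mean(vals) for name, vals in scores.items()}

print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
for name, m in grader_means.items():
    print(f"{name}: mean score = {m:.2f}")
```

Repeating this calculation for every question and summing each grader's mean scores would give per-professor exam totals of the kind compared in the abstract.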