TILT Master Teacher Initiative
The Master Teacher Initiative (MTI) is a university-wide program to enhance the quality of teaching within CSU’s colleges and libraries.
Visit TILT’s collection of Teaching Tips and the CNS collection of Teaching Tips
October 28, 2024
Stubborn Bias in Student Teaching Evaluations
For a change of pace, I bring you an article from the Chronicle of Higher Education on student evaluations. We will be requesting input through course evaluations soon. Personally, I feel that surveys of students' general impressions are of limited value. For many years I have requested feedback on my courses at two weeks, at the halfway point, and at the end of the semester, asking about specific course components, activities, the textbook, and so on. I agree that mutual rapport and respect between students and the instructor, along with an enjoyable and interesting course, are important to learning outcomes, and that students can gain agency through course feedback. However, I do not see the point of asking students whether they like me or to rate me as a teacher. Beckie Supiano's article is a good case in point.
Chronicle of Higher Education Teaching
October 17, 2024
From: Beckie Supiano
Subject: Teaching: What if you can’t remove the bias from course evaluations?
Stubborn bias
When a grassroots committee of faculty members at Hamilton College undertook a project to improve the way teaching was evaluated on campus, they believed it was important to include students’ perspectives on their courses.
But they also knew that student course evaluations have significant flaws. Among them: Gender and racial biases affect the way students evaluate their instructors.
So as part of the effort, some of the Hamilton professors studied strategies for getting rid of those biases. They focused on gender bias in particular because, with fairly even numbers of men and women in their sample of professors, they could get clear evidence. The sample did not have enough faculty of color to evaluate racial bias.
Their findings were recently published in a paper, “Can you mitigate gender bias in student evaluations of teaching? Evaluating alternative methods of soliciting feedback,” in Assessment & Evaluation in Higher Education.
Prior research on whether giving students information about implicit bias can reduce it has produced mixed results. That work has also focused on bias in quantitative survey questions, so the Hamilton group wondered whether a completely qualitative form might mitigate bias. They were also curious whether other changes to the course-evaluation process, like delaying when students completed the surveys, might help. Their goal was to collect evidence on whether such changes to student evaluations could, as they hoped, put a dent in bias.
The Hamilton professors designed a randomized, controlled trial to test whether these strategies would reduce bias against professors who are women. Forty professors teaching 210 courses volunteered for the study. The students evaluating them were randomly assigned to one of three groups. Students in the control group completed a standard evaluation, with both quantitative and qualitative questions, near the end of the semester. One treatment group completed an alternative assessment at the semester's end with open-ended, reflective questions that included an explanation of bias and how to avoid it. The second treatment group completed the same alternative assessment, but not until the start of the following semester.
Students’ responses were then reviewed by external readers with expertise in the evaluation of teaching, who rated how positive, constructive, and specific students’ feedback was, and noted any mention of the professors’ identities.
There were differences among students’ evaluations based on which group they were in. The control group, which was asked more directed questions, gave the most specific feedback. The second treatment group, which evaluated their courses later, provided more positive feedback. But neither intervention significantly reduced bias against professors who are women. In each of the three scenarios, women were consistently scored lower on average than men.
“Even though it’s in a sense disappointing we didn’t find a way to reduce the bias, I think it’s a really good reminder to people that it’s really hard to get rid of,” said Ann Owen, the lead author of the paper and a professor of economics at the college, who also chaired the faculty committee.
Here’s another problem: The external readers, who were asked to look for markers of instructors’ identities, found few. So while the evaluations were biased, those biases didn’t present themselves in a way that’s obvious to a reader — even an expert one.
So where does that leave things? Student evaluations are biased, and those biases appear to be stubborn. Just knowing this, though, isn’t especially helpful to professors when they’re using evaluations for their typical purpose: determining which professors to promote, offer tenure, or, in the case of adjuncts, keep on for the next term. Did a particular professor get negative feedback because of bias, or because of ineffective teaching? Reading the evaluations can’t answer that.
All evidence is flawed, Owen points out. It helps not to rely completely on one kind. There's a place for student perspectives in the evaluation of teaching, but their comments should be one part of the story. Other evidence, like professors' self-reflections and peer review, should also be used. Drawing on multiple sources can illuminate whether negative student evaluations point to a problem with a professor's pedagogy or treatment of students, or whether students are perhaps penalizing a woman who teaches a big course and makes it hard to earn an A.
And that, indeed, is the direction Hamilton has decided on. A few years ago, it passed new tenure and promotion guidelines, and departments are working to revise their own accordingly. (Owen’s department, economics, has completed its revision.) The guidelines ask departments to describe in specific terms what good teaching means to them, and what evidence they will use to identify it. They also push departments to use multiple forms of evidence when possible.
I am very happy that CSU has decreased its dependence on student evaluations for promotion, tenure, and retention decisions. I clearly remember discussions centering on the proportion of students who rated a faculty member above a 3.0 as an instructor. We now use a broad range of measures, though I do wonder whether the same is true for deciding teaching awards. I hope you find these posts helpful for your teaching. As always, I appreciate your questions, comments, and feedback on this and other teaching-related topics. Happy week 11.
Cheers, Paul