Now, before anyone breaks into their best Edwin Starr impression, let me preface this by saying that I don’t think the answer is “absolutely nothing.” But I do think we need to consider what student evaluations are really measuring, because the evidence suggests that they are not the best metric of teaching skill.
Before I go any further, I’d also like to recognize that these biases are not at all limited to gender. Most of the research so far has focused on gendered bias in evaluations, but we know that implicit racial biases also play a big role, and women of color are particularly vulnerable, since they are affected by both gendered and racial biases.
At the bottom of this post, there is a long list of recent studies and reports on this topic, if you’re interested in more reading. Many thanks to Rebecca Kennedy, a professor at Denison University and a member of the steering committee for the Women’s Classical Caucus, who compiled this list of studies, which paints a detailed (albeit bleak) picture of the role that student evaluations play in shaping the make-up of academic departments. A brief sample of their conclusions should suffice here, since the details vary but the overall message is largely the same across disciplines (all emphases below are added):
- “We document a negative effect of being a female teacher on student evaluations of teaching, which amounts to roughly one fourth of the sample standard deviation of teaching scores. Overall women are 11 percentage points less likely to attain the teaching evaluation cut-off for promotion to associate professor compared to men…Our results are suggestive of a gender bias against female teachers and indicate that the use of teaching evaluations in hiring and promotion decisions may put female lectures at a disadvantage.” (Wagner, Rieger, and Voorvelt 2016)
- “In our experiment, assistant instructors in an online class each operated under two different gender identities. Students rated the male identity significantly higher than the female identity, regardless of the instructor’s actual gender, demonstrating gender bias. Given the vital role that student ratings play in academic career trajectories, this finding warrants considerable attention.” (MacNell, Driscoll, and Hunt 2015)
- “Academic excellence is allegedly a universal and gender neutral standard of merit…We challenge the view that the academic world is governed by the normative principle of meritocracy in its allocation of rewards and resources…we argue that academic excellence is an evasive social construct that is inherently gendered. We show how gender is practiced in the evaluation of professorial candidates, resulting in disadvantages for women and privileges for men that accumulate to produce substantial inequalities in the construction of excellence.” (Van den Brink and Benschop 2012)
To my mind, a few key points jump out here. The first is that female instructors are expected to be more nurturing and to offer greater interpersonal support. In the same way that women are often expected to perform more of the service work in academic departments, gender norms also predispose our students to think that women will be warmer and more accessible than their male counterparts. I highly doubt that most of our students are aware of this, but implicit bias and stereotype threat play a very real role in the classroom. Not only can they affect our classroom environment and (eventually) our teaching evaluations, but they can also negatively affect our research. Imagine two equally dedicated teachers, one male and one female. Both are committed to helping their students learn and are willing to “go the extra mile” to do so. However, because the female instructor’s students expect her to be more available and more nurturing, they are willing to make requests of her that they wouldn’t make of her male counterpart. Or, in another variation on this unfortunately common theme, a female instructor holds the same gendered expectations of herself, and she spends more time and effort than she should scheduling extra office hours and making extra resources for her students.
Why is this such a problem? Well, for graduate students, there are two problems:
- Graduate students at Michigan have a contract. Part of that contract states “a one-half employment fraction normally requires a probable weekly time commitment of sixteen and one-half to twenty hours per week.” If we work beyond what we are contracted for, we are working unpaid overtime, and pressuring our colleagues to do the same.
- We need to finish our dissertations. Time spent teaching is time not spent researching or writing. As much as many of us wish that teaching were our primary (or even only!) responsibility as graduate students, it really isn’t, and we need to make progress toward graduation before our funding runs out.
But there’s one additional problem, which relates to these, and it’s a doozie: if evaluations don’t accurately reflect the time and effort that we put into teaching, because of a gender bias, then men can get comparable evaluations with significantly less effort. The logical extension of this is that with the same time and effort, men will receive much more positive evaluations than their female counterparts. Without some major changes, evaluations will never be a fair metric for evaluating instructors.
This brings us to a very big problem: student teaching evaluations are often the primary method used to evaluate teaching quality, which means that they can carry a great deal of weight in hiring decisions as well as in tenure and promotion cases. These biases in student evaluations have an effect, then, on the gender makeup of academia. They play a role in shaping higher education, and they can serve to propagate the same implicit gendered expectations for future generations of academics. After all, if evaluations favor male instructors, then men are that much more likely to get tenure, based on the ostensible strength of their teaching. That means that students will continue to think of “professor” as a male-gendered profession, these stereotypes will continue unimpeded, and the face of academia will not change in any substantive way.
The other major issue here is that while evaluations have been shown to be biased, they have not been shown to correlate with good teaching. As a recent study’s title puts it, “Student evaluations of teaching (mostly) do not measure teaching effectiveness.” From another study: “Student evaluations of teaching (SET) are strongly associated with the gender of the instructor. Female instructors receive lower scores than male instructors…But SET are not strongly associated with learning outcomes.” (Boring, Ottoboni, and Stark 2015). Stark, one of the researchers on that study, summarized: “Our analysis would support an argument that the use of SET has adverse impact on female instructors, at least in the two settings we examined.” Stark suggested that, if changes were not made for other reasons, lawsuits would likely bring about the end of student evaluations as a consideration for tenure or promotion.
As a final note, I’d like to mention a New York Times article whose title nicely sums up what’s at stake in discussions of student evaluations: “Is the Professor Bossy or Brilliant? Much Depends on Gender.” It took as its starting point an interactive graph that Benjamin Schmidt made using student reviews from RateMyProfessor.com. Among the findings, as Claire Cain Miller noted for The New York Times, was that “Men are more likely to be described as a star, knowledgeable, awesome or the best professor. Women are more likely to be described as bossy, disorganized, helpful, annoying or as playing favorites. Nice or rude are also more often used to describe women than men.”
So, is there nothing good or salvageable about teaching evaluations, if they’re this deeply permeated with unconscious biases? Funny you should ask…That’s exactly what I’ll be talking about in the exciting Part 2 of this blog post!
Studies on bias and teaching evaluations
- Bennett, Sheila K. “Student perceptions of and expectations for male and female instructors: Evidence relating to the question of gender bias in teaching evaluation.” Journal of Educational Psychology 74, no. 2 (1982): 170.
- Boring, Anne. “Gender biases in student evaluations of teachers.” Document de travail OFCE 13 (2015).
- Boring, Anne, Kellie Ottoboni, and Philip B. Stark. “Student evaluations of teaching are not only unreliable, they are significantly biased against female instructors.” Impact of Social Sciences Blog (2016).
- Carnes, Molly, Patricia G. Devine, Linda Baier Manwell, Angela Byars-Winston, Eve Fine, Cecilia E. Ford, Patrick Forscher et al. “Effect of an intervention to break the gender bias habit for faculty at one institution: a cluster randomized, controlled trial.” Academic medicine: journal of the Association of American Medical Colleges 90, no. 2 (2015): 221.
- Centra, John A., and Noreen B. Gaubatz. “Is there gender bias in student evaluations of teaching?” Journal of Higher Education (2000): 17-33.
- MacNell, Lillian, Adam Driscoll, and Andrea N. Hunt. “What’s in a name: exposing gender bias in student ratings of teaching.” Innovative Higher Education 40, no. 4 (2015): 291-303.
- Saul, Jennifer. “Implicit bias, stereotype threat, and women in philosophy.” Women in philosophy: What needs to change (2013): 39-60.
- Van den Brink, Marieke, and Yvonne Benschop. “Gender practices in the construction of academic excellence: Sheep with five legs.” Organization 19, no. 4 (2012): 507-524.
- Wagner, Natascha, Matthias Rieger, and Katherine Voorvelt. “Gender, ethnicity and teaching evaluations: Evidence from mixed teaching teams.” ISS Working Paper Series/General Series 617, no. 617 (2016): 1-32.