
Amelia, it was just a false alarm

Nah, jet fuel can’t melt steel beams / I’ve watched enough conspiracy documentaries – Camp Cope

Some ideas persist long after the mounting evidence against them becomes overwhelming. Some of these things are kooky but probably harmless (try as I might, I do not care about ESP etc), whereas some are deeply damaging (I’m looking at you “vaccines cause autism”).

When these ideas have a scientific (be it social or physical) basis, there’s a pretty solid pattern to be seen: there is a study that usually over-interprets a small sample of data and there is an explanation for the behaviour that people want to believe is true.

This is on my mind because today I ran into a nifty little study looking at whether or not student evaluation of teaching (SET) has any correlation with student learning outcomes.

As a person who’s taught at a number of universities for quite a while, I have some opinions about this.

I know that when I teach my SET scores better be excellent or else I will have some problems in my life. And so I put some effort into making my students like me (Trust me, it’s a challenge) and perform a burlesque of hyper-competence lest I get that dreaded “doesn’t appear to know the material” comment. I give them detailed information about the structure of the exam. I don’t give them tasks that they will hate even when I think it would benefit certain learning styles. I don’t expect them to have done the reading*.

Before anyone starts up on a “kids today are too coddled” rant, it is not the students who make me do this. I teach the way I do because ensuring my SET scores are excellent is a large part** of both my job and my job security. I adapt my teaching practice to the instrument used to measure it***.

I actually enjoy this challenge.  I don’t think any of the things that I do to stabilize my SET scores are bad practice (otherwise I would do it differently), but let’s not mistake the motives.

(For the record, I also adapt my teaching practice to minimize the effects of plagiarism and academic dishonesty. This means that take-home assignments cannot be a large part of my assessment scheme. If I had to choose between being a dick to students who didn’t do the readings and being able to assess my courses with assignments, I’d choose the latter in a heartbeat.)

But SETs have some problems. Firstly, there’s increasingly strong evidence that women, people of colour, and people who speak English with the “wrong” accent**** receive systematically lower SET scores. So as an assessment instrument, SETs are horribly biased.

The one shining advantage to SETs, however, is that they are cheap and easy to measure. They are also centred on the student experience, and there have been a number of studies that suggest that SET scores are correlated with student results.

However, a recent paper from Bob Uttl, Carmela A. White, and Daniela Wong Gonzalez suggests that the correlation observed in these studies is most likely an artifact of small sample sizes.

The paper is a very detailed meta-analysis (and meta-reanalysis of the previous positive results) of a number of studies on the correlation between SET scores and final grades. The experiments are based on large, multi-section courses where the sections are taught by multiple instructors. The SET score of the instructor is compared to the student outcomes (after adjusting for various differences between cohorts). Some of these courses are relatively small and hence the observed correlation will be highly variable.

The meta-analytic methods used in this paper are heavily based on p-values, but are also very careful to correctly account for the differing sample sizes across studies. The paper also points out that if you look at the original data from the studies, some of the single-study correlations are absurdly large. It’s always a good idea to look at your data!
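To get a feel for why small sections matter, here is a minimal simulation (my own illustrative sketch, not the paper’s analysis): even when SET scores and section grades are generated independently, so the true correlation is exactly zero, the sample correlation across a handful of sections is often large in magnitude.

```python
# Illustrative sketch only: how often does |correlation| exceed 0.5
# when SET scores and grades are actually independent?
import random
import statistics

def sample_correlation(xs, ys):
    """Pearson correlation of two equal-length lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

random.seed(1)
for n_sections in (5, 20, 100):
    rs = []
    for _ in range(2000):
        # Independent draws: the true correlation is zero by construction.
        set_scores = [random.gauss(4.0, 0.5) for _ in range(n_sections)]
        grades = [random.gauss(70.0, 5.0) for _ in range(n_sections)]
        rs.append(sample_correlation(set_scores, grades))
    frac_big = sum(abs(r) > 0.5 for r in rs) / len(rs)
    print(f"{n_sections:3d} sections: fraction with |r| > 0.5 = {frac_big:.2f}")
```

With only a few sections, correlations above 0.5 in magnitude appear a sizeable fraction of the time by chance alone, which is why a single small study reporting a large SET–grade correlation is weak evidence.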

So does this mean SETs are going to go away? I doubt it. Although they don’t measure the effectiveness of teaching, universities increasingly market themselves based on the student experience, which is measured directly. And let us not forget that in the exact same way that I adapt my teaching to the metrics that are used to evaluate it, universities will adapt to metrics used to evaluate them. Things like the National Student Survey in the UK and the forthcoming Teaching Excellence Framework (also in the UK) will strongly influence how universities expect their faculty to teach.


*I actually experimented assuming the students would do the reading once when I taught a small grad course. Let’s just say the students vociferously disagreed with the requirement. I’ve taught very similar material since then without this requirement (also with some other changes) and it went much better. Obviously very small cohorts and various other changes mean that I can’t definitively say it was the reading requirement that sunk that course, but I can say that it’s significantly easier to teach students who don’t hate you.

** One of the things I really liked about teaching in Bath was that one of the other requirements was to make sure that the scatterplot of a student’s result in my class against the average of their marks in their other subjects that semester was clustered around the line y=x.

***I have unstable employment and a visa that is tied to my job. What do you expect?

**** There is no such thing as a wrong English accent. People are awful and students are people.


  1. Paul Alper says:

    I believe this is the 50th anniversary of the Vietnam War, one of whose products was SET, student evaluation of teaching. Previous to that war’s unspeakable disaster, student evaluation of teaching was more or less an unthinkable thought. Because of the opposition to the Vietnam War, students demanded all sorts of other things from universities; but the one item that stuck was SET because it is a useful device for administrators who can thus claim to be honest, disinterested scorekeepers.

  2. Andrew says:


    1. I have mixed feelings on all this. On one hand, sure, there’s the fallacy of measurement and the problems with taking student evaluations too seriously. On the other hand, I remember what it was like to be a student, and it was (a) wonderful to have an official opportunity to give feedback on classes, and (b) at times useful to find out what other students thought of different teachers. So, just because it turns out that numerical averages of teacher evaluations are not predictive of student outcomes, I disagree with your (implicit) claim that these evaluations should “go away.”

    2. I don’t know if there’s a “wrong” accent, but, for the purpose of working with a particular group of students, some accents can certainly be worse than others! If you can’t understand what the teacher is saying, that’s a problem!

    • Ben says:

      “at times useful to find out what other students thought of different teachers”

      Students have access to the results of these reviews? The only place I know to go is RateMyProfessor, which usually only has a couple dubious reviews. Otherwise it’s a coinflip and you just get who you get unless you know someone who’s already taken the class.

      • D Kane says:

        > Students have access to the results of these reviews?

        This varies significantly by institution. At Harvard, students can see all past reviews, unless a faculty member actively prevents distribution (which is rare and, of course, sends its own signal). At Williams, students see nothing.

        • Martha (Smith) says:

          My university does make “course instructor survey” results available to students. In addition, student organizations often organize “slam tables” at registration time. These are tables set up in a public place on campus that are covered with paper “tablecloths” on which students can write comments about professors and courses. They give students the opportunity to make (and see) comments that might not be addressed by the course instructor surveys.

    • yyw says:

      Second the accent comment. Accents are associated with levels of English communication skills. English is a foreign language to me and I have a strong accent. From my experience both teaching and as a student, accents could have a negative effect on teaching effectiveness. This could depend on student population too. For example, to me, a Scottish or even an English accent is the “wrong” accent while an American (but not southern) accent would be ideal.

    • Mikhail says:

      I was learning Measure Theory from a French professor with a strong accent. For the first few lectures I thought “miserable function” was a thing.

    • David Austin says:

      There are different kinds of uses to which results of student satisfaction snapshots (student evaluations) are typically put; four uses that come to mind are: (i) as a component of high-stakes employee assessment (e.g., hiring, firing, promotion, salary); (ii) to provide administrators with difficult-to-gather evidence about instructor behavior (e.g., was the instructor habitually late for class, did the syllabus specify any required text(s)); (iii) to inform instructors about student reactions; and, if results are made public, (iv) to inform other students about student reactions. (The latter is less common in part because in some states, the results for instructors at state universities are part of personnel files that are, by state law, confidential.)

      The published research shows that student satisfaction snapshots are likely to be misused in connection with (i). A useful overview of relevant research is provided at Dennis E. Clayson’s web site, Student Evaluation of Teaching.
      [In the US, the US EEOC User Guidelines for Employee Screening, reflecting court cases on disparate impact (29 CFR 1607.5, General standards for validity studies, and 29 CFR 1607.14, Technical standards for validity studies), may imply that the results of student satisfaction snapshots are not sufficiently trustworthy to be used legally in high-stakes employee assessments, but there appears to be no relevant case law as yet.]
      (Institutions of higher education have academic integrity policies and usually make some effort to enforce them. Do they have policies that lead them actually to check student satisfaction snapshots for satisficing or lying? Do they ever announce the results of those checks? Provide them to instructors? I’ve not heard of any examples of any of those.)

      Student satisfaction snapshots are not reliable sources for (ii), as research by Stephen Porter, William Standish, and Nicholas Bowman (among others) indicates.

      As, e.g., Valen Johnson (Grade Inflation, 2003) and Talia Bar et al. (2009)
      found, making course/instructor grade distributions available to students had a significant effect on enrollments at Duke and Cornell, increasing enrollments in courses with higher grades and expanding majors with more such courses. It seems clear that making the results of student satisfaction snapshots public would have a similar effect. The title of Johnson’s book is apt because a major factor in determining departmental budgets is enrollment. (iv) is thus problematic. (But wouldn’t I want to know my surgeon’s success rate?! No, not unless I also had reliable information about the kinds of patients on whom she performed surgery, why she lost that malpractice case, etc. Would most students be careful with summaries of student satisfaction snapshots? Are students typically able to tell how accurate they are in determining the ability of fellow students to judge instructors? How much damage will be done to students’ educations when they misuse course and instructor information in what are predictable ways? I also doubt that faculty members’ judgments – made when they were students – about their instructors would be representative of the judgments made by most students about most instructors.)

      That would leave (iii) of the four. As long as instructors understand that the reactions summarized by student satisfaction snapshots should have a limited effect on teaching, (iii) seems an unobjectionable use and could be a good one. Many instructors must already deal with racist, sexist or other sorts of vicious remarks in student satisfaction snapshots.

      Student satisfaction snapshots are not going away. (Were they to go away, they might be replaced by something worse.) It is worth limiting any misuse of the results. (i), (ii) and (iv) appear on balance to be misuses.

      In the meantime:

      Robert J. Youmans and Benjamin D. Jee, “Fudging the Numbers: Distributing Chocolate Influences Student Evaluations of an Undergraduate Course,” Teaching of Psychology v34 n4 (2007) 245-247.

      Chocolate cookies have been found similarly efficacious (and there is some evidence that chocolate may not be necessary but why take chances?).

      • Florian Wickelmaier says:


        Thanks for these references. I agree that the term “student satisfaction snapshots” is more descriptive than the term “student evaluation.”

        This blog post by Arthur Poropat (Students don’t know what’s best for their own learning) refers to two studies that looked at later student performance. Among his conclusions:

        “Students don’t recognise learning … students assess whether they have learnt something based on the ease with which they complete a related task.”

    • Dan Simpson says:

      Yes. That they’re probably not measuring how well a student learned as much as some dimensions of how a student felt while they were learning doesn’t make them useless. But they’re messy instruments to use for official decision making.

      Personally I find the free comment fields the most useful, which are not covered by this paper.

  3. Emmanuel Charpentier says:

    I seem to remember something close, with some consequences, in the “distant past”…

  4. yyw says:

    I am always wary of these systematic bias claims. To make these particular claims, you need to show that teachers belonging to certain groups receive lower scores all else being equal (course subject, student population, true teaching performance, inter-personal skill, etc.). This task seems impossible without devoting a huge amount of resource.

    • D Kane says:

      Exactly. I read Dan as suggesting that, a priori, he knows that teaching ability is uncorrelated with gender so any correlation which shows up must be evidence of “bias.” But lots of things are, truly, correlated with gender! Why wouldn’t teaching ability be one of them?

      • Mikhail says:

        Gender prejudice affects a lot of things. Why wouldn’t SET be one of them?

        Given what I know about the human body and human society, I would put my prior on an explanation by systemic bias rather than genetics.

    • Ben Prytherch says:

      One way to do this is to take an online class with multiple sections run by the same instructor whom the students never see or hear, and have the instructor assume a male name for one section and a female name for the other. An example:

      • yyw says:

        A study with 43 completed subjects (10% missing excluded), divided into 4 discussion groups, using extremely questionable methodology to get significance at the 0.1 level for the primary measure. I don’t know how it got past peer review, and yet it has been cited 171 times in 3 years.

      • Martha (Smith) says:

        An anecdote showing how strange things can be:

        My first year teaching, I taught one of three largish sections of the same course. All three sections had the same exams, graded jointly by the three instructors. My section was one of two sections that met at the same time. After a while, I noticed that I had a number of students attending my class who were registered for the other section meeting at the same time. I learned later that the instructor of that section got better teaching evaluations than I did.

        • David Austin says:

          M. Oliver-Hoyo “Two Groups in the Same Class: Different Grades.” Journal of College Science Teaching, 38(1) (2008) 37-39.

          recounts the experience of an instructor (then an associate professor) who taught two sections in the same classroom at the same time and received significantly different evaluations from the two sections. One of the matters about which there was disagreement was how available the instructor was outside of class. All students were informed in the same way about the instructor’s office hours, she reports having kept them, and many students made use of them. Those who received lower grades in the course tended to rate the instructor as significantly less available during office hours and they gave the instructor lower ratings overall.

          (The author’s site is but I did not find a readily accessible online source for the paper.)

          I’ve heard of a case in which students were asked to rate the lab assistant for a course with no associated lab sections and thus no lab assistants. (A data entry error resulted in generation of a set of online evaluation forms for a non-existent lab section.) The non-existent lab assistant reportedly fared about as well as the instructor.

          Undergraduates at larger state universities take a standard course load of five courses each semester, some with lab sections attached, and a large minority if not a majority work at jobs off-campus during the semester. (Having a job on campus that takes 20 hours or less weekly does not seem to result in reduced GPAs. Having a job off-campus for over 20 hours weekly does seem to make things much harder on students.) Rating forms typically have 15-45 items, and online delivery is also typical. It’s not surprising that some respondents have trouble keeping clearly and firmly in mind what happened throughout the semester in each course (and any associated lab sections), especially given that ratings are done near the end of semesters when workload tends to increase (in part, but only in part, because of procrastination).

      • Jacob says:

        Though this isn’t perfect, since the gender bias may not be because the instructor’s gender is known but because the instructor is punished for seeming to conform to gender norms or fitting some kind of stereotype. A simple version of this not captured by the aforementioned design is that a man’s voice may sound more intelligent/authoritative/etc. to the student’s ear while a woman’s voice may not, even if the content is the same. These are tricky things to measure.

      • David Austin says:

        The results of this relatively small study were re-analyzed by Philip Stark using non-parametric permutation methods. The re-analysis showed larger differences (than in the original analysis) between ‘male’ and ‘female’ instructors and stronger evidence of anti-female bias.
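        A permutation test of this kind can be sketched in a few lines (a hedged illustration: the ratings below are made-up numbers, not the study’s data, and the actual re-analysis was more involved): shuffle the ‘male’/‘female’ identity labels many times and count how often a mean difference at least as large as the observed one arises by chance.

```python
# Hypothetical per-student overall ratings (1-5) for two online sections
# of the same course; made-up numbers for illustration only.
import random
import statistics

male_identity = [5, 4, 5, 4, 4, 5, 3, 5, 4, 4]
female_identity = [4, 3, 4, 4, 3, 5, 3, 4, 3, 4]

observed = statistics.fmean(male_identity) - statistics.fmean(female_identity)

# Permutation null: if the identity label is irrelevant, every
# relabelling of the pooled ratings is equally likely.
pooled = male_identity + female_identity
n = len(male_identity)
random.seed(0)
n_perm = 10_000
extreme = 0
for _ in range(n_perm):
    random.shuffle(pooled)
    diff = statistics.fmean(pooled[:n]) - statistics.fmean(pooled[n:])
    if abs(diff) >= abs(observed):
        extreme += 1

print(f"observed difference: {observed:.2f}")
print(f"two-sided permutation p-value: {extreme / n_perm:.3f}")
```

        The attraction of the permutation approach is that it makes no normality assumptions about the ratings, which matters for small, discrete, bounded scores like these.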

  5. D Kane says:

    > This means that take-home assignments cannot be a large part of my assessment scheme.

    Can you (or someone else) elaborate? I am planning to teach a course in which all the assessments (problem sets, take-home exams and final project) are done outside of class. Should I be more worried about plagiarism? (I teach at an elite US private university, to the extent that this matters.)

    • Alex says:

      100% you should, yes. I’ve had students plagiarize from each other and from other sources (books, internet, etc.). I’ve had it happen at an elite private university and at a lower-level public school. Kids want to do well, or at least pass, and they will cheat. Depending on the topic, they’ll plagiarize because they don’t know it’s plagiarism, or that plagiarism is wrong. It doesn’t happen often, but it happens.

      • D Kane says:

        Advice on dealing with this, both in terms of preventing its occurrence and catching it when it occurs?

      • Dale Lehman says:

        If plagiarism is really what you want to focus on, then sure it is 100% yes that you should be concerned with that. The day that my focus turns to that will be when I finally retire. There is far too much else that is important in teaching. In-class exams (and worse yet, closed book exams) are not good ways to test students. They prioritize particular types of knowledge and particular types of abilities – not many of us really need to think on our feet under a strict time constraint. Take home exams provide a much better platform for learning and evaluation, in my mind. To protect against plagiarism, you can make the assignments novel and challenging enough that it is hard to use someone else’s work. And, if you still want to worry about it, you can have every student come for a short discussion about their answers – I think that will cure the problem (but will take considerable time – but with a payoff, I think).

        On the topic of SETs, I have a unique perspective, having earned both very high ratings and teaching awards, as well as some of the lowest ratings ever received. The strange thing is that for the first 10 years of my teaching career I taught at 10 different universities (one year temporary contracts) and my ratings went straight downhill every year as I became a better teacher. I then learned how to play the game better and get somewhat average ratings (by averaging a bimodal distribution of students) in order to protect job security. But there are a few observations I feel strongly about:

        1. SETs from undergraduate and graduate students are quite different. More generally, the types of students will fundamentally affect what is rewarded in SETs.
        2. I find the numerical ratings close to worthless – only the written comments I find meaningful (and they are usually quite revealing).
        3. I’ve taught at schools where almost all of my colleagues got nearly perfect ratings while I could not match those. But when I looked at their evaluation forms (I’m old – they were on paper in those days), virtually no students wrote anything on the forms. In other words, the instructor was perfect but the students had nothing to say about them. I highly discount such evaluations.
        4. Some evaluations are fraught with spelling and grammatical errors – these I discount as well.
        5. Regardless of the flaws in SETs, they are better than nothing. They are feedback and all feedback is important for teaching. The problem comes when they are used to evaluate job performance. I would say that a few peer evaluations (with classroom observations) are far more important than insisting on a “good” set of numbers from the SETs. That so few institutions do this well is testament to more serious problems with university teaching. I’ve even had obligatory classroom observations from colleagues who sat in the back of the room grading papers while I was teaching.

        After 40 years of teaching, I’ve become quite skeptical about the profession. Many of us take it seriously, but overall the system works very poorly. Little real improvement takes place except by accident. The measures we use often get in the way of improving – getting better scores is not the same thing as becoming a better teacher. But, on the other hand, we utilize all of our “researcher degrees of freedom” when interpreting the scores that we receive. The environment in which we work and are evaluated is just as poor as the research environment – poor incentives, unproductive peer rivalry, and increasing external oversight (assessment and accreditation, anyone?) that is unproductive.

        We should not view SETs independently of the overall teaching enterprise.

        • Radford Neal says:

          I quite agree – the first priority in teaching is to teach the material. For almost all topics, doing take-home assignments is crucial to actually learning the material. So such assignments should definitely be set. With very disciplined students, they could be non-credit, submitted only so the marker can provide helpful comments. But the reality is that most students are not that disciplined, and will do them only if they count for a non-trivial part of the course grade – this is so even for students who are sincerely trying to learn.

          There are also students who are not sincerely trying to learn, but for some reason think that passing the course without learning anything is of benefit to them. These are sad cases, both for themselves and for anyone hiring them later, but the existence of such students should not lead to the course being degraded to the disadvantage of sincere students.

          Regarding course evaluations, the comments are indeed the most interesting part. One comment I received recently was “There is nothing stupid about this course”, which I interpreted as being less about my course than about some others they must have taken.

          • AllanC says:

            In your second paragraph it appears to me you implicitly assume that for any given student, performing the coursework as intended confers a greater chance of achieving true understanding of the material than not. I’m not entirely convinced this is the case. Though, admittedly, my lack of conviction is heavily weighted on my personal experience.

            • Radford Neal says:

              Sure. There can be bad instructors who have entirely mistaken ideas about how to teach the material. There’s no magical cure for that.

              But maybe you mean that the instructor’s approach might be generally helpful, but not for one particular student? So that student would benefit by not doing the assignment, and doing something else instead? I don’t think this is a common occurrence, in the sense that the student would have reason to be confident ahead of time that something other than the assigned work would be substantially better for them. (Remember, this is a student who presumably doesn’t already know the material, so how do they know how to teach it?)

              When it does happen (eg, when a student is much more talented than the others, and wants to do something more challenging) the instructor is likely to agree to a special arrangement.

        • Martha (Smith) says:

          ” SETs from undergraduate and graduate students are quite different. More generally, the types of students will fundamentally affect what is rewarded in SETs.”

          What I have observed is that (not horribly surprisingly) CIS’s (Course Instructor Surveys) tend to be on average lowest for large freshman courses, a little higher for second year courses, a little higher for upper division courses taken mostly by majors, a little higher for beginning graduate courses, and highest for small graduate topics courses. (I even once got a perfect “grade” for one such course.)

          • Dale Lehman says:

            My conclusion regarding my own experience is that self-selection is much stronger than people realize. Once you are at a school for even a second year, your reputation gets around and students begin to select instructors they will like (for whatever reason). Since I was teaching at 10 different schools for 10 years, I got a truer measure of how students perceived my efforts – and it was not a pretty sight. But I knew I was a better teacher – I knew virtually nothing when I started and thought I had become a pretty good teacher by the time these 10 years had passed.

    • AllanC says:

      How worried you should be depends on your expectations for your students’ conduct.

      With any takehome assignment there is a near certainty that: some will perform the work solo, some will attempt to find the solution online, some will use assignments passed down from previous years as a reference, some will form groups and hash it out together, some will outright copy from completed assignments of other students.

      What percentage do what is unknown and will depend on the type of course you are teaching. For example, engineering assignments with only one correct solution are very likely to be entirely copied from past years’ assignments (if the same, and most are usually on a 5-year rotation) or copied from other students.

      If you believe it’s important for their learning that the assignments be done without aid, then I’d say you should be worried. If you believe the process of figuring out how to get things done by itself is a useful skill to teach, then I’d say don’t worry about it.

    • Ethan Bolker says:

      I concur with the other replies pointing out that open ended work without time pressure is an important part of making learning both possible and testable. I don’t want to deprive honest students of this opportunity simply to police those (few?) who will cheat. Here’s my start-of-semester handout:

      (which contains a link back to postings on this blog).

    • Robin Morris says:

      I gave a take-home assignment last quarter. I spent time in class explaining how I wanted them to do it individually. I spent time reminding them that I had failed a number of students earlier in the quarter for cheating on the midterm. I put a line at the top of the assignment where they had to sign that the work they were submitting was their own.

      Four hours later, the assignment had been posted to Chegg.

      [And in an “I couldn’t have written this better if I’d tried”, the solutions that were posted to chegg were wrong, in a couple of very distinctive ways….]

  6. Greg Snow says:

    This article in Chance magazine from several years ago talks about this:

    Valen E. Johnson (2002) Teacher Course Evaluations and Student Grades: An
    Academic Tango, CHANCE, 15:3, 9-16

    This was more than just a correlation study. They found that the biggest indicator of SET was the student’s anticipated grade (what they thought they would get, not what they actually earned/received), with higher anticipated grades resulting in more favorable SETs.

    I think it is mentioned in that article, but detailed elsewhere, that actual learning (based on an independent test after the course rather than grade) had a negative correlation with the anticipated grade (expecting a lower grade led to more effort and learning more). This would suggest that lower SETs would correspond to more actual learning (but making that a preferred outcome for promotion would make gaming the system even easier).

  7. Two things haven’t been mentioned about SET yet, so I thought I’d throw them out there:

    1) Teachers who teach in such a way that the students do better on follow-on courses routinely get *worse* evaluation scores. Making students learn things doesn’t make them happy in the way that they will give high scores. This research has been done, google it, I haven’t had breakfast yet (or maybe wait and I’ll post citations later).

    2) Response rate: both to attendance of the class, and filling out the surveys. Students will routinely fail to attend class and then fill out the surveys (usually with terrible reviews), and other students will routinely attend the class and participate, and then not bother to attend the survey days (too busy studying for final exams in other classes etc). The types of students involved are very different. Overall response rates are often something like 20 or 30% for largish especially multi-section classes, for example computer programming classes with 150 people in lectures and 6 sections of lab with 25 students each or the like.

    With these two issues at play, I think SET is total complete and utter garbage, and particularly the only thing that seems to count for administrative use: the average numerical scores.

    Comments can be useful, but basically only from students who had some idea what was going on. For example “Worst TA ever, I stopped going to class” is not exactly useful. As a grad student TA I frequently got totally diametrically opposed comments, such as “TA spends way too much time explaining the concepts and doesn’t work through enough problems” and “TA is extremely good at explaining the concepts and this helped me be able to work through problems easily” (not literal quotes, but representative of the kind of thing).

    • Martha (Smith) says:

      And then there was the student who said something like, “She spends too much time on the basics and not enough time on the fundamentals.” (Or was it “spends too much time on the fundamentals and not enough time on the basics”?)

    • Martha (Smith) says:

      “Teachers who teach in such a way that the students do better on follow-on courses routinely get *worse* evaluation scores. Making students learn things doesn’t make them happy in the way that they will give high scores.”

      Yup. But the rewards come when a student tells you a year or more later, “I didn’t like the way you taught the course at the time, but now I realize that I was much more prepared for the next course than the other students, so I’m glad you did teach the way you did.”

  8. Ben Prytherch says:

    I agree with the comments saying that the numerical results are useless but the written comments can be useful.

    One thing that SETs should do, but mostly don’t, is give students some recourse when they get stuck with a lousy teacher. We’ve all had them. So what can students do in this situation? The bold ones will email the department chair or college dean or whoever, and then sometimes the complaint will go to someone who might take it seriously. I don’t think most students are that bold though. So, the only thing they can officially do is write about their experience in an SET and hope that the instructor takes it seriously or that the instructor’s supervisor is also reading.

    Do lots of students blame their teachers for their own poor work? Sure. Do lots of students have unreasonable ideas of what “fairness” entails and what “effective teaching” entails? Yes. But it’s also true that lots of teachers are lazy or unprepared or inattentive or just not very good at teaching. The students in their classes know this. We should be providing these students some venue for sharing this information, which will help to improve teaching when it needs to be improved.

  9. Related to the statement that written comments are much more meaningful than numerical evaluation scores: Here at the University of Oregon, we’re piloting end-of-term evaluations *without* numerical scores. I think it’s great. If you want to see a sample form:

  10. Alas, Amelia, things change. It must have been when I saw the movie “About a Boy” that I realized, painfully, that Joni Mitchell had become the punch line to a joke.

  11. Jacob says:

    As a grad student at a research university who independently teaches for the department, the only evaluation of my instruction that I receive is the SEIs (student evaluation of instruction). It’s clear that the only part of those surveys that my superiors look at is the “overall” item, on a 1-5 scale. It’s of course not a continuous measure, although we do treat it that way by averaging. It’s 5 if you’re satisfied, 4 if you’re mostly satisfied, or 1 if you are mad (my interpretation, these are not labeled choices).

    To be clear, I could request other forms of evaluation from a university center for teaching development and I suspect my advisor or some others would be happy to sit in on a class or classes and give feedback, but in terms of hiring and firing it’s obvious that the only thing the people in charge care about is that I not be a negative outlier on the overall score. Well, they also would care if students brought complaints to the department.
