Friday, July 15, 2011

How do you evaluate your grad students?

One of the issues that I always struggled with as an academic -- and I know many other faculty struggle with -- is keeping grad students on track and giving them useful feedback to help them along in their careers. PhD students often get lost in the weeds at some point (or many points!) during grad school. Of course, part of doing a PhD is figuring out what you want to work on and doing things that might seem to be "unproductive" to the untrained eye. On the other hand, many PhD students grind to a halt, spending months or even years on side projects or simply doing nothing at all. One problem my own students often had was working super hard to submit a paper and then doing almost no new work for 2-3 months while waiting to get the reviews back.

When a student gets stuck in a rut, how do you help them out of it? How do you help students clear a path to productivity?

One thing that many PhD programs lack is any regular and formal evaluation of a student's progress. Harvard never did anything formal; I tried to get something going, but it did not last beyond one year -- not enough faculty cared to participate, and we couldn't agree on the process or desired outcomes. At Berkeley, every PhD student got a formal letter every year with a "letter grade" indicating how well they were doing, along with some optional comments from their advisor on their overall progress. Although that feedback could have been delivered informally, there was a psychological impact to the formal letter and the idea that all of the professors were meeting in a smoky room to talk about your standing. CMU has its infamous "Black Friday" where all the profs literally do get together to score the grad students. Not having been to CMU, I wonder how this was viewed by the students -- did they find this feedback valuable, stressful, or just plain annoying?

Although this kind of feedback can be useful, for many students it goes in one ear and out the other. I think that part of the reason is that there is often no penalty for doing poorly on a review -- about the only thing a PhD program can threaten you with is kicking you out, and most programs that I know of avoid that unless there's a case where a student has been totally unproductive for a period of several years. It's hard to get kicked out of grad school. By the same token there's little incentive to do well on a review: you're not going to graduate any sooner or get paid more. (Sidebar - should PhD programs pay high-performing grad students bonuses?)

The other issue is that these mechanisms are somewhat open loop in the sense that the student is not expected to lay out a plan and stick to it. Most PhD programs expect students to file some kind of formal plan of study or research leading towards their degree, but it is usually a matter of paperwork and is done just once, or maybe twice, during the course of the program, so it has almost no value to the student. My feeling is that students would benefit tremendously from a more frequent and formal planning process.

At Google, the approach we use for planning is based on OKRs, or "objectives and key results." Every employee and team is expected to come up with their OKRs for the coming quarter, and score the OKRs from the previous quarter in terms of how much progress was made towards each goal. This is an extremely useful process since it gets you thinking about what you need to do over the next 3 months (which seems to be about the right planning horizon for most activities) and you have the chance to reflect on your successes and failures of the previous quarter. It's not expected that you achieve 100% of your goals -- if you are doing so, then your OKRs were not ambitious enough -- you should be shooting for a grade of 70-80%.
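To make the scoring arithmetic concrete, here is a minimal sketch in Python of how a quarter's OKRs might be graded. Everything in it is an assumption for illustration: the class and function names, the unweighted average, the example objective, and the exact cutoffs at 0.7 and 0.8 are hypothetical and are not a description of any real OKR tooling.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class KeyResult:
    description: str
    score: float  # self-assessed progress: 0.0 (nothing done) to 1.0 (fully achieved)


@dataclass
class Objective:
    title: str
    key_results: List[KeyResult] = field(default_factory=list)

    def score(self) -> float:
        """Unweighted mean of the key-result scores (an assumed convention)."""
        if not self.key_results:
            return 0.0
        return sum(kr.score for kr in self.key_results) / len(self.key_results)


def assess(score: float, low: float = 0.7, high: float = 0.8) -> str:
    """Compare a score with the 70-80% target band (cutoffs are assumptions)."""
    if score > high:
        return "goals may not have been ambitious enough"
    if score < low:
        return "below target"
    return "on target"


# Hypothetical objective and key results, graded at the end of the quarter.
quarter = Objective(
    title="Get the prototype ready for a paper submission",
    key_results=[
        KeyResult("Finish the implementation", 1.0),
        KeyResult("Run the full evaluation", 0.5),
        KeyResult("Write a complete draft", 0.75),
    ],
)

print(f"{quarter.title}: {quarter.score():.2f} ({assess(quarter.score())})")
```

The point of keeping it this small is that the whole exercise should fit in a few bullet items: a short list of goals, a number next to each at the end of the quarter, and a quick check of whether the average lands in the intended range.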

I wonder if grad students wouldn't benefit from using something like OKRs for planning their research. A student should be able to say what they are doing over the next 3 months. Looking back on the previous 3 months and grading your progress tells you whether you are generally on track. Having quarterly OKR scores can also help advisors point out where the student needs to improve and document clear-cut cases where a student has been unproductive (something that both students and advisors are often in denial about). Thoughts?


  1. I'm a soon-to-be-finishing CMU PhD, and the faculty review/Black Friday feedback has generally been a good thing for me and most of the other students I know. For the most part, we do take the faculty evaluations seriously. As part of the process, we write a brief letter outlining our accomplishments for the semester & plans for next semester. Similar to OKRs, actually, but probably less formal. (For the past couple of semesters, mine have basically been "write thesis".) The faculty feedback gives us a letter grade, as well as outlining expectations for the next semester.

    This creates a reasonable process for formally communicating progress, informing students they're not meeting standards and formulating a plan for fixing things if necessary. For me, it assured me in the first couple years at CMU that I was actually doing OK, despite my lack of publications at the time.

  2. Hi Matt,

    While I am for this idea, I firmly believe the expectations of both the advisor and the student should be laid down "objectively". Practically, one can't expect a graduate student (at least in the initial years) to do things on his own if the mentor doesn't pitch in enough. I suggest keeping logs of meetings and specific outcomes (to which both the advisor and student agree). After all, it takes two to tango! :-)

    I would like to solicit your thoughts.

  3. "should PhD programs pay high-performing grad students bonuses?"

    The answer to this question can be only one: YES.
    The biggest problem I have experienced several times with my students is that if everybody is paid the same amount, the very motivated students achieving the best results will soon feel "over-exploited".
    As a result, they will inevitably tend to collaborate less actively in those activities (e.g. joint papers and joint projects) in which the other, lower-performing students are also involved ("Why should I work so much to help other people that do not care and that in the end are paid the same as me???").

  4. Good ideas. Regular feedback should be part of any program.

  5. As a current RI student at CMU I second Jon's comment. I personally see Black Friday letters as an opportunity to define future objectives as well as ask for and get feedback. Most people are prompted to do that before the meeting just so they don't end up with surprise comments.

    Mcenley, as far as I can tell it helps to have the reviews with all the faculty present precisely because others can step in if there are misunderstandings in the advisor/advisee tango. For example, even if an advisor does not believe their student is making a lot of progress, other faculty can speak up and vouch for the student.

  6. Trying to stick to predetermined objectives is a bad idea for graduate students. Sure, getting sidetracked for too long can be a problem, but in some cases it can be far more useful than continuing to work towards the original objectives. My original research project got unceremoniously dumped at the start of my third year when I talked to my supervisor and we agreed that the "side project" I had been working on for the past 6 months had enough results already to constitute a thesis.

  7. Colin - of course, that is the nature of research - the great thing about a 3-month planning window is that you can always course correct on each cycle or even in the middle of a cycle. The fact that plans might change is not a reason to not do any planning.

  8. Interesting discussion. I'll contribute a different perspective: I am a computational biology PhD student at an Ivy.

    I think graduate students are already over-burdened with needless evaluations. For instance, my program requires yearly committee meetings, which means that every June I stop all experiments, kill my cells, store DNA constructs away, and then work on writing a 10-15 page document on my research which is presented to the committee along with an hour-long presentation on the work done in the previous year.

    I've been through this process 3 times since I passed quals, and it has been utterly useless each time. For starters, my evaluators never bother to read even a full page of the document. Secondly, they tend to be dismissive of work that is at the planning stage; for instance, I was advised by two faculty members to stop working on x because they thought I was going down a completely blind alley; of course, after the work got accepted by a pretty good journal, they became extremely supportive...

    It would be interesting to see a study comparing periodic student evaluations to final outcomes... I have a feeling that the correlation would be strong, but negative.

    Also, I think paying bonuses to some students is a horrible idea -- it will only encourage more ass-kissing, which some students are great at... The best thing about being a graduate student is being able to take risks, and introducing bonuses will inevitably steer bright students into boring, safe endeavors.

  9. Normally I'm a big proponent of market-based incentives, but I'm not sure it makes sense to pay grad students a bonus for high achievement. The reward for high achievement in grad school is a good job after grad school.

  10. I'd like to share some of my experiences (I'm an almost finished Psychology grad student in Ireland).

    My Uni had very little formal evaluation throughout the process. I submitted 10K words to be upgraded to PhD, but that was extremely informal. Then, the central admin decided to introduce mandatory evaluations every year, when I had finished collecting data.

    This was extremely useful, as it was very intense and I had a two-hour, viva-like experience with the head of department and other review panel members. It would have been wonderful if it had happened for my review, as I was left with a thesis almost done and having to revise matters that had been flagged to the department in advance and which I had thought were fine.

    I suppose my issue is the inconsistency. Any process that is used needs to be consistent across the entire program, as ambushing people when they are almost done is not a good idea.

  11. Re: "should PhD programs pay high-performing grad students bonuses?"

    I am not sure this would work, mainly because I can't think of a good metric that would work. How can we objectively measure and compare how much effort was put in by someone?

    Students can get stuck for a while without a breakthrough in sight. The paper trail is not a good metric either.

    Ultimately, you'd have to rely on the subjective judgment of advisors. But that only gives you a partial order at best.

  12. Anon re: "Needless evaluations". Sounds like the problem in your case is that the process in your department is not working, so that's why it has so little value. Clearly having a process that does not yield results is a waste of time. I think that having to write a 10-15 page document each year is really high overhead and was not at all what I was suggesting. OKRs at Google are very short - usually a half page or so of bullet items - so the overhead is reasonably low.

  13. Good management is good. Bad management is bad. Formal processes with good management can be good. With bad management they are excruciating. Seven years from now, if you're still at Google and they're still doing quarterly OKRs, I bet you'll be a lot less keen on them.

    Mentoring talented young people is hard. There's a constant give and take between giving them enough rope to do amazing things, but not so much they hang themselves. And the things they need vary wildly from one to the next.

    Some students would benefit from OKR-like structure, when applied properly. But I know an extremely talented student who dropped out of a top program because his advisor tried to implement something like that. Ironically, he left for Google (where he seems quite happy).

    Personally, I got very little out of the status letters at Berkeley. Of course, my recollections may be jaundiced by the one I got when I graduated, which was along the lines of "thanks for making space for new students".

    I think trying to give bonuses to "good" Ph.D. students is a terrible idea. A talented computer science Ph.D. student is making chump change compared to their open-market value. Differential pay would require a lot of process, waste a lot of time, have potential for resentment, and ultimately motivate for the wrong thing.

    The right performance rewards are things that leverage the long-term career investment. I still remember Sue Graham taking me to a DARPA Compiler Infrastructure meeting and introducing me to (among others) Ken Kennedy and John Hennessy. I spent an hour at dinner in an animated conversation with John.


  14. Hi Matt,

    Are you still heading the Harvard Sensor Networks Lab?

  15. I can't recall the name for the effect that measuring has on people/process. It was named by an economist, I think. The premise is that when you start assessing by measuring, the process is skewed to getting good measurements and can degenerate into meaningless measurements.

    As with any process, a lot depends on its practitioners. As a student of an engineering discipline, I saw tests and exams motivate students to score well rather than really understand and appreciate the subject. I can anecdotally say that a lot of my classmates who did really well in their exams ended up in dead-end jobs (as in government jobs doing not much and hoping to get a time-based promotion) and some who had their heads in the clouds and didn't do as well in exams fared better. I have seen a similar trend in software. As soon as something is formalized and measured, it seems to become a ritual. Examples include requirements and design documentation (highly templatized documents that everyone ignores), testing and coverage analysis (a lot of the time the focus is on numbers and percentages rather than an intelligent strategy of understanding risks and mitigating them), and the code review process (which quickly degenerates into a ritual of superficial conformance to some standard, even though doing a grand job of code review is harder than writing the code in the first place!).

    I realize I have gone tangential to the subject at hand but my point was to provide a perspective where formal evaluation processes tend to lose their purpose and become ineffective.

  16. Having been a grad student and now a Googler, I feel that the OKR system works much better in a production environment where the problems to be solved are more short term and somewhat better understood. Research problems are often much more open-ended, and I feel that instituting a process for 3-month planning can unnecessarily focus the students more on the short term rather than the big picture.

    For this reason, and from my own experience in grad school, I'd second mcenley's comment that it's way more important for students to receive detailed constructive feedback from the advisor, or senior students in the group, especially in the earlier years, when they haven't developed their own system for productively identifying problems and solving them. Getting a letter-grade evaluation without suggestions on how to improve may not be very helpful.

  17. I wish to comment on your problem of "...often working super hard to submit a paper and then doing almost no new work for 2-3 months while waiting to get the reviews back".
    So, what do you expect...? To keep up the same momentum for the rest of the term? That's humanly not possible. Good work takes a lot out of you. Secondly, we also wish to know how our hard work is received by others. That will give us direction for the next work...

  18. Deeply_Hurt: I'm sorry, that's just ridiculous. Nobody is saying that you shouldn't take a break after a big deadline, but 3 months of downtime while waiting for paper reviews is not productive.

  19. 1) I agree w/Matt, 3 months of downtime? wtf. That puts a hard cap on how much work you can do. And given the competitiveness of today's post-PhD job market, an advisor would be doing you a disservice if he/she allowed that much downtime. I try to give my students 1 week of real downtime, followed by 2-3 weeks of reading/ramp-up for the next project; that way they are usually multiplexing on 2-3 simultaneous projects and always have something in the pipeline.

    2) I personally found the official letters from the dept at Berkeley not terribly useful. We have something similar at UCSB. It is useful for the faculty to meet and talk (over 2 fac mtgs for us), but I find it more useful to have a special 1-on-1 w/ students at the end of the year and also at beginning of summer to look back and evaluate the past 6 months, usually for positive reinforcement on what they did well (negative feedback tends to be a bit more immediate).

  20. Matt & Ben: Should such a system not also exist for rating of a Professor by his/her students? I have seen countless examples and heard stories from fellow grads about their advisors who get far too complacent the moment they get their coveted "tenure" trophy! It is far too easy to ride a grad student and rate his overall performance over a period of several months, but a disinterested and complacent advisor is already someone most students have to deal with.

    Wonder what thoughts of an ex-Professor and a current Professor are on this issue?

