Wednesday, April 20, 2016

Why I gave your paper a Strong Reject

Also see: Why I gave your paper a Strong Accept.

I'm almost done reviewing papers for another conference, so you know what that means -- time to blog.

I am starting to realize that trying to educate individual authors through my witty and often scathing paper reviews may not be scaling as well as I would like. I wish someone would teach a class on "How to Write a Decent Goddamned Scientific Paper", and assign this post as required reading. But alas, I'll have to make do with those poor souls who stumble across this blog. Maybe I'll start linking this post to my reviews.

All of this has probably been said before (strong reject) and possibly by me (weak accept?), but I thought I'd share some of the top reasons why I tend to shred papers that I'm reviewing.

(Obligatory disclaimer: This post represents my opinion, not that of my employer. Or anyone else for that matter.)

The abstract and intro suck. By the time I'm done reading the first page of the paper, I've more or less decided if I'm going to be reading the rest in a positive or negative light. In some cases, I won't really read the rest of the paper if I've already decided it's getting The Big SR. Keep in mind I've got a pile of 20 or 30 other papers to review, and I'm not going to spend my time picking apart the nuances of your proofs and evaluation if you've bombed the intro.

Lots of things can go wrong here. Obvious ones are pervasive typos and grammatical mistakes. (In some cases, this is tolerable, if it's clear the authors are not native English speakers, but if the writing quality is really poor I'll argue against accepting the paper even if the technical content is mostly fine.) A less obvious one is not clearly summarizing your approach and your results in the abstract and intro. Don't make me read deep into the paper to understand what the hell you're doing and what the results were. It's not a Dan Brown novel -- there's no big surprise at the end.

The best papers have really eloquent intros. When I used to write papers, I would spend far more time on the first two pages than anything else, since that's what really counts. The rest of the paper is just backing up what you said there.

Diving into your solution before defining the problem. This is a huge pet peeve of mine. Many papers go straight into the details of the proposed solution or system design before nailing down what you're trying to accomplish. At the very least you need to spell out the goals and constraints. Better yet, provide a realistic, concrete application and describe it in detail. And tell me why previous solutions don't work. In short -- motivate the work.

Focusing the paper on the mundane implementation details, rather than the ideas. Many systems papers make this mistake. They waste four or five pages telling you all about the really boring aspects of how the system was implemented -- elaborate diagrams with boxes and arrows, detailed descriptions of the APIs, what version of Python was used, how much RAM was on the machine under the grad student's desk.

To a first approximation, I don't care. What I do care about are your ideas, and how those ideas will translate beyond your specific implementation. Many systems people confuse the artifact with the idea -- something I have blogged about before. There are papers where the meat is in the implementation details -- such as how some very difficult technical problem was overcome through a new approach. But for the vast majority of papers, implementation doesn't matter that much, nor should it. Don't pad your paper with this crap just to make it sound more technical. I know it's an easy few pages to write, but it doesn't usually add that much value.

Writing a bunch of wordy bullshit that doesn't mean anything. Trust me, you're not going to wow and amaze the program committee by talking about dynamic, scalable, context-aware, Pareto-optimal middleware for cloud hosting of sensing-intensive distributed vehicular applications. If your writing sounds like the automatically-generated, fake Rooter paper ("A theoretical grand challenge in theory is the important unification of virtual machines and real-time theory. To what extent can web browsers be constructed to achieve this purpose?"), you might want to rethink your approach. Be concise and concrete. Explain what you're doing in clear terms. Bad ideas won't get accepted just because they sound fancy.

Overcomplicating the problem so you get a chance to showcase some elaborate technical approach. A great deal of CS research starts with a solution and tries to work backwards to the problem. (I'm as guilty of this as anyone.) Usually when sitting down to write the paper, the authors realize that the technical methods they are enamored with require a contrived, artificial problem to make the methods sound compelling. Reviewers generally aren't going to be fooled by this. If by simplifying the problem just a little bit, you render your beautiful design unnecessary, it might be time to work on a different problem.

Figures with no descriptive captions. This is a minor one but drives me insane every time. You know what I mean: A figure with multiple axes, lots of data, and the caption says "Figure 3." The reviewer then has to read deep into the text to understand what the figure is showing and what the take-away is. Ideally, figures should be self-contained: the caption should summarize both the content of the figure and the meaning of the data presented. Here is an example from one of my old papers:

[Figure from the original post: a data plot whose multi-sentence caption explains what was measured and what the take-away is.]
Isn't that beautiful? Even someone skimming the paper -- an approach I do not endorse when it comes to my publications -- can understand what message the figure is trying to convey.
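
For those writing in LaTeX, a self-contained caption costs only a few extra lines. Here is a minimal sketch -- the plot file, the numbers, and the section reference are all hypothetical, purely for illustration:

    % requires \usepackage{graphicx} in the preamble
    \begin{figure}
      \centering
      % "latency-cdf" is a hypothetical plot file name
      \includegraphics[width=\columnwidth]{latency-cdf}
      % The caption states what is plotted and what it means,
      % so the figure stands on its own for a skimming reader.
      \caption{CDF of end-to-end request latency over 10,000 requests to the
        prototype. The median is 12 ms, but the tail is long: 1\% of requests
        take more than 200 ms, motivating the timeout mechanism described in
        Section 4.}
      \label{fig:latency-cdf}
    \end{figure}

The point is the caption's structure: one sentence saying what the figure shows, and one saying what the reader should conclude from it.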

Cursory and naive treatment of related work. The related work section is not a shout-out track on a rap album ("This one goes out to my main man, the one and only Docta Patterson up in Bezerkeley, what up G!"). It's not there to be a list of citations just to prove you're aware of those papers. You're supposed to discuss the related work and place it in context, and contrast your approach. It's not enough to say "References [1-36] also have worked on this problem." Treat the related work with respect. If you think it's wrong, say so, and say why. If you are building on other people's good ideas, give them due credit. As my PhD advisor used to tell me, stand on the shoulders of giants, not their toes.

11 comments:

  1. Background: I am not in academia, nor do I publish scientific papers. However, as a research software developer, I read a lot of them.

    I understand your pain regarding badly written papers. These usually turn out to be the ones with unbelievable promises and results. FYI, these come to my desk to be prototyped. However, the following statement stuck out to me.

    "Keep in mind I've got a pile of 20 or 30 other papers to review, and I'm not going to spend my time picking apart the nuances of your proofs and evaluation if you've bombed the intro."

    I think the bigger problem is that each reviewer gets 20 to 30 papers to review.

    The next question is: "If you had only 2 or 3 papers to review, would you be willing to spend your time picking apart the nuances of the proofs and evaluation, even if the author bombed the intro?"

    If your answer is no, then there is nothing else to be done; it's just reviewer bias, and everyone has their own opinion. If the answer is yes, it indicates a deeper issue with the reviewing system than with the writing.

    Replies
    1. I strongly disagree with you. Science is more about being able to communicate one's results than about the results themselves. It's my duty as an author to write my paper in a way that entices the reader to read on. After all, the point of a publication is not to have a publication; it is to be read by many more people after it appears. This means that I (as a reviewer) wouldn't do the authors any favors by accepting a poorly written paper because I thought the idea had some merit. If such a paper gets accepted, it will have no impact at all, since no one besides the original reviewers will read it. And if that's due to the paper's presentation, then it's a real waste that the potential of the ideas was never realized.

      We don't write papers for ourselves - we write them for our readers. And that means we absolutely have to prepare papers in the best possible way for the reader. It's crucial to be direct in a review: if the presentation is really bad, we should say so clearly and give some recommendations for improvement.

      Actually, as an author I'm angry with myself if the majority of the reviewers don't point out how well the paper is written. I can live with rejects based on the ideas. But if reviewers don't understand my paper or my ideas, or if the review says the presentation was so poor that the paper has to be rejected on that basis - then that was my mistake.

    2. This is a pretty good point. Much has been written about how broken the scientific paper review process is (on this blog and elsewhere). Still, I don't feel that I owe authors a substantial review (and chunk of my time) just because they submitted a paper. If on skimming a paper it's clear it's going to get rejected anyway, I'll write a cursory review and move on with other things in my life. While there are things that could make the review load lighter, I don't think those would have a big impact on my disposition towards poorly-written papers.

    3. Well, one problem is that the review process serves two independent purposes: (1) Choose a set of papers for the conference. (2) Provide feedback to the authors. If the goal is only (1), then skimming a paper that is to be rejected is fine. If the goal is also (2), then authors are owed some feedback.

      And increasingly, I see papers submitted for purpose (2), i.e., with the rough expectation that they will be rejected (though with the hope that they will be accepted, of course), but with the goal of getting some feedback.

  2. Hi Matt. I subscribe to each and every word you've written!!!

  3. Making unsupported assertions. Don't claim "A is B" as a fundamental motivation of your work unless you can actually support it. In the area of performance, be especially careful not to automatically add "and slow" after "complex" in an assertion - there's a huge difference between "difficult for you to understand" and "necessarily slow".

    Resembling a high school science report. (Or: "I don't want to read your journal of personal discovery.") A good paper describes what you learned, and the experimental path a reader might take to reproduce your results - not the random walk that actually occurred. (A really good paper can address some of your research missteps, but if you need this advice, then you don't write really good papers.)

  4. I'll add another, which has started to truly infuriate me: Not including any results at all.

    I don't care about your ambitions, your desires, and your architectural notions. If you don't have any evidence to back up your statements, it's not science. I don't demand that people be doing real-world experiments, but at least run some simulations or prove a theory or *something*. I've gotten a number of these recently, and it's gotten to the point where after I read the abstract, I flip to the end to see if there are any results or not before I even start on the introduction.

    Replies
    1. I would argue that the Results section is the most meaningless of all, and should not even be included in a paper. How can we trust authors' evaluations of their own work? It has been shown that, in a large majority of cases, even work presented at top conferences cannot be reproduced:
      http://cacm.acm.org/magazines/2016/3/198873-repeatability-in-computer-systems-research/fulltext

      Evaluation should not be done by the authors. You don't just trust a car dealer - you take the car out for a test drive before buying it. I don't see why you'd trust an academic who doesn't even share their code and data.

  5. I have always appreciated Matt's posts, because they ask or imply deeper questions about the highly systematized and rarely challenged value hierarchy that pervades academia. But this post is rather disappointing, because it does not seem as if Matt's time outside of academia has provided him much perspective on this issue.

    The above content might be acceptable if he were coaching his PhD students at Harvard (and indeed, I'm sure he said much the same to his students; it's good working advice. In many cases, he may have jumped in and written those same introductions himself right before the paper was submitted.) Or if it had been entitled "Pro-tips for PhDs and Professors" or "How (Some) Reviewers Think", it would have been great. But instead, it has taken the form of a self-righteous declaration of entitlements as a reviewer.

    The essential message of Matt's article is "I reserve, even cherish, the right to reject your paper if I don't think your writing is good enough." In an increasingly international research community, this elitist argument clearly gives the advantage to those with Anglo and Western backgrounds, in the same way that the SAT Verbal section was a set of memorized code words that ever-so-conveniently happened to be in common use by upper-middle-class families across America and ensured their continued access to top universities. In a community with notoriously low acceptance rates, placing an emphasis on artful writing clearly favors a select group. Of course, we can always argue that somebody who really deserves to publish here will learn all of those code words, or the art of writing "eloquent introductions", as Matt puts it. Perhaps it's just me, but don't the Program Committees of the systems community seem a little bit white compared to most of CS? Maybe I'm wrong.

    I believe in strong writing, possibly the product of a liberal arts background that is sufficient to equip me to understand the bias that underlies Matt's argument. It irritates me to no end when I have to suffer through bad writing. But at the end of the day, globalization requires that we acknowledge that science is about advancing human knowledge at the fastest rate possible, and not about writing pretty prose. If the cure for Ebola is on page 3, after a crappy introduction, you should accept. Or, for that matter, if they improve data centers by 0.5%, you should also accept because that could indirectly be on the path to the cure to Ebola.

    A more PC version of what Matt could have said would be "Well written papers are more likely to communicate your ideas and improve your chances of acceptance, and of others citing your work." But I think it's good Matt wrote what he did because it gives us reason to reflect.

    Replies
    1. I think I was clear in my article that I try to avoid bias based on whether the authors are native English speakers. I think it's unfair to levy an accusation of cultural bias against me, and especially to do so while hiding behind an anonymous comment.

      Bad writing is bad writing. I think most people who have served on program committees will agree that when they have a large pile of papers to review, papers that are well-written will be treated more kindly. If you want to argue that papers should only be evaluated based on scientific merit and not the quality of exposition, I think you'd be in the minority.

  6. "(...) explain what you're doing in clear terms (...)"
    That, in my opinion, is a good recipe for mainstream research papers... in which you tell a story about what you've done to incrementally improve some "hot topic". You yourself say as much: "If by simplifying the problem just a little bit, you render your beautiful design unnecessary, it might be time to work on a different problem". However, innovation cannot always be fully explained through the established way of communicating.
    My opinion is that the well-written papers you are referring to are somewhat boring papers... their average reading time is driven not by the content but by the acceptance/rejection task, and their value is directly proportional to the quantity of empirical results they are able to show. These papers are mostly muscular, with limited novelty. Please note that I'm saying they are boring; I'm not saying they are unimportant to the community.
    Writing papers is a form of liberal art: mainstream research needs disruptive underground research to feed its business model, and underground research needs the mainstream business model to survive. A perfect parallel can be made with underground music... blues, jazz, rock, metal, hip hop, and electronic were all underground at their beginnings, and as such were "strongly rejected" by society (i.e., the community)... yet all "current" music is deeply influenced by them!
    It is also well known that some of the most seminal papers were initially rejected, since they did not follow any "good" writing recipe.

