Monday, September 26, 2011

Do we need to reboot the CS publications process?

My friend and colleague Dan Wallach has an interesting piece in this month's Communications of the ACM on Rebooting the CS Publication Process. This is a topic I've spent a lot of time thinking about (and ranting about) over the last few years, so I thought I should weigh in. The TL;DR for Dan's proposal is something like arXiv for CS -- all papers (published or not) are sent to a centralized CSPub repository, where they can be commented on, cited, and reviewed. Submissions to conferences would simply be tagged as such in the CSPub archive, and "journals" would simply consist of tagged collections of papers.

I really like the idea of leveraging Web 2.0 technology to fix the (broken) publication process for CS papers. It seems insane to me that the CS community relies on 18th-century mechanisms for peer review that clearly do not scale, prevent good work from being seen by larger audiences, and create more work for program chairs, who have to deal with deadlines, run a reviewing system, and screen for plagiarized content.

Still, I'm concerned that Dan's proposal does not go far enough. Mostly his proposal addresses the distribution issue -- how papers are submitted and archived. It does not fix the problem of authors submitting incremental work. If anything, it could make the problem worse, since I could just spam CSPub with whatever random crap I was working on and hope that (by dint of my fame and amazing good looks) it would get voted up by the plebeian CSPub readership irrespective of its technical merit. (I call this the Digg syndrome.) In the CSPub model, there is nothing to distinguish, say, a first-year PhD student's vote from that of a Turing Award winner, so making wild claims and writing goofy position papers is just as likely to get you attention as doing the hard and less glamorous work of real science.

Nor does Dan's proposal appear to reduce reviewing load for conference program committees. Being a cynic, I suspect that if submitting a paper to SOSP simply consisted of setting a flag on my (existing) CSPub paper entry, we would see an immediate deluge of submissions to major conferences. Authors would no longer have to jump through hoops to submit their papers through an arcane reviewing system and run the gauntlet of cranky program chairs who love nothing more than rejecting papers due to trivial formatting violations. Imagine having your work judged on technical content, rather than font size! I am not sure our community is ready for this.

Then there is the matter of attaining critical mass. arXiv already hosts the Computing Research Repository, which has many of the features that Dan is calling for in his proposal. The missing piece is actual users. I have never visited the site, and don't know anyone -- at least in the systems community -- who uses it. (Proof: There are a grand total of six papers in the "operating systems" category on CoRR.) For better or worse, we poor systems researchers are programmed to get our publications from a small set of conferences. The best way to get CSPub to have wider adoption would be to encourage conferences to use it as their main reviewing and distribution mechanism, but I am dubious that ACM or USENIX would allow such a thing, as it takes a lot of control away from them.

The final question is that of anonymity. This is itself a hotly debated topic, but CSPub would seem to require authors to divulge authorship on submission, making it impossible to do double-blind reviewing. I tend to believe that blind reviewing is a good thing, especially for researchers at less-well-known institutions who can't lean on a big name like MIT or Stanford on the byline.

The fact is that we cling to our publication model because we perceive -- rightly or wrongly -- that there is value in the exclusivity of having a paper accepted by a conference. There is value for authors (being one of 20 papers or so in SOSP in a given year is a big deal, especially for grad students on the job market); value for readers (the papers in such a competitive conference have been hand-picked by the greatest minds in the field for your reading pleasure, saving you the trouble of slogging through all of the other crap that got submitted that year); and value for program committee members (you get to be one of the aforementioned greatest minds on the PC in a given year, and wear a fancy ribbon on your name badge when you are at the conference so everybody knows it).

Yes, it's more work for PC members, but not many people turn down an opportunity to be on the OSDI or SOSP program committee because of the workload, and there are certainly enough good people in the community who are willing to do the job. And nothing is stopping you from posting your preprint to arXiv today. But act fast -- yours could be the seventh systems paper up there!


  1. The arXiv wasn't always popular in theory, but it has become increasingly so. This has happened in conjunction (IMO) with a proliferation in the number of TCS conferences of reasonable quality, making it harder to track good work merely by conference.

  2. There's also the question of the current review process for receiving tenure and research funding, typically favoring high numbers of papers indexed by SCI/E.

  3. Why aren't your papers on arXiv? Any particular reason?

