Wednesday, December 30, 2009

The Best Things about 2009

With the year coming to a close, I think it's appropriate to reflect on some of the best things that happened in 2009.

Best use of official signature: graduating two Ph.D. students. I'm finally no longer a leaf node on the academic genealogy. Two of my graduate students, Bor-rong Chen and Konrad Lorincz, finished their degrees this fall, and I am insanely proud of them. It is really amazing to look back on all of their hard work over the last few years and reflect on what they accomplished. At the same time it's kind of sad to no longer have them in my group, although Bor-rong is sticking around as a postdoc on a new project that we have going with the Wyss Institute. (I'm kind of hoping he never leaves!)

Best application of Fernet Branca: The Toronto Cocktail. Fernet was a pre-2009 discovery, to be sure, but this particular combination of rye, Fernet, simple syrup, and bitters is by far the best way to mix it. I've been teaching bartenders around Boston how to make it; ideally with a flamed lemon peel. Runner up: The Trinidad Sour, which uses Angostura bitters as a base. Challenging.

Best album: Animal Collective's Merriweather Post Pavilion. This album is rich, joyful, perplexing, and beautiful. I've probably listened to it more than anything else I got in 2009, although my profile begs to differ. It's no surprise it topped Pitchfork's list of best albums in 2009. If you're not convinced, check out the video for "Summertime Clothes" here. Runners up: Bitte Orca by the Dirty Projectors; In Prism by Polvo.

Best use of stimulus money: RoboBees. One highlight this year was being a Co-PI on an NSF Expeditions in Computing grant to develop a colony of micro-scale flapping wing robots. Along with nine other faculty, we are tackling a bunch of exciting research problems, my particular focus being on systems and language support for coordinating the activity of the colony. This is going to be a fun project and a bunch of students are getting involved already.

Best reason for sleep deprivation: Becoming a dad. Having a baby has been the most challenging, and most rewarding, thing that has ever happened to me. After raising a puppy and advising eight Ph.D. students, I figured the fatherhood thing would be a cinch. Not so. But it has been a huge learning opportunity -- about myself, about what really matters in life, about setting priorities. Sidney is now almost six months old and is the cutest little fella I've ever seen -- I just can't wait to be able to take him to the zoo and teach him C++.

Here's to a great year and best wishes for 2010.

Wednesday, December 23, 2009

How to get your papers accepted

Like most faculty, I serve on a lot of conference program committees. I estimate I review O(10^2) papers a year for various conferences and journals. Reviewing so many papers, I am amazed at how many authors make simple mistakes that render their papers much harder to review (let alone accept!). Keep in mind that when reviewing 25+ papers for a program committee, you have to get through them fairly quickly, and the easier it is for the reviewer to digest your paper and get to the core ideas, the more likely they are to look favorably on it. I tend to review papers while on the elliptical machine at the gym, which also helps to tamp down any physical aggression I might feel while reading them. (Of course, I have to go back and write up my comments later, but usually in a post-exercise state of unusual mental clarity.)

A few hints on getting papers accepted -- or at least not pissing off reviewers too much.

1. Spellchcek.

Seriously, how hard is it to run your paper through a spellchecker before submission? Whenever I see a paper with typos -- more than one or two -- I figure the authors were too rushed or lazy to get something as simple as spelling right, and this casts doubt on the technical content of the paper as well. Sometimes typos creep through that a spellchecker won't catch -- like using "their" instead of "there". You were supposed to learn that distinction in high school. (I have some bad habits of my own. For some reason, I always type "constrast" instead of "contrast" -- no doubt a holdover muscle memory from my days programming LISP.)
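Those their/there slips can at least be flagged mechanically. Here's a toy sketch of a pre-submission pass that lists every occurrence of a commonly confused homophone for a human to double-check (the word list and function name are my own invention, purely illustrative):

```python
import re

# Illustrative, far from exhaustive -- extend with your own bad habits.
CONFUSABLES = {"their", "there", "they're", "its", "it's", "affect", "effect"}

def flag_confusables(text):
    """Return (line_number, word) pairs worth a manual look."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for word in re.findall(r"[A-Za-z']+", line.lower()):
            if word in CONFUSABLES:
                hits.append((lineno, word))
    return hits

# Flags "there" on line 1 and "its" on line 2 for human review:
flag_confusables("We evaluate there system.\nIts results are strong.")
```

It can't tell you which usages are wrong -- that still takes a high-school education -- but it guarantees you at least looked at each one.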

2. Get the English right.

This is a major problem for papers coming from non-native speakers, and although one is supposed to overlook this, nothing grates on a reviewer more than having to slog through a paper full of grammatical mistakes and strange wording choices. Sometimes the nature of the grammatical and stylistic problems can reveal the provenance of the authors: Asian writers tend to confuse singular and plural, while Indian writers tend to use convoluted "Indianized" expressions held over from the days of the Raj. (One paper I once reviewed used the Indian term crore -- meaning 10,000,000 -- as though everyone knew what that meant.) If in doubt, get a native (that is, American) English speaker to review the paper before submission. Be sure to throw a couple of "Go U.S.A.!"s in there for good measure; it'll mask your foreign identity.

3. Make the figures readable!

I can't tell you how many times I have been unable to read a figure because it was formatted assuming the reader would be looking at a PDF on a color screen, able to zoom in to read the tiny letters in the legend. This is not yet possible with printed paper, and I tend to print in black and white, as I suspect many reviewers do. When formatting figures, I try to use colors that retain adequate contrast even in black and white, use thick lines with a variety of dash styles that make them easy to distinguish, and use an 18 pt Helvetica font that will remain legible when squashed down to figure size.
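In matplotlib terms, those habits boil down to a few defaults. A minimal sketch (the specific settings are my illustration of the advice above, not a standard):

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; no display needed
import matplotlib.pyplot as plt

plt.rcParams.update({
    "font.size": 18,            # big enough to survive shrinking to column width
    "font.family": "sans-serif",
    "lines.linewidth": 3,       # thick lines stay visible on a grayscale printout
})

xs = range(10)
dashes = ["-", "--", ":", "-."]  # distinguishable even without color
fig, ax = plt.subplots()
for i, ls in enumerate(dashes):
    ax.plot(xs, [x * (i + 1) for x in xs], linestyle=ls, color="black",
            label="series %d" % i)
ax.legend()
fig.savefig("figure.png", dpi=300)
```

The acid test: print the page in black and white, hold it at arm's length, and see whether you can still tell the lines apart.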

4. "Related work" is not just a list of citations.

In general I really dislike "related work" sections that merely list a bunch of related papers without explaining how they differ from the paper at hand. The point of this section is not simply to give a shout out to potential reviewers or to prove you've done your homework: it is to contrast the contributions of your paper with what has come before. Also, your goal is not to shoot down every other paper you have read, but rather to place your work in context and explain the lineage. It is OK if another paper has worked on a similar problem and even shown good results; this suggests you may not be barking up entirely the wrong tree.

5. Make sure the intro kicks ass.

It is not uncommon for me to decide whether I'll mark a paper as "accept" or "reject" after reading the first page. Actually, more likely I'll have decided on a "reject" early on, and withhold any "accept" decision until I've read the whole thing. Still, a beautifully written introduction that makes a compelling case for the ideas in your paper goes a LONG way towards influencing the reviewer's disposition. David Patterson is the master at this. After you read the intro to, say, the Case for ROC paper, you think to yourself, "but of course! This is the best idea ever!" and then feel really crappy for not having thought of it yourself.

6. Get to the point.

The first paragraph of the introduction is an opportunity to dive into the subject of your paper, not an excuse to toss out some lazy canned problem statement copied from a dozen other papers you read last year. The first sentences from my last three papers were:
Wireless sensor networks have the potential to greatly improve the study of diseases that affect motor ability.

The unused portions of the UHF spectrum, popularly referred to as “white spaces”, represent a new frontier for wireless networks, offering the potential for substantial bandwidth and long transmission ranges.

Resources in sensor networks are precious.
All three tell you immediately what the paper is about; these are not throw-away statements.

7. State your contributions!

I can't believe how many papers never explicitly state the contributions of the work. Giving a numbered list of your contributions is essential, since it focuses the reviewer on what you think is important about the paper, and it defines the scope of the eventual review. Too many papers lay out platitudes about how the work will cure cancer and world hunger, but it's hard to tease that apart from how you've tweaked a timing parameter in 802.11. By the same token, contributions should be focused and concrete. Tell us specifically what you did, what the results were, and why it matters.

8. Don't bullshit.

Finally, don't exaggerate your results or claim more than you have really done. Nothing irks me more than a paper that promises to solve a huge problem and ends up showing a tiny sliver of the solution in a carefully-concocted setting. It is far better to understate your results and impress the heck out of the reviewers than overstate the results and let the reader down. Everyone knows that the path from design to prototype to results is filled with pitfalls, and you will be excused for having cut some corners to demonstrate the idea; but make sure the corners you cut were not too essential to the core contributions you are trying to make.

Following these eight simple rules, I guarantee your paper will be accepted to any program committee that I serve on! (Hope you're planning a SIGCOMM'10 submission!)

Tuesday, December 15, 2009

Post-NSDI PC Meeting Mini-Symposium

Today was the First Post-NSDI Program Committee Meeting Mini-Systems Symposium at Harvard (PNSDIPCMMSS@H2009). That is, I invited four of the NSDI'10 PC members -- Chip Killian, Dejan Kostic, Phil Levis, and Timothy Roscoe -- to give short talks at Harvard on their current research, since they were in Boston for the PC meeting anyway. It was a lot of fun -- for me, anyway. Everyone else at least got a free lunch.

Chip started off with a talk on Understanding Distributed Systems, at least those implemented using Mace. He gave an overview of the Mace language (which is one of my favorite pieces of languages-meets-systems work from the past decade) and the application of model checking to automatically verify Mace programs. This, to me, is the main reason we need better high-level languages for specifying complex systems: so we can build useful tools that let us understand how those systems behave before and during deployment.

Phil gave a kind of far-out talk on a new system they are building at Stanford, called Meru, which is a federated, planetary-scale virtual world platform. The goal is to enable virtual worlds that are extensible and scalable in all kinds of ways that existing systems (such as Second Life) are not. Personally I'd like to see how this technology can be leveraged beyond games and entertainment. Why not enable virtual conferences, or at least virtual program committee meetings? There are a lot of challenges here but it's important to tease them apart from the games-driven work that has been done to date (and often quite successfully).

Dejan talked about his work on CrystalBall, which allows for online model checking to catch and even avoid bugs in a distributed system via execution steering. The running example was a buggy implementation of Paxos, and Dejan showed how their approach could avoid the bug by steering execution away from the states that led to it. Mothy asked, "as long as you're going to do this, how much of Paxos do you need to implement, anyway?" In some sense this is shifting the correctness of the system from the original code into the CrystalBall system itself, and that makes me nervous. Margo raised the question of whether, if a system is able to avoid a bug, there is really a bug in the first place. Too deep for me!

Finally, Mothy gave a talk on their experiences implementing BarrelFish (now with Disney-friendly logo!) and some reflections on the value of undertaking a "big OS implementation project." BarrelFish has led to some interesting offshoot research (such as clever compilers for DSLs and new ways to think about capabilities) and they have gained a lot by departing from just hacking Linux. On the other hand, this is a tremendous infrastructure-building effort, and having nine authors on a single SOSP paper strikes me as overkill. One thing that helps them a lot is having a couple of full-time systems programmers -- not grad students! -- which I find is hard to fund on your typical NSF grant. The longevity of the artifact does seem to depend on having people around who can maintain the code over time.

I'll post slides here as soon as I have them!

Tuesday, December 8, 2009

Digg for grant proposal reviews

I am a huge fan of the social news site Digg. It is where I go to get my daily dose of Internet stupidity, ranging from XKCD comics to pictures of fail. For those who have been living in a cave for the last five years, the way it works is that users submit links to random sites they find on the Internet, and those who like a link "digg" it, thereby increasing the link's popularity. Tracking Digg is a good way to keep your finger on the pulse of the Internet, or more accurately, the segment of the Internet that 15-to-22 year old boys seem to care about.

The best part of the site is the incredible comments left by the users. Often these are funnier and more obtuse than the original link, and Digg comments are something of a genre in and of themselves (good examples being repeated ASCII-rendered appearances of Admiral Ackbar and something called Pedobear). Indeed, occasionally the comments can get out of hand.

Here's a crazy idea that I came up with (incidentally, while having wine and cheese with the chair of the Harvard stats department this afternoon). Why not adopt the Digg model to crowdsource peer review of grant proposals? Scientists would post their grant proposals publicly and anyone would be allowed to "digg" a proposal -- or "bury" a proposal that has flaws or a particularly bad idea. Public comments would be used to convey feedback to the authors and open up debate on the research plan to the many thousands of highly qualified Internet users who are conventionally excluded from review panels.

This model would seem to have all kinds of benefits. Rather than making funding decisions in the proverbial smoky room, requiring the funding body to spend untold millions in taxpayer money to fly panelists to DC and put them up in hotels for a couple of nights, this approach would bring everything out into the open. The Digg model would also streamline the review cycle to run in "Internet time" -- reducing the typical six month turnaround time to mere hours! Best of all, users on the site would adopt clever screen names like "W1F1d00d" and "Prof. BabyMan" lending the proceedings a certain edginess and panache sorely missing from the current panel review system.

If you like this idea, why not Digg it?

Friday, December 4, 2009

How to get into grad school

A bunch of students are now applying to graduate schools, and to help them out, every year I give a talk on getting into grad schools in Computer Science (click the link for the slides). Luis von Ahn has an amusing post about the process on his blog - all of his suggestions ring true.

The key thing that gets under my skin about graduate applications is the personal statement. All too often, applicants see this as an opportunity to tell their life story, especially about some experience they had with computers as a kid. "Since I was nine years old..." is the most common opening line in these statements. Frankly, I don't care about any of that. I am looking for potential grad students who have a mature and serious outlook on research. Of course, the best way to demonstrate that is to actually have done some research as an undergrad -- and putting together the Web site for your a cappella group doesn't count. My suggestion is for students to model the personal statement on a mini-research proposal: tell me about a problem you want to work on and have done some thinking about, and how you would approach it. And convince me that you have the technical experience necessary to do graduate-level work.

By the same token, I don't care how enthusiastic a student comes across in their application to work with me. A lot of ass-kissing goes on in personal statements, and that drives me crazy. Just tell me how awesome you are, not how awesome I am or how awesome Harvard is. We know that already :-)

Here is my rough algorithm for screening Ph.D. applications:
  1. Make sure the GRE scores and GPA are reasonable - not necessarily stellar. (I only had a 3.4 GPA when graduating from Cornell, and was told later that this almost sunk my application to most grad schools. Fortunately, I had also published three papers by the time I applied to grad schools, which offset that. As a result, I tend to use a lower threshold for the GPA than some others, to catch the diamonds in the rough.)
  2. See if the student has any evidence of research experience -- supervised by a faculty member. Publications don't really matter but are helpful.
  3. If there is any black mark on the transcript (say, making a C in an important CS class), see if there is any explanation of that in the personal statement or elsewhere. (We once had an applicant who failed most of his classes one semester, but retook them and made A's the next term. Until I read the personal statement, it was not clear that this was because he had been hospitalized for a substantial portion of the term.)
  4. Finally, read the recommendation letters. These are the most important part. Steps 1-3 are just pre-screening to save myself the trouble of reading the bulk of the applications. Letters from people I know (or know of) are given higher weight. Letters from academics get priority over letters from industry - most industry letters (even from places like MSR) paint a very rosy picture. Letters that simply say "so-and-so took my class and made an A-" with no other content actually hurt an application, all else being equal. If a student is really stellar, even faculty who don't know the student well should be able to say good things.
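Since I called it an algorithm, here it is as one -- a tongue-in-cheek sketch of the pre-screening steps above. The thresholds and field names are my invention for illustration; the actual judgment calls don't reduce to code this cleanly:

```python
def screen_application(app):
    """Pre-screen an application dict; return 'read letters' if it
    survives steps 1-3, else the step at which it was set aside."""
    # Step 1: reasonable (not necessarily stellar) numbers -- a lower
    # GPA can be offset by publications, as mine was.
    if app["gpa"] < 3.2 and not app["publications"]:
        return "step 1: weak numbers with nothing to offset them"
    # Step 2: evidence of faculty-supervised research experience.
    if not app["research_experience"]:
        return "step 2: no evidence of research experience"
    # Step 3: a black mark on the transcript needs an explanation.
    if app["black_marks"] and not app["explanation"]:
        return "step 3: unexplained black mark on transcript"
    # Step 4: the letters are the most important part -- read them all.
    return "read letters"

strong = {"gpa": 3.4, "publications": 3, "research_experience": True,
          "black_marks": False, "explanation": None}
screen_application(strong)  # survives to step 4: "read letters"
```

Note that the code never rejects at step 4; the letters always get read by a human, because they are the part that actually decides things.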

Startup Life: Three Months In

I've posted a story to Medium on what it's been like to work at a startup, after years at Google. Check it out here.