
All pain, all gain

Jon Howell from MSR came to visit yesterday and we got into an interesting discussion about the risk versus reward tradeoff for different approaches to research. Two of my Ph.D. students (Bor-rong Chen and Konrad Lorincz) are graduating this year -- you should hire them, of course -- but they are facing a weak job market and the need to rack up publications is as important as ever. The question is, had we worked on problems in the more traditional systems space, would we have been more successful cranking out the papers?

It has long been my belief that doing research in wireless sensor networks -- especially the applied and experimental variety, where you actually have to build something that works -- involves a (nontrivial) degree of difficulty that is not present in "traditional" systems research. Think of it: Programming motes requires that everything fit into 10KB of RAM. You don't get malloc, threads, or printf. All you get are three LEDs to tell you what's going on. Half the time a mote you have on your desk doesn't program correctly, or is simply fried, requiring that you go hunt around for another one. Scaling up to a full network requires debugging complex interactions between all of these motes, and keep in mind you typically don't get to inspect what each one is doing -- and communication and node failures are rampant.

And God forbid you try to run anything in a real field setting (like a redwood forest, or even a volcano) -- then it really, REALLY has to work. As David Culler says, in a field deployment you can't just throw away the data you don't like and rerun it. You have to take what you get. More often than not the network doesn't work as you expect, or seemingly trivial problems (like water getting into the cases) cause nodes to fail.

It's been a while since I focused on conventional distributed systems: nodes running UNIX, connected to the Internet, that kind of thing. But it seems that it is considerably easier in that environment to build up something complex and debug it to the point where it works. After all, you can ssh into the nodes, dump everything to a log, and use tried-and-true methods and tools like gdb and strace. Of course, there's still plenty of heavy lifting involved. Hakim Weatherspoon's work on getting the OceanStore prototype to run on 400+ PlanetLab nodes for several months is no mean feat. But if I took away his ssh connections and replaced them with 3 LEDs, I wonder what he'd do. (Of course, Hakim would have still rocked it. But that's Hakim.)

This is not to diminish the intellectual contribution of mainstream systems research at all. Indeed, one could argue that the lower barrier to entry has made it possible for those working on conventional systems to innovate more rapidly and produce deeper insights than those of us battling broken motes and crappy radios. So, I wonder what advice I should be giving new grad students wading into the field. A lot of the low-hanging fruit in sensor nets has been taken. To make a substantial contribution in the area you need to take things in a different direction than those before you. Fortunately, the TinyOS community has been doing a much better job lately at providing standard libraries and protocols to lower the bar. But there's still a lot of pain involved in getting to the research frontier. (In another post, I'll muse on why so many people work on MAC protocols. I suspect it's because it requires a lot less reliance on other people's code.)

My group has been doing more work with the iMote2 platform lately, precisely because I think it provides an easier-to-use, more functional vehicle for driving research. Mostly this is because it has a good enough CPU and enough memory to push on some interesting ideas without having to wring your hands over every byte of RAM your code uses. But going forward, I wonder if some of the "gap" that people see in the sensor nets space isn't merely due to the blood, sweat, and tears that go into getting anything complicated to really work. We should think about how to remove some of those obstacles to innovation, not to mention publication.

Comments

  1. I'd like to add a point of view. I agree with what you say with regard to additional services added to the basic Internet structure: you get a lot more resources, such as debuggers etc., to help with what you need to do. But I think there's a bit of comparing apples to oranges: for sensor networks, you need to consider the infrastructure, which stretches from the radio hardware, MAC, routing protocol, all the way up till just below the application (e.g. temperature sensing). It's a complete system. But if you consider the equivalent for the case of the Internet, you need to consider the routers, the fibers, MPLS tunnels, DNS, all the way till, say, TCP.

    If we consider the same type of environment, i.e. infrastructural, then the hurdle in the Internet reaches obscene levels, much much much higher than sensornets. For the same effect, i.e. people use it in everyday lives, you need to work with many constraints, such as routing protocols (we aren't going to change OSPF, not like we can change routing protocol in sensornets) etc. That's just the start of the issue, because for actual deployment, there are equally, obscenely high political barriers as well. For the equivalent effect (real-world usage) in the Internet, it'll take years, not weeks in the case of sensornets. Fortunately (yes fortunately) for sensornet folks, it's not that hard (comparatively) to get past this infrastructure barrier.

    Because of this hurdle, the academic community has accepted "weaker" publications, "weaker" in the sense that we don't really know what happens if everyone decides to use the solutions, and simulations and analytical approaches are accepted in place of real deployment. But it's different for sensornets because of the scale involved. If I need to monitor the temperature at different parts of a building, I buy some motes and deploy. I'm the only user, and the deployment is as complete as it needs to get. Thus, whether a sensornet idea works in real life or not can be determined much more easily. I mean, hey, it's right in front of our eyes :)

    This acceptance of "weaker" work (this doesn't mean that the traditional Internet researchers do any less work, but that whether the solution works or not is less convincing) caused a rift between Internet infrastructure researchers (e.g. service provider research centers) and the pure academic community. You are absolutely right in that there are huge amounts of effort, a lot of blood and sweat, put into making the infrastructure work; unfortunately the pure academic community no longer acknowledges that. This community no longer treats "getting things to really, really work" as important, certainly not as important as the solution itself (which need not, or indeed, cannot be as convincingly shown as in sensornets).

    This, I believe, is a main reason why GENI is proposed. Like Scott Shenker mentioned in one of his talks, Internet research has become "science fiction", not experimental research. Internet researchers have been publishing like crazy, but nobody knows if things really work when millions of people actually use the solution.

  2. Good point - it's true that Internet research tends to take a more layered approach, and there are literally gobs (that is a technical term) of work that have never been deployed or tested in practice. We're fortunate in the sensor nets space that we can build a sensor net in a lab or a building and control all aspects of the system from soup to nuts. It's nice to be on the frontier without being tied to legacy components and applications.

  3. I agree with the post RE: the job market. Full system deployment papers, especially sensors, are time-consuming and not good for the publications/time ratio. On the other hand, I think that hiring committees are really impressed by this style of top-to-bottom, real-problem-solving work in a job talk. So perhaps your students would be more appreciated once they get past the first stage and give the interview talk.

