Wednesday, February 18, 2009

Blogging a research project?

This term I am teaching a graduate seminar on wireless sensor networks. Actually, "teaching" is not quite the right word, as I see my role as mainly that of moderating a discussion between the students in the course, and raising the occasional controversial point as grist for the mill. Normally, in these kinds of courses, the content of the discussion is lost to the ether, so this term I decided to run a blog where the students post a summary of our conversation about the assigned papers. Students are encouraged to put their own editorial slant on the content of the blog posting, and the blogger for each class is responsible for leading the discussion.

So far it's been a lot of fun and provides some permanence to those revelations and insights that, otherwise, would be terribly ephemeral. It also gives the students a chance to write up their ideas a bit more formally, with a broader audience than, say, simply emailing them to me.

This has got me thinking about the potential role of blogging in a research project. Timothy Gowers has started a very interesting blog-based mathematics project in which the blog itself serves as the medium for collaborative discovery. I started wondering whether this model might translate into, say, a computer systems research project. Sometimes this happens through email lists and impromptu collaborations between people who already know each other, but opening up the project on a blog seems to offer orders of magnitude more opportunity for networking and learning from one another in rapid and informal ways.

As an intermediate step in this direction, I am currently writing an NSF proposal for the Cyber-Physical Systems program (as is everyone else I know), and as part of the "broader impact" statement I've decided we're going to blog the research project, if it is funded of course. The idea is simple: Every week or so, one of the students or PIs will post a short article on the progress on the project to date, and invite comments. It's possible this has been done before, but I haven't seen any major systems projects adopt such a model.

I think this could yield a number of interesting results. The only way most systems research projects are presented to the world is through a small number of published papers. Necessarily, those papers capture the "successes" of the project and generally do not dwell upon the many blind alleys and outright failures that led up to the big result. By blogging the process of the project as it unfolds, other researchers, especially students, could learn from the mistakes of our work and also learn something about the trials and tribulations of a typical project. I'd love to tell one of my future grad students to simply read, say, a year's worth of a project blog to understand how what turned out to be a beautiful paper took so much hard work and hacking.

Also, blogging the effort could get potential collaborators and even the public a lot more interested in what we're doing. I love hearing about early prototypes and "conceptual designs" when they leak to blogs like Engadget; though most of these things never see the light of day, they can be pretty inspiring. Of course, there's always that fear that you'll get scooped if you tell the world about your great ideas before they are published in a conventional scientific venue, or that you'll look like an idiot when you blog about how you spent three weeks tracking down a missing minus sign in your code. (I have done this.) On the other hand, opening up the process of doing research seems to me to be the ultimate form of outreach and could offer the next generation of students a much better picture of what really happens in grad school.

Finally, this could potentially lead to a lot of unexpected collaborations getting started. Some of my best collaborations have started through random encounters: a former grad student of Margo Seltzer's introduced me to a geophysicist (Jonathan Lees at UNC) whom I have worked with on sensor networks for volcano monitoring, and someone attending one of my talks turned out to be looking for a wireless sensor solution for clinical assessment of neuromotor diseases (Paolo Bonato at the Spaulding Rehabilitation Hospital). My theory is the greater the surface area you expose, the more connections you're going to make.

The biggest risk, I think, is that of your ideas being stolen. However, I've always felt that the whole point of doing research is for others to take your ideas and run with them. For this reason I've always released the code my group develops under an open source license and simply not worried about who picks it up for what purpose. (But that's a topic for another post!) Realistically, we are always sharing our ideas, when giving talks, writing papers, or bumping into someone at a conference. Could blogging increase the bisection bandwidth?

Update (22 Feb 09) - Speaking of blogging research, here's the blog for the Berkeley Cloud Computing project.


  1. Matt -

    The grad course blog is a great idea - I wish I had such an archive of the courses I've taken over the years.

    I think a mashup of cvs/svn/git and blog software would be key to making systems research blogging work. (Systems projects always have many weeks where the only interesting updates pertain to code, and it would be unfortunate not to be able to easily document that.)

    I've often wondered about the issue of being scooped, but I don't think it's all that likely (or bad, as you note); I've been considering including more research topics at my blog. Maybe we as a community need to establish a convention for citing blog entries?

  2. I think it's an interesting idea. However, I suspect people would use the medium to "advertise" their project as opposed to getting feedback on new, potentially interesting ideas (due to the risk of having your idea "stolen"). This doesn't contradict your point about stimulating collaborations; certainly the more people who know what you're working on, the more chance there is that you'll make contact with someone with related interests. Nor does it contradict your point about getting the public interested (something scientists in general don't do a very good job of). But I don't think we're going to be seeing blog posts from any research group about the brilliant idea they had that they just started working on.

  3. Very true (both comments) - part of this is a social experiment to see how much information we are comfortable revealing; and whether the blog medium lends itself to more sharing of what would otherwise be "private" information.

  4. First off, this sounds like a lot of fun to read. I look forward to seeing the results! Maybe less fun to write ("another miserable failure day"), unless of course it turns into a viable form of procrastination on the project itself.

    One thing that you don't seem to have mentioned yet: how to manage the acknowledgments and authorship of work produced through this blog experiment. If a commenter shows up once, makes an insightful comment that changes the direction of the research, and then disappears, are they a co-author? Probably not, but I bet we can come up with other examples further along the spectrum to "in my research group, came up with major ideas, and also wrote a bunch of working code." A related question is whether you and your group are OK with having a paper that has potentially dozens of authors, some of them pseudonymous. :)

    Of course the real answer will be "it depends" and "it's a judgment call." Still, I think it's an interesting problem that could show up when showing not yet "set" research to the world. Stating your expectations and "ground rules" up front may be a good idea to avoid misunderstandings later.

  5. In terms of managing authorship, I guess I didn't envision that comments on the blog would necessarily constitute contributions to the research in and of themselves (unlike Gowers' project). It's like the hallway conversations we have at conferences: If Alex Snoeren tells me to go check out some modification to the Atheros driver, and we end up using it in our system, it would hardly qualify him for a coauthorship (though he might get a shout-out in the acknowledgments section of the paper).

    In systems research, it's not just about sharing ideas; it's the sustained effort to produce and evaluate a working implementation that largely defines one's contribution. Of course, it would be sweet if someone commenting regularly on the blog ended up making a lot of good suggestions - then I would consider adding them as a coauthor. We could also open up the working code under SVN and make the whole project "open", but that's not quite what I had in mind.

  6. Matt - I've been having some luck with it as well; I've got a blog up for my low-power computing class where I throw little news tidbits and "for-interest only" readings. I like the format -- much easier to toss things there than to update the syllabus, etc. I'd love to hear how your experiment goes this semester. I know that Srini Seshan and Nick Feamster successfully had a cross-course blog about a year ago where they found paper reading overlap between their courses and integrated the commenting. I think they were pleased with the result, though it took a fair bit of work.

  7. Many institutions limit access to their online information. Making this information available will be an asset to all.

  8. A great constructive article will help to understand the issue.

  9. Some time ago I heard something like this: allegedly a supercomputer with nanotechnology reads all the posts, chat rooms, emails, and all the written information on the Internet, with the purpose of learning about the way we think.

  10. I like this, because I once started to research something and it became a business. I'm fairly successful at what I do now. I used to be a programmer, but with the idea I got, it's better: I earn the same as a programmer and it's stable.

    Thank you.