Tuesday, March 31, 2009

Programming Haiku

Today in my graduate course we discussed the Berkeley snlog system. This is a declarative programming language for sensor nets, based on Datalog, and derived from the authors' previous work on P2.

Mike Lyons, one of the Ph.D. students in the course, made an interesting observation. He pointed out that programming in these very concise, domain-specific languages is like "writing haiku." The idea behind these various languages is to make programming sensornets simpler by abstracting away details and making the code tighter and easier to grok. Mike observed that he'd "rather write 10 pages of C code than 10 lines of Datalog," which is a very good point -- and one that speaks to a number of projects that equate "fewer lines of code" with "ease of programming."
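
To make the contrast concrete, here is a rough sketch of what tree-based data collection might look like in a Datalog-style rule language. The syntax and predicate names are illustrative only -- loosely in the spirit of snlog/DSN, not actual code from the paper:

    // Illustrative Datalog-style rules; names and syntax are approximate.
    // Each node adopts the lowest-cost beacon it hears as its parent,
    // then sensor readings are forwarded hop by hop toward the root.
    bestCost(@N, min<C>)  :- beacon(@N, P, C).
    parent(@N, P)         :- beacon(@N, P, C), bestCost(@N, C).
    forward(@P, Src, Val) :- reading(@N, Src, Val), parent(@N, P).
    forward(@P, Src, Val) :- forward(@N, Src, Val), parent(@N, P).

A handful of rules like these stand in for what might otherwise be pages of C, which is exactly Mike's point: every one of them has to be exactly right.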

A number of projects, including several of my own (Flask, Regiment), have focused on simplifying programming for sensor nets. The idea is that building up complex distributed behaviors from low-level programs, usually implemented in C, is too painful for non-experts. The main challenge is grappling with the complex dynamics that arise when you have uncertain environmental conditions, message loss, node failure, and so forth. This is difficult in conventional distributed systems, and even more painful in the extremely resource-challenged domain of sensor nets. To this end, there has been a slew of sensor net language papers, ranging from simple node-based languages to more ambitious, network-wide programming environments. For those without the patience to run them all down, Luca Mottola's PhD thesis does a great job at surveying the field.

Evaluating a language paper, especially in a relatively new domain, is often challenging. In sensor nets, CPU performance is not a good metric -- often we care more about energy consumption, communication overheads, and robustness to failure. Evaluating "ease of programming" is much more difficult. The most common quantitative measure is lines of code, but this is potentially misleading. As Mike pointed out today, the more compact a language is, the more important it is that every line of code is absolutely right -- there's less wiggle room or flexibility to explore alternative ways of implementing the same thing.

Peter Dinda's group at Northwestern has done an actual user study (gasp!) of a simple, BASIC-based language for sensor nets. While this is a far better methodology, I worry that such an approach is biased too heavily towards absolute neophyte programmers, and so tells us very little about how effective the language will be for writing real applications. After all, most domain scientists who want to employ sensor nets are already programmers (though often in languages like MATLAB). So what is good for a user study may not be good for actual users.

So maybe we need to get away from "ease of programming" as our key metric, and focus on what domain experts really need to leverage sensor nets. It's clear that people can learn a new language if it will help them get their work done. In the end, the language is probably a lot less important than the mental abstraction required to capture a given program.

8 comments:

  1. Hi Matt, I agree with you about using program abstraction as a central organizing concept when looking to simplify domain programming, just as data abstraction is used to simplify programming algorithms.

    In Kairos, for example, we were trying to push macroprogramming (a term which I believe you coined) to the extreme: the programming abstraction is a MATLAB interface to distributed sensors, which can be thought of as associative data arrays. The particular language constructs one uses to realize this abstraction and expose relevant resource constraints seemed to follow naturally.

    Using "lines of code" as an indirect metric to capture the effectiveness of abstraction has both pros and cons. On the flip side, several studies have shown that the average bug density defined as "bugs per source code lines" is roughly constant regardless of the language used (see "On Lisp" by Paul Graham, for e.g.). This observation suggests that shorter programs are in some sense better from a reliability perspective.

    The downside is that a language may be so restrictive that the programmer finds it hard or even impossible to capture domain constraints that the language designer didn't anticipate.

    To summarize, I think we have barely scratched the surface of what good program abstractions should look like, and where the trade-offs lie. Good data abstractions and control abstractions took a while to shake out, so maybe this is the right time for experimenting with program abstractions for distributed systems.

  2. Hi Matt:

    Fun post, I like Mike's analogy! Though if you ask me, programming in SNLog is more like writing a Hemingway short story: wide range of expression, but says no more or less than is needed to get it across. :-) (Which makes C what? Pynchon?)

    I do think measuring LOCs can be useful, but only when the difference is measured in orders of magnitude: OOMLOCs. 2-3 OOMLOCs is enough difference that you've probably distilled the idea to the point where you're thinking differently. As you say, Matt, the mental abstraction changes.

    I think Overlog/SNLog gets to that point, though I'm no fan of the datalog syntax, and we still have work to do cleaning up some semantic issues. We're focusing on that more now. Hopefully more to report in a bit.

    Beyond LOCs and OOMLOCs are questions like: What can you prove about your program? How malleable is it (evolution, reuse, interposition)? How robust is it to bugs, failures, adversaries? Ease of debugging/monitoring? How easy is it to get started, and build out the code incrementally? That last one is maybe what Mike is getting at with the haiku analogy.

    I agree with Krishna (and you, I presume!) that programming models are a key issue for distributed systems research, and we all have a lot left to do. Let's keep the conversation going!

  3. Hi Joe,

    You have a good point that it's hard to evaluate the effectiveness of a language until you've actually written substantial programs in it. Sadly, this rarely comes across in the 14 pages of a conference submission. So often there is a knee-jerk reaction to a language design based on the snippets and examples in the paper. (Of course, syntax matters way too much to most people. The MIT guys who did WaveScript were smart to use a C-like syntax, rather than the funky functional syntax we used in Regiment.)

    One question I had about the snLog paper, while we're at it. As far as I can tell, the paper never described how communication is done. Presumably tuples on neighboring nodes that are involved in a rule evaluation need to be broadcast somehow. Does a node evaluating a rule pull those tuples from its neighbors, or do the neighbors push them out? If they're pushed, when are tuples transmitted -- any time a rule fires, or only when a rule fires that might affect the state of another node?

    Can a programmer reason about the communication overhead that their set of rules will induce?

  4. Matt:
    I can answer for Overlog in P2 and in our new Java-based runtime, since I've hacked on both. David Chu is the authority for SNLog and DSN; I never dove into that code and I know he made some decisions differently than in the Internet-based cases.

    To talk about "when" in these declarative languages, you need to think of a clock as being just more data. Virtually, a clock is an infinite source of "tick" data. If you want things to happen on a particular clock tick (or on periodically-occuring clock ticks), you join with a timer or periodic table provided by the system. From a logic perspective, this means that individual facts are associated with certain clock ticks (i.e., timestamped). Now, in execution, logical clock ticks are "delivered" at the corresponding local wallclock time, so the timestamping happens at just the right time -- i.e. join tuples enter the dataflow executor at those wallclock times and generate their consequences through the rules. You can see the DSN tutorial for examples, or the P2 tutorial. Search for 'periodic' and 'timer'.

    As to "push" vs. "pull", at the lowest level it's all push; you can build up from there. (E.g. "push" a request for data.) In DSN, send is implemented via typical NesC comm modules. There are multiple variants I believe -- I know David is currently playing with low-power-listening, for example, and previously played with tradeoffs between unicast and multicast (now in his IPSN '09 paper). So the specifics of how uni/multi/broadcast are implemented are presumably configurable.

    Now as to reasoning about comm overhead. You can do it pretty simply if you write simple rules and don't turn on any fancy optimizations. However, as David shows in the IPSN '09 paper (and he's got more coming), you may well want those optimizations because they can lower your comm overhead. This gets to a classical philosophical question about compilation vs. hand-coding. With ever-increasing network complexity, should hackers be hard-coding protocols, or should protocols be synthesized and self-modified to adapt to network and traffic properties?

    If you want to explore protocol compilation as we are, you let the compiler and the protocols try to minimize comm overheads for you. That doesn't mean you can't observe and explain the resulting behavior -- it's easy to synthesize monitoring/logging code to accompany declarative programs, and explain those comm logs via the optimized "plan" generated by the compiler (a "plan" is a dataflow program a la MapReduce; not hard to understand). Then in principle you can tweak those plans if you understand the situation better than the compiler.

  5. Hi Matt,

    The basic runtime for dsn is all about pushing tuples. The dsn we've released has a communication overhead of 1 byte for the tuple id -- we could have overloaded AM as the tuple id to eliminate this overhead. Other packet-level overheads can be controlled via table schema/typedef declarations.

    As you suggest, a natural question to ask is about a pull approach to tuple generation. This is essentially the topic of the IPSN paper Joe mentions. It turns out there is an algorithm from the database literature, magic sets, which converts push to pull based on the user's query. We map this technique to the networked setting, along the way popping out a hybrid push+pull. The hybrid push+pull turns out to be pretty useful for flexibly choosing a good rendezvous/proxy for communication.
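
    For readers who haven't seen magic sets, here's the textbook flavor of the rewrite on a toy reachability program (plain Datalog, nothing DSN-specific):

        // Original program, evaluated bottom-up ("push"):
        //   path(X, Y) :- edge(X, Y).
        //   path(X, Y) :- edge(X, Z), path(Z, Y).
        // For a query path(a, Y)?, the magic-sets rewrite derives only facts
        // reachable from the bound argument 'a' -- in effect, a "pull":
        magic_path(a).
        magic_path(Z) :- magic_path(X), edge(X, Z).
        path(X, Y)    :- magic_path(X), edge(X, Y).
        path(X, Y)    :- magic_path(X), edge(X, Z), path(Z, Y).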

    -david

  6. David,

    Thanks, that's helpful. I am still confused about when tuples are "pushed". Is it each time a rule fires on a node? This would imply that each rule firing requires a radio message transmission, which is probably not what you want. Or do you only push tuples that you know might affect rule evaluation on other nodes?

    Matt


Startup Life: Three Months In

I've posted a story to Medium on what it's been like to work at a startup, after years at Google. Check it out here.