Saturday, September 10, 2011

Programming != Computer Science

I recently read this very interesting article on ways to "level up" as a software developer. Reading this article brought home something that has been nagging me for a while since joining Google: that there is a huge skill and cultural gap between "developers" and "Computer Scientists." Jason's advice on leveling up in the aforementioned article is very practical: write code in assembly, write a mobile app, complete the exercises in SICP, that sort of thing. This is good advice, but certainly not all that I would want people on my team spending their time doing in order to be true technical leaders. Whether you can sling JavaScript all day or know the ins and outs of C++ templates often has little bearing on whether you're able to grasp the bigger, more abstract, less well-defined problems and make headway on them.

For that you need a very different set of skills, which is where I start to draw the line between a Computer Scientist and a developer. Personally, I consider myself a Computer Scientist first and a software engineer second. I am probably not the right guy to crank out thousands of lines of Java on a tight deadline, and I'll be damned if I fully grok C++'s inheritance rules. But this isn't what Google hired me to do (I hope!) and I lean heavily on some amazing programmers who do understand these things better than I do.

Note that I am not defining a Computer Scientist as someone with a PhD -- although it helps. Doing a PhD trains you to think critically, study the literature, make effective use of experimental design, and identify unsolved problems. By no means do you need a PhD to do these things (and not everyone with a PhD can do them, either).

A few observations on the difference between Computer Scientists and Programmers...

Think Big vs. Get 'er Done 

One thing that drove me a little nuts when I first started at Google was how quickly things move, and how often solutions are put into place that are necessary to move ahead, even if they aren't fully general or completely thought through. Coming from an academic background I am used to spending years pounding away at a single problem until I have a single, beautiful, general solution that can stand up to a tremendous amount of scrutiny (mostly in the peer review process). Not so in industry -- we gotta move fast, so often it's necessary to solve a problem well enough to get onto the next thing. Some of my colleagues at Google have no doubt been driven batty by my insistence on getting something "right" when they would rather just (and in fact need to) plow ahead.

Another aspect of this is that programmers are often satisfied with something that solves a concrete, well-defined problem and passes the unit tests. What they sometimes don't ask is "what can my approach not do?" They don't always do a thorough job at measurement and analysis: they test something, it seems to work on a few cases, they're terribly busy, so they go ahead and check it in and get onto the next thing. In academia we can spend months doing performance evaluation just to get some pretty graphs that show that a given technical approach works well in a broad range of cases.

Throwaway prototype vs. robust solution

On the other hand, one thing that Computer Scientists are not often good at is developing production-quality code. I know I am still working at it. The joke is that most academics write code so flimsy that it collapses into a pile of bits as soon as the paper deadline passes. Developing code that is truly robust, scales well, is easy to maintain, well-documented, well-tested, and uses all of the accepted best practices is not something academics are trained to do. I enjoy working with hardcore software engineers at Google who have no problem pointing out the totally obvious mistakes in my own code, or suggesting a cleaner, more elegant approach to some ass-backwards page of code I submitted for review. So there is a lot that Computer Scientists can learn about writing "real" software rather than prototypes.

My team at Google has a good mix of folks from both development and research backgrounds, and I think that's essential to striking the right balance between rapid, efficient software development and pushing the envelope of what is possible.

37 comments:

  1. Nice analysis and comparison.

    So, what should PhD students do in graduate school? I mean, should we continue to "think big" instead of putting more effort into better programming skills?

    I didn't get your point in this article.

  2. The ironic thing is that while PhD students need to "think big" to come up with the context for the problem they solve, the actual problem/solution that constitutes the PhD thesis itself is often very small and incremental. For example, you might be tackling a large, hard problem like "making sensor networks power-efficient" or "making software more robust" but the actual work you do will, at best, be comparable to an industry project in scope (usually much smaller since you're just a single person, unless you work on a large group systems project).

    But that's a bit of a tangent. I don't think Matt was trying to give advice as to whether graduate students should put effort into improving their programming skills.

  3. Anon x 2 - Right, I don't think that you *can* learn to write production-quality code while doing a PhD (unless you do an internship at a place like Google, Amazon, or Microsoft - but even then your exposure is limited). Nor should you - the point of doing a PhD is to do research, not build a real product. If I had a PhD student who spent too much time "productizing" their work I'd say that was not a good use of time; the only thing you need code for during a PhD is as a vehicle to publish papers. Some people (like Eddie Kohler, with Click) do put out very nice software artifacts as the result of their PhD work, but this is the exception rather than the rule.

  4. Just a short note: along the same lines, IT is not CS either.

  5. As a former PhD student now working on production-quality code, I think the main difference is that you have the luxury of not worrying about legacy code when you are doing your PhD. It's an enormous luxury to not worry about customer issues and legacy code. Also, your tools don't have to be approved by management. You can use the right tool for the job instead of using some in-house crap because it exists. It's being pigeonholed in this fashion that leads to the Get 'er Done attitude and the downplaying of Big Think. One thing I've noticed is that it's easy for computer scientists to value programmers, because more often than not we get stuck with some linker error that won't go away and Programmer X says, "Just define a static" and then magic happens. Companies in general don't seem to explicitly value computer scientists as much as programmers. Is that your experience? Obviously, Google might be an exception. I ask because as a former academic, you probably push for producing computer scientists, whereas what industry really values more is a software engineer. Thoughts?

  6. Anon "former PhD student" - There's a lot of truth in this. One thing that concerns me about many companies, even places like Google, is that long-range thinking and big ideas are not always valued. Of course, Google does some pretty long-range stuff, but this varies a lot by project and product area. There are, of course, good reasons for this - we don't always have the luxury of, say, ripping out a huge piece of infrastructure or redesigning something to conform to some academic ideal. So I think it depends a lot on what kind of team you are on and whether other people are coming from academic and research backgrounds. Google has a lot of PhDs, and so do places like Microsoft and IBM. Many other companies may not have this in their DNA.

  7. Hi Matt,

    As far as the "developer" vs. "scientist" distinction goes, do you think that the Google performance review/leveling process respects the contributions that people with a more scientific/research mindset make, or do you think that they care more about productivity as a developer in the usual sense?

  8. Dear Matt: Some good observations. Like math, there is sometimes inherent beauty in how a problem is solved (cf. Dijkstra). The industry/academic comparison is not universal outside of computer science; in fact, at Intel the problems we worked on moved much slower than they would in academia simply because the stakes to get the right working solution were so much higher, be it a FUB (functional block) design or some circuit characterization.

    OTOH, no offense (to any of us), but is navel-gazing an occupational hazard in computer science?!

  9. Anon re: performance review process: I think Google does a pretty good job at this. Of course, Googlers are expected to build real things and release real, working code. But there is plenty of room here for long-range projects. As I said, there are plenty of PhDs here.

    Rajesh - no doubt navel-gazing is dangerous -- this is the whole purpose of my blog though :-)

  10. Good point Matt. Volatile and Dangerous!

  11. This comment has been removed by the author.

  12. Sometimes I find it interesting: in the paper review process, reviewers always require submissions to offer a general and complete solution to their problems. However, just as you mentioned in the article, academics are really not good at writing production-level code, so the so-called solution is just described in the paper; in the real world, it is buggy.

  13. I have to agree with codingCat. In research there are many things considered implemented and working so "we can safely assume that"... The quoted sentence is of course not true. As a developer who, in my master's thesis, developed some scientific papers (for the very purpose of evaluating them), I have observed that there are many "safe" assumptions that the code does not agree with. Oh, and the scope of scientific research is usually very limited (most papers are about a huge problem but solve a specific sub-category). The real purpose of scientific discovery in production environments is a best solution for a (seemingly) small problem, such as approximation of NP problems and online algorithms, where the best solution (so far) is purely mathematical and can be implemented and used in production.

  14. Good analysis. I noted that you seemed to use the CS and SE designations interchangeably. Are the SE-trained students closer to the expert practitioners than the CS students are? For my part, as a Prof. of Information Technology with a background (MS) in CS, I'm very aware that my own training is deficient in this way and try to overcome that in my teaching practice. We see our discipline (IT) as applied CS and SE.

    Peace

  15. "We need to do away with the myth that computer science is about computers. Computer science is no more about computers than astronomy is about telescopes, biology is about microscopes or chemistry is about beakers and test tubes." Fellows, M.R., and Parberry, I.

  16. This comment has been removed by the author.

  17. Maybe you already read it, but in any case here is a related article by a colleague of yours (whom I admire): Misko Hevery --> http://misko.hevery.com/2009/07/11/computer-engineer-vs-computer-scientist/

  18. "What they sometimes don't ask is 'what can my approach not do?'"

    While this is a valuable perspective, I think we still need to ask if it's always relevant. Certainly there's value in a comprehensive solution that works in every case and covers a broad range of problems, but at what cost? Such academic solutions are often much more complex and lead to issues with readability and maintainability. It's great to have someone looking at the big picture, but not every problem has a bigger picture. Sometimes a clear and straightforward engineering solution is all you really need.

  19. Dear Matt: I view students from NUST in Zimbabwe to be better.

    How do your programmers become better?
    Where did they learn databases, documentation, efficiency and all of this stuff, I wonder.
    Surely not by certification, because you will need 1) Java Programmer certificate
    2) SQL Server certificate
    3) Documentation certificate
    for just one person to satisfy what you are talking about.
    As for UML, you will need to read up on it alone, because I don't think there are any certificates on this.

    The problem with today's systems is that they are now interdependent, and these problems need a computer scientist.

    Programmers are a burden to the company in that the company has to pay for inefficiencies for their learning and lack of adaptability.

    Now can you please go to their offices and ask them where they learnt what I mentioned above? I'm guessing a computer scientist working there taught them.

    Also research on material certifications provide
    I'm expecting a reviewed report

    thanks

    tawanda chinaka

  20. At least it's clear that programmers and comp. sci.s are different (of course they're different, but non-techie personnel don't know this stuff, which makes life harder), though everyone can be both :D

    Good article! Keep it up!

  21. "Think Big vs. Get 'er Done"

    I prefer to see this as "Plan vs Make it Happen"
    Both of these roles are needed, and neither is less important than the other. As developers, we take those plans @ 10,000 feet and bring them down to reality and make them work. It is really hard for the majority of the time to see a forest if you're standing in it.

    "What they sometimes don't ask is 'what can my approach not do?'"
    There is actually one other side here.
    "What are they going to try and do with this?"

  22. Good points.

    Computer science is engineering, programming is construction. Both are needed: you don't want a welder to design your bridge, and you don't want an engineer to put it together for you.

    Most experienced programmers/computer scientists have done both, but few do both well and most end up doing one or the other most of the time. There are far more programmers than computer scientists, but they are also needed in larger numbers.

    What I find irritating are programmers who fancy themselves computer scientists, by talking about theory all day long and designing (but never completely coding) unnecessary, over-engineered frameworks to solve non-existent problems, just to show off. If you're hired to be a programmer, be a programmer, and do the work people are depending on you for. Otherwise, a junior programmer fresh out of school is more useful.

  23. Thanks dude,

    I too have been in a dilemma over this 'programming vs. invention' thing for 6-7 years.

    This helps me to rethink/restructure my career and life goal.

    Programmers are mere users, while inventors force them to use those.

  24. I always try to be generous when I say I must have been attending college computer programming classes during a bad patch. Since I already knew how to program, I was taking classes in pursuit of that piece of paper. I had to drop out after a while; I couldn't stand the blatantly bad programming methodologies that were being taught. I was the class debugger and found myself debugging the programs of the professors teaching the class. So I totally agree that PhDs are not necessarily computer scientists. Although I was more worried about my classmates who were going to go out into the world thinking they knew how to program.

  25. Computer science has lots of options other than programming.

    Mr. Matt, I fully agree with you. This blog may change something in computer science.

  26. Hello Matt, the title could be somewhat misleading or very eye-catching, depending on your POV. However, I find you are describing approaches rather than clearly defined roles. The approaches are targeted at one task: writing programs.

    I find this sort of scenery very interesting. I believe the fact that the approaches differ is related to the nature of where the task is being done. Industry has different constraints and goals than academia.

    As this is a subject of interest to me, I do agree that a richer interaction between both could benefit them, as you conclude and have experienced yourself at Google.

    regards.

  27. Although I appreciate the perspective and insight a computer scientist can bring, I've had difficulty with them on my teams in the past. Generally speaking, problems have arisen around over-architected, overly complex OO solutions that are colored by some theory or approach that they heard in a classroom somewhere. To some extent, ignorance is bliss. Coders who aren't familiar with more esoteric theoretical concepts that have no real bearing on producing software are not limited by this mental box.

  28. I think this is probably a case of square pegs vs. round holes.
    I know a lot of people that have computer science degrees, and I usually have to fight them as they try to redesign my frameworks.
    On the other hand, I know programmers who will not change an obviously wrong or horrible piece of code in the module they were assigned to debug... a classic case of doing only what the bug report said.
    I think the best will be able to do BOTH... after they get enough seasoning, of course ;-)

  29. Lots of good comments here.

    A couple of comments point out that Computer Scientists can sometimes over-engineer solutions to problems. I agree with this entirely. I have this tendency myself, and one thing I have to keep reminding myself of at Google is that this isn't about writing papers, or impressing my academic friends: it's about getting shit done. Still, there's a fine line between that and cutting corners that really should not be cut.

    Anyway, I didn't mean this post to be yet another ontological debate about "CS versus engineering". The point was that there are things that both self-styled "computer scientists" and "programmers" can learn from each other in terms of setting priorities, generalizing solutions, and building real code.

  30. I believe that the difference between a Computer Scientist and a programmer is the amount of math one knows. For example, if someone has _NEVER_ even heard of Markov chains then he/she is not educated in CS.

    You may still be the best programmer the world has ever seen.

    There is very little correlation between good programming and CS as you rightly point out!

  31. Computer scientists are scientists first and foremost. They just happen to study algorithms and the different ways data can be arranged.
    Much like material scientists that study the various physical properties of their chosen materials.
    Software engineers are engineers first and foremost. They design solutions to customer problems involving software-based systems.
    Programmers are the construction workers of the software industry. They take the software engineer's specifications and turn them into something the customer can use.
    I wouldn't let a pure computer scientist with no other experience any closer to a customer's system than I would let a materials scientist near a civil construction project.

    As pointed out earlier, these are different roles requiring different skill sets. The lines get slightly blurred in the IT industry because there is less regulation as far as what your qualification entitles you to do professionally than there is in the physical engineering fields.

    IT workers take advantage of this to shortcut into new areas in a way they simply would not be allowed to do in civil or electrical engineering.

    For example: Broadly speaking, in the electricity industry (In Australia) you are not allowed to create any form of "new" physical circuit (Above 50V AC IIRC) if you are not a qualified electrician (trade qualified); you are not allowed to perform any electrical calculation or circuit design, unless you have an electrical AD (Para-professional) and even then it has to be checked and approved by a fully qualified electrical engineer; You are not allowed to provide an engineering service (design) unless you have completed at least a four year electrical engineering degree.
    Please Note: Engineers are not electricians and are therefore not allowed to create new physical circuits! (Above 50V AC) Different qualifications are needed because they are hugely different skill sets.
    There are some caveats on these and I have simplified the qualifications the engineer needs, but this is basically what the legislation says.
    There is no such regulation on the software side of the IT industry. As a result we get the equivalent of enthusiastic amateurs trying to pass themselves off as engineers (a serious offence in the electrical industry), or the equivalent of para-professionals passing themselves off as computer scientists.

    Sometimes people actually have multiple skill sets. I know a few Electrical Engineers that started in the industry by completing an electrician's apprenticeship and studied part time, working their way up over 10 years or more. They are hugely valuable in any role, but they are not that common.
    On the other hand, in the IT industry I know heaps of people that did some basic TAFE programming courses and now call themselves software or computer engineers, at least until I corrected their world view.

  32. You are right. After having spent 4 years as a computer science undergraduate and being close to graduation, I felt like an idiot to have gone to school to study computer science, because being a developer is quite different from being a programmer, and what it takes to be a developer is not being taught in school. I mean principles and concepts like clean code, TDD, subversioning, patterns and practices. I am seriously learning and want to be a good developer.

  33. I would just like to quote Steve Jobs on this: "Real artists ship". And that's what matters in a company. But you are right as far as academics go. It's really important to study and question an existing process or model that's bad, but companies don't always have that liberty.

  34. Susan LoVerso and Margo Seltzer wrote a nice article about some of the issues mentioned here "Tree Houses and Real Houses: Research and Commercial Software". We read this as students and I still go back to it every now and then.

  35. Valuable Article.. Great amount of good comments.. My 2 cents..
    Science : Exploring Truth- Discovering Truth- Verifying Truth - Creating Knowledge - Solve a general problem
    Engineering: Use Science - Apply the solution towards specific problems. Combine multiple solutions to solve a bigger custom problem.
    When there are no known solutions, an engineer becomes a scientist, and when there are solutions to be reused for a bigger problem, a scientist becomes an engineer. Both inductive (generalization) and deductive (specification) reasoning need to be there for both, but the percentage distribution varies. For an engineer, it is 70% deductive + 30% inductive,
    whereas for a scientist it is the other way round.

  36. Also, not every scientist can develop good products.

  37. I think I agree with Matt a lot. Although I only have a Bachelor's degree in CS, I think of myself more as a computer scientist than a programmer. Oftentimes, I had to deal with my manager's "Get it done quick and fast" attitude, while I tried to take a longer view and give more thought to how to solve a problem. That seemed to annoy him a lot. And he was a PhD student at that time :)
