Monday, March 12, 2012

Do you need a PhD?

Since I decamped from the academic world to industry, I am often asked (usually by first- or second-year graduate students) whether it's "worth it" to get a PhD in Computer Science if you're not planning a research career. After all, you certainly don't need a PhD to get a job at a place like Google (though it helps). Hell, many successful companies (Microsoft and Facebook among them) have been founded by people who never got their undergraduate degrees, let alone a PhD. So why go through the 5-to-10 year, grueling and painful process of getting a PhD when you can just get a job straight out of college (degree or not) and get on with your life, making the big bucks and working on stuff that matters?

Doing a PhD is certainly not for everybody, and I do not recommend it for most people. However, I am really glad I got my PhD rather than just getting a job after finishing my Bachelor's. The number one reason is that I learned a hell of a lot doing the PhD, and most of the things I learned I would never get exposed to in a typical software engineering job. The process of doing a PhD trains you to do research: to read research papers, to run experiments, to write papers, to give talks. It also teaches you how to figure out what problem needs to be solved. You gain a very sophisticated technical background doing the PhD, and from having your work subject to the intense scrutiny of the academic peer-review process -- not to mention your thesis committee.

I think of the PhD a little like the Grand Tour, a tradition in the 17th and 18th centuries where youths would travel around Europe, getting a rich exposure to high society in France, Italy, and Germany, learning about art, architecture, language, literature, fencing, riding -- all of the essential liberal arts that a gentleman was expected to have experience with to be an influential member of society. Doing a PhD is similar: You get an intense exposure to every subfield of Computer Science, and have to become the world's leading expert in the area of your dissertation work. The top PhD programs set an incredibly high bar: a lot of coursework, teaching experience, qualifying exams, a thesis defense, and of course making a groundbreaking research contribution in your area. Having to go through this process gives you a tremendous amount of technical breadth and depth.

I do think that doing a PhD is useful for software engineers, especially those who are inclined to be technical leaders. There are many things you can only learn "on the job," but doing a PhD, and having to build your own compiler, or design a new operating system, or prove a complex distributed algorithm from scratch is going to give you a much deeper understanding of complex Computer Science topics than following coding examples on StackOverflow.

Some important stuff I learned doing a PhD:

How to read and critique research papers. As a grad student (and a prof) you have to read thousands of research papers, extract their main ideas, critique the methods and presentation, and synthesize their contributions with your own research. As a result you are exposed to a wide range of CS topics, approaches for solving problems, sophisticated algorithms, and system designs. This is not just about gaining the knowledge in those papers (which is pretty important), but also about becoming conversant in the scientific literature.

How to write papers and give talks. Being fluent in technical communications is a really important skill for engineers. I've noticed a big gap between the software engineers I've worked with who have PhDs and those who don't in this regard. PhD-trained folks tend to give clear, well-organized talks and know how to write up their work and visualize the results of their experiments. As a result they can be much more influential.

How to run experiments and interpret the results. I can't overstate how important this is. A systems-oriented PhD requires that you run a zillion measurements and present the results in a way that is both bullet-proof to peer-review criticism (in order to publish) and visually compelling. Every aspect of your methodology will be critiqued (by your advisor, your co-authors, your paper reviewers) and you will quickly learn how to run the right experiments, and run them right.

How to figure out what problem to work on. This is probably the most important aspect of PhD training. Doing a PhD will force you to cast away from shore and explore the boundary of human knowledge. (Matt Might's cartoon is a great visualization of this.) I think that at least 80% of making a scientific contribution is figuring out what problem to tackle: a problem that is at once interesting, open, and going to have impact if you solve it. There are lots of open problems that the research community is not interested in (e.g., writing an operating system kernel in Haskell). There are many interesting problems that have been solved over and over and over (e.g., filesystem block layout optimization; wireless multihop routing). There's a real trick to picking good problems, and developing a taste for it is a key skill if you want to become a technical leader.

So I think it's worth having a PhD, especially if you want to work on the hardest and most interesting problems. This is true whether you want a career in academia, a research lab, or a more traditional engineering role. But as my PhD advisor was fond of saying, "doing a PhD costs you a house." (In terms of the lost salary during the PhD years - these days it's probably more like several houses.)


16 comments:

  1. Voilà:
    http://dl.dropbox.com/u/265383/office_wall.jpg

    :)

  2. Some people tend to be born designers, aka scientists. This does not mean they know how to get done what they propose, hence all the debate about bad coding practices in academia. Some people are born implementers; they can't innovate, yet know how to do their job, just like a pilot knows how to fly a plane but cannot necessarily propose, for example, using an alternate metal for planes to help them fly faster. Rarely, a person is both a good designer and implementer. One should do an honest self-appraisal and then decide what will utilize their strengths in the best possible way.

  3. I don't think it's currently true (in the US, in the year 2012, in CS or EE) that "doing a PhD costs you a house". In the past couple of decades the market has priced the starting salary of a strong PhD well above the starting salary with an MSc. In the long run this makes up any lost income difference. Still, I agree that one is not doing a Ph.D. for the money, but my point is that the financial cost is not what it used to be in the 90s.

  4. I appreciate your analysis. I think you give a good outline of what someone considering a PhD can hope to learn, and how it will benefit them. While I think many people come out of top tier PhD programs with these skills, it is certainly not assured, and quite a few people coming out of those programs were very good going in. Are you sure there is as much causality when it comes to giving good talks, methodical science, or identifying good problems? That is, given a top flight candidate for top PhD programs, how much better will they really be on the other side?

    Of course, the PhD can also be a useful signal, even if it doesn't improve a given person that much. But how many places still respect that signal today, and how many in 5-10 years? If the best candidates are leaving academia for Google/Facebook/etc, are the people staying through the PhD going to be eroding the value of that signal?

  5. As someone who was always interested in getting a PhD but deterred or unable to do so for various reasons, I am curious about how I can acquire the disciplined and principled approach to research. Are there any tips to get started?

  6. Thanks a lot ... your post was really helpful for me ... :)

  7. If someone wanted to learn those four important things you outlined but wasn't interested in going through a PhD program, what would you suggest to them?

  8. Anon @12:15
    "I don't think it's currently true (in the US, in the year 2012, in CS or EE) that "doing a PhD costs you a house". In the past couple of decades the market has priced the starting salary of a strong PhD well above the starting salary with an MSc."

    Where I work (competitive firm), Masters start around 100K and PhDs around 125K. The delta is usually made up in 3-4 years by most MS graduates. When you factor in lost equity and lost salary for 3-4 years, I don't think it works out. I don't think you lose a house. You lose about 100-150K in savings, which in the Bay Area is just a down payment. Of course, you could be losing equity in a hot shot startup, in which case you have lost several houses. But, on balance, you just tend to lose 100-150K if the economy is what it was in the 2000s when nothing except gold went up (and you didn't invest in gold).

  9. Your claim that the research community isn't interested in building an OS kernel in Haskell is just not true. These folks: http://hasp.cs.pdx.edu/ built the House system, which was just that, and are now building a new variant of Haskell, called Habit, to continue that work.

  10. In terms of the lost salary during the PhD years - these days it's probably more like several houses.

    Hmmm.... since when are regular software developers getting 400-500K per year? :-)

  11. "...having your work subject to the intense scrutiny of the academic peer-review process"

    I strongly suspect that was just an ironic sentence you inserted in your last post to see if someone could pick it out...
    Whoever has been somehow involved in the peer-review process (of both top CS conferences and journals) knows that the process is just broken.
    What I always tell my Ph.D. students is that the only good thing about the current review process is that it teaches them to fight against some of the typical unfair situations they will inevitably have to face during their lives.
    I repeat: that's the only thing.
    I'll add two other observations:
    - most of the faculty heavily involved in the review process would agree with what I wrote above;
    - most of the faculty heavily involved in the review process would admit to taking personal advantage of the current situation (that's why the system will never change for the better).

  12. Anon re: peer-review process: I have no illusions about the peer-review process; it has problems and could use some fixing. However, I think you will agree that, at least at top conference venues, it does involve "intense scrutiny" as I said. That does not mean that it is always fair or that the outcome is always right. There is a huge difference between having your work reviewed by a program committee at a top conference and, say, writing up stuff in blogs, or even having your peers at work review it. The bar for publication at a good conference is incredibly high. So I stand by the statement.

  13. "The bar for publication at a good conference is incredibly high"

    By the way, I was referring only to TOP CS conferences and journals (I never mentioned blogs or other things...). That the bar is very high, we all know.
    Unfortunately, the technical merit of a paper is not always the only measure the PCs use to assess whether a paper is worthy of clearing that bar (and we all know that as well).
    Thus, there's no point in having "intense scrutiny" if it is not fair.
    As long as the current system is stubbornly defended, instead of clearly corrected, by the same people who are so heavily involved in it, things will always be the same.

  14. Anon: I don't know what you're talking about - you should be specific. You're posting anonymously and using broad generalizations to describe a complex system with a lot of variation across venues. Sounds like you've had a bad experience with program committees not accepting papers that you deem to be worthwhile. I'm not going to say that the peer review process is perfect, but I don't think it's as hopelessly broken as you make it out to be. Mistakes happen but by and large I think the system "works", as inefficient as it can be for program committees.

    My post had nothing to say about whether the system was "fair." Regardless of that, I do believe that having a PhD student run the gauntlet of the peer review process teaches them a lot. Getting a paper accepted is a distant second to having gone through the process of trying.

