Thursday, July 11, 2013

Does the academic process slow innovation?

I've been wondering recently whether the extended, baroque process of doing research in an academic setting (by which I mean either a university or an "academic style" research lab in industry) is doing more harm than good when it comes to the pace of innovation.

[Image from http://academicnegativity.tumblr.com/]
Prior to moving to industry, I spent my whole career as an academic. It took me a while to get used to how fast things happen in industry. My team, which is part of Chrome, does a new major release every six weeks. This is head-spinningly fast compared to academic projects. Important decisions are made on the order of days, not months. Projects are started up and executed an order of magnitude faster than a similarly-sized academic research group could even get up to speed.

This is not just about having plenty of funding (although that is part of it). It is also about what happens when you abandon the trappings of the academic process, for which the timelines are glacial:
  • A three-month wait (typically) to get a decision on a conference submission, during which time you are not allowed to submit similar work elsewhere.
  • A six-month wait to hear back on a grant proposal submission.
  • A year or more of waiting for a journal publication, with a similar restriction on parallel submissions.
  • Five-plus years to get a PhD.
  • Possibly one or two years as a postdoc.
  • Six to eight years to get tenure.
  • A lifetime of scarring as the result of the above. (Okay, I'm kidding. Sort of.)
This is not a problem unique to computer science of course. In the medical field, the average age at which a PI receives their first NIH R01 grant is 44 years. Think about that for a minute. That's 23-some-odd years after graduation before an investigator is considered an "independent" contributor to the research field. Is this good for innovation?

Overhead

Part of the problem is that the academic process is full of overheads. Take a typical conference program committee for example. Let's say the committee has 15 members, each of whom has 30 papers to review (this is pretty average, for good conferences at least). Each paper takes at least an hour to review (often more) - that's the equivalent of at least 4 work days (that is, assuming academics work only 8 hours a day ... ha ha!). Add on two more full days (minimum) for the program committee meeting and travel, and you're averaging about a full week of work for each PC member. Multiply by 15 -- double it for the two program co-chairs -- and you're talking about around 870 person-hours combined effort to decide on the 25 or so papers that will appear in the conference. That's 34 person-hours of overhead per paper. This doesn't count any of the overheads associated with actually organizing the conference -- making the budget, choosing the hotel, raising funds, setting up the website, publishing the proceedings, organizing the meals and poster sessions, renting the projectors ... you get my point.
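
If you want to check that arithmetic, here is the back-of-the-envelope calculation as a quick sketch (the hour counts are my rough guesses from above, not measured data):

    # Back-of-the-envelope estimate of program committee overhead.
    # All inputs are rough assumptions, not measured data.
    pc_members = 15
    papers_per_member = 30
    hours_per_review = 1           # often more in practice
    meeting_and_travel_hours = 16  # two full days, minimum

    hours_per_member = papers_per_member * hours_per_review + meeting_and_travel_hours
    total_hours = pc_members * hours_per_member

    # Assume each of the two co-chairs does roughly double a member's work.
    total_hours += 2 * 2 * hours_per_member

    accepted_papers = 25
    print(total_hours)                    # 874 -- call it ~870 person-hours
    print(total_hours / accepted_papers)  # ~35 -- the ~34 figure above, give or take rounding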

The question is: does all of this time and effort (a) produce better science, or (b) lead to greater understanding or impact? I want to posit that the answer is no. This process was developed decades ago, in a pre-digital era when we had no other way to disseminate research results. (Hell, it's gotten much easier to run a program committee now that submissions are done via the web -- it used to be that you had to print out 20 copies of your paper and mail them to the program chair, who would mail out large packets to each of the committee members.)

But still, we cling to this process because it's the only way we know how to get PhD students hired as professors and get junior faculty tenured -- any attempt to buck the trend would no doubt jeopardize the career of some young academic. It's sad.

How did we get here?

Why do we have these processes in the first place? The main reason is competition for scarce resources. Put simply, there are too many academics competing for too little funding and too few paper-slots in good conference venues. Much has been said about the sad state of public funding for science research. Too many people competing for the same pool of money means longer processes for proposal reviews and more time re-submitting proposals when they get rejected.

As far as the limitation on conferences goes, you can't create more conferences out of thin air, because people wouldn't have time to sit on the program committees and travel to all of them (ironic, isn't it?). Whenever someone proposes a new conference venue there are groans of "but how will we schedule it around SOSP and OSDI and NSDI and SIGCOMM?!?" - so forget about that. Actually, I think the best model would be to adopt the practice of some research communities and have one big mongo conference every year that everybody goes to (ideally in Mexico) and have USENIX run it so the scientists can focus on doing science and leave the conference organization to the experts. But I digress.

The industrial research labs don't have the same kind of funding problem, but they still compete for paper-slots. And I believe this inherently slows everything down because you can't do new research when you have to keep backtracking to get that paper you spent so many precious hours on finally published after the third round of rejections with "a strong accept, two weak accepts, and a weak reject" reviews. It sucks.

Innovative != Publishable

My inspiration for writing this post came from the amazing pace at which innovation is happening in industry these days. The most high-profile of these are crazy "moon shot" projects like SpaceX, 23andMe, and Google's high-altitude balloons to deliver Internet access to entire cities. But there are countless other, not-as-sexy innovations happening every day at companies big and small, just focused on changing the world, rather than writing papers about it.

I want to claim that even with all of their resources, had these projects gone down the conventional academic route -- writing papers and the like -- they would have never happened. No doubt if a university had done the equivalent of, say, Google Glass and submitted a MobiSys paper on it, it would have been rejected as "not novel enough" since Thad Starner has been wearing a computer on his head for 20 years. And high-altitude Internet balloons? What's new about that? It's just a different form of WiFi, essentially. Nothing new there.

We still need to publish research, though -- publication is important for driving innovation. But we should shift to an open, online publication model -- like arXiv -- where everything is "accepted" and papers are reviewed and scored informally after the fact. Work can get published much more rapidly, and good work won't be stuck in the endless resubmission cycle. Scientists can stop wasting so much time and energy on program committees and conference organization. (We should still have one big conference every year so people get to meet and drink and bounce ideas around.) This model is also much more amenable to publications from industry, which currently has little incentive to run the conference-submission gauntlet unless publishing papers is part of the job description. And academics can still use citation counts or "paper ratings" as the measure by which hiring and promotion decisions are made.

Wednesday, May 15, 2013

What I wish systems researchers would work on

I just got back from HotOS 2013 and, frankly, it was a little depressing. Mind you, the conference was really well-organized, with lots of great people, an amazing venue, and fine work by the program committee and chair... but I could not help being left with the feeling that the operating systems community is somewhat stuck in a rut.

It did not help that the first session was about how to make network and disk I/O faster, a topic that has been a recurring theme for as long as "systems" has existed as a field. HotOS is supposed to represent the "hot topics" in the area, but when we're still arguing about problems that are 25 years old, it starts to feel not-so-hot.

Of the 27 papers presented at the workshop, only about 2 or 3 would qualify as bold, unconventional, or truly novel research directions. The rest were basically extended abstracts of conference submissions that are either already in preparation or will be submitted in the next year or so. This is a perennial problem for HotOS, and when I chaired it in 2011 we had the same problem. So I can't fault the program committee on this one -- they have to work with the submissions they get, and often the "best" and most polished submissions represent the most mature (and hence less speculative) work. (Still, this year there was no equivalent to Dave Ackley's paper in 2011 which challenged us to "pledge allegiance to the light cone.")

This got me thinking about what research areas I wish the systems research community would spend more time on. I wrote a similar blog post after attending HotMobile 2013, so it's only fair that I would subject the systems community to the same treatment. A few ideas...

Obligatory disclaimer: Everything in this post is my personal opinion and does not represent the view of my employer.

An escape from configuration hell: A lot of research effort is focused on better techniques for finding and mitigating software bugs. In my experience at Google, the vast majority of production failures arise not from bugs in the software, but from bugs in the (often vast and incredibly complex) configuration settings that control the software. A canonical example is when someone bungles an edit to a config file that then gets rolled out to the fleet and causes jobs to start behaving in new, often undesirable ways. The software is working exactly as intended, but the bad configuration is leading it to do the wrong thing.

This is a really hard problem. A typical Google-scale system involves many interacting jobs running very different software packages, each with its own mechanism for runtime configuration: command-line flags, some kind of special-purpose configuration file (often in a totally custom ASCII format), or a fancy dynamically updated key-value store. The configurations often operate at very different levels of abstraction -- everything from deciding where to route network packets, to Thai and Slovak translations of UI strings seen by users. "Bad configurations" are not just obvious things like syntax errors; they also include unexpected interactions between software components when a new (perfectly valid) configuration is used.

There are of course tools for testing configurations, catching problems and rapidly rolling back bad changes, etc. but a tremendous amount of developer and operational energy goes into fixing problems arising due to bad configurations. This seems like a ripe area for research.
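
To make the flavor of the problem concrete, here is a toy sketch of the kind of semantic checking that's needed. (The config format and the rules below are entirely made up for illustration -- they are not any real system's.) The point is that each of these settings parses fine; only cross-system knowledge can tell you they are wrong:

    # Toy illustration: a syntactically valid config that is semantically bad.
    # The format and rules are hypothetical, purely for illustration.
    config = {
        "backend": "us-east.cache",  # parses fine, but this cell was turned down
        "timeout_ms": 50,            # a valid integer, but below the backend's p99
        "rollout_fraction": 1.0,     # a valid float, but skips the staged rollout
    }

    LIVE_BACKENDS = {"us-west.cache", "eu.cache"}
    BACKEND_P99_MS = 120

    def validate(cfg):
        """Semantic checks: each rule encodes knowledge no parser can have."""
        errors = []
        if cfg["backend"] not in LIVE_BACKENDS:
            errors.append("backend %r is not serving" % cfg["backend"])
        if cfg["timeout_ms"] < BACKEND_P99_MS:
            errors.append("timeout is below the backend's p99; expect mass timeouts")
        if cfg["rollout_fraction"] > 0.01:
            errors.append("new configs should canary on <= 1% of the fleet first")
        return errors

    for err in validate(config):
        print("CONFIG ERROR:", err)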

Understanding interactions in a large, production system: The common definition of a "distributed system" assumes that the interactions between the individual components of the system are fairly well-defined, and dictated largely by whatever messaging protocol is used (e.g., two-phase commit, Paxos). In reality, the modes of interaction are vastly more complex and subtle than simply reasoning about state transitions and messages, in the abstract way that distributed systems researchers tend to cast things.

Let me give a concrete example. Recently we encountered a problem where a bunch of jobs in one datacenter started crashing due to running out of file descriptors. Since this roughly coincided with a push of a new software version, we assumed that there must have been some leak in the new code, so we rolled back to the old version -- but the crash kept happening. We couldn't just take down the crashing jobs and let the traffic flow to another datacenter, since we were worried that the increased load would trigger the same bug elsewhere, leading to a cascading failure. The engineer on call spent many, many hours trying different things and trying to isolate the problem, without success. Eventually we learned that another team had changed the configuration of their system which was leading to many more socket connections being made to our system, which put the jobs over the default file descriptor limit (which had never been triggered before). The "bug" here was not a software bug, or even a bad configuration: it was the unexpected interaction between two very different (and independently-maintained) software systems leading to a new mode of resource exhaustion.
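
As a small postscript to that story, here is how invisible that kind of limit is -- a tiny sketch in ordinary Python against the ordinary OS limits, nothing Google-specific:

    # Minimal sketch: per-process file descriptor limits (Linux/macOS).
    # Each open socket consumes a descriptor; hit the soft limit and
    # accept()/open() start failing with EMFILE -- in a process whose
    # code never changed, exactly the failure mode described above.
    import resource
    import socket

    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print("fd limit: soft=%d, hard=%d" % (soft, hard))

    sockets = []
    try:
        while True:
            sockets.append(socket.socket())  # burn descriptors until we hit the wall
    except OSError as e:
        print("hit the limit after %d sockets: %s" % (len(sockets), e))
    finally:
        for s in sockets:
            s.close()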

Somehow there needs to be a way to perform offline analysis and testing of large, complex systems so that we can catch these kinds of problems before they crop up in production. Of course we have extensive testing infrastructure, but the "hard" problems always come up when running in a real production environment, with real traffic and real resource constraints. Even integration tests and canarying are a joke compared to how complex production-scale systems are. I wish I had a way to take a complete snapshot of a production system and run it in an isolated environment -- at scale! -- to determine the impact of a proposed change. Doing so on real hardware would be cost-prohibitive (even at Google), so how do you do this in a virtual or simulated setting?

I'll admit that these are not easy problems for academics to work on: unless you have access to a real production system, you're unlikely to encounter them in an academic setting, and replicating them there is difficult. Doing internships at companies is a great way to get exposure to this kind of thing.

Pushing the envelope on new computing platforms: I also wish the systems community would come back to working on novel and unconventional computing platforms. The work on sensor networks in the 2000's really challenged our assumptions about the capabilities and constraints of a computer system, and forced us down some interesting paths in terms of OS, language, and network protocol design. In doing these kinds of explorations, we learn a lot about how "conventional" OS concepts map (or don't map) onto the new platform, and the new techniques can often find a home in a more traditional setting: witness how the ideas from Click have influenced all kinds of systems unrelated to its original goals.

I think it is inevitable that in our lifetimes we will have a wearable computing platform that is "truly embedded": either with a neural interface, or with something almost as good (e.g. seamless speech input and visual output in a light and almost-invisible form factor). I wore my Google Glass to HotOS, which stirred up a lot of discussions around privacy issues, what the "killer apps" are, what abstractions the OS should support, and so forth. I would call Google Glass an early example of the kind of wearable platform that may well replace smartphones, tablets, and laptops as the personal computing interface of choice in the future. If that is true, then now is the time for the academic systems community to start working out how we're going to support such a platform. There are vast issues around privacy, energy management, data storage, application design, algorithms for vision and speech recognition, and much more that come up in this setting.

These are all juicy and perfectly valid research problems for the systems community -- if only it is bold enough to work on them.

Sunday, April 21, 2013

The other side of "academic freedom"

My various blog posts about moving from academia to industry have prompted a number of conversations with PhD students who are considering academic careers. The most oft-cited reason for wanting a faculty job is "academic freedom," which is typically described as "being able to work on anything you want." This is a nice theory, but I think it's important to understand the realities, especially for pre-tenure, junior faculty.

I don't believe that most professors (even tenured ones) can genuinely work on "anything they want." In practice, as a professor you are constrained by at least four things:
  • What you can get funding to do;
  • What you can publish (good) papers about;
  • What you can get students to help you with;
  • What you can do better than anyone else in the field.
These are important limitations to consider, and I want to take them one by one.

Funding doesn't come easy. When I was a PhD student at Berkeley, I was fortunate to be a student of David Culler's, who had what seemed like an endless supply of funding from big DARPA and NSF grants, among others. When I went to start my faculty career, he (and many others) told me I would have "no problem" getting plenty of funding. This turned out not to be true. Shortly after I started my faculty job, DARPA all but shut down their programs in computer science, and NSF grants became heavily constrained (and much more competitive). Being a freshly-minted faculty member meant I was essentially a nobody, but that didn't mean that NSF review panels took pity on me -- apart from special programs like the CAREER award, you're competing with the most senior, established people in your field for every grant. To make matters worse, I didn't have a lot of senior colleagues in my area at Harvard to write proposals with, so I mostly had to go it alone.

Now, I will readily admit that I suck at writing grants, although according to my colleagues my hit rate for funding was about on par with other profs in my area. However, there were several projects that I simply could not do because I couldn't get funding for them. I tried for four years to get an NSF grant for our work on monitoring volcanoes with sensor networks -- which was arguably the thing I was most famous for as a professor. I failed. As a result we never did the large-scale, 100-node, multi-month study that we had hoped to do. It was a huge disappointment and taught me a valuable lesson that you can't work on something that you can't get funding for.

Who decides which problems are sexy (and therefore publishable)? I'll tell you: it's the 30-some-odd people who serve on the program committees of the top conferences in your area year after year. It is very rare for a faculty member to buck the trend of which topics are "hot" in their area, since they would run a significant risk of not being able to publish in the top venues. This can be absolutely disastrous for junior faculty who need a strong publication record to get tenure. I know of several faculty who were denied tenure specifically because they chose to work on problems outside of the mainstream, and were not able to publish enough top papers as a result. So, sure, they could work on "anything they wanted," but that ended up getting them fired.

Now, there are some folks (David Culler being one of them) who are able to essentially start new fields and get the community to go along with them. I argue that most professors are not able to do this, even tenured ones. Most people have to go where the funding and the publication venues are.

What can you get students to work on? I don't mean this in a grad-students-won't-write-unit-tests kind of way (although that is also true). What I mean is: how likely is it that you will find grad students in your field who have the requisite skills to undertake a particular research agenda? In my case, I would have killed for some students who really knew how to design circuit boards. Or students who had a deep understanding of compiler optimization -- but still wanted to work on (and publish in) the area of operating systems. A bunch of times I felt that the problems I could tackle were circumscribed by my students' (and my own) technical skills. This has nothing to do with the "quality" of the students; it's just the fact that PhD students (by definition) have to be hyper-specialized. This means that grad students in a given area tend to have a fairly narrow set of skills, which can be a limitation at times.

Can you differentiate your research? The final (and arguably most important) aspect of being successful as a faculty member is being able to solve new problems better than anyone else in your area. It is not usually enough to simply do a better job solving the same problem as someone else -- you need to have a new idea, a new spin, a new approach -- or work on a different problem. Hot areas tend to get overcrowded, making it difficult for individual faculty to differentiate themselves. For a while it felt like everyone was working on peer-to-peer networking. A bunch of "me too" research projects started up, most of which were forgettable. Being one of those "me too" researchers in a crowded area would be a very bad idea for a pre-tenure faculty member.

Do things get better after tenure? I didn't stick around long enough to find out, so I don't know. I definitely know some tenured faculty who are coasting and care a lot less about where and how much they publish, or who tend to dabble rather than pursue a focused research agenda post-tenure. Certainly you cannot get fired if you are no longer publishing or bringing in research dollars, but to me this sounds like an unsatisfying career. Others -- like David Culler -- are able to embark on ambitious, paradigm-shifting projects (like NOW and TinyOS) without much regard to which way the winds are blowing. I think most tenured faculty would agree that, if they care about having impact, they are subject to the same pressures to work on fundable, publishable research as pre-tenure faculty.

Okay, but how much freedom do you have in industry? This is worth a separate post of its own, which I will write sometime soon. The short version is that it depends a lot on the kind of job you have and the kind of company you work for. My team at Google has a pretty broad mandate, which gives us a fair bit of freedom. But unlike in academia, we aren't limited by funding (apart from headcount, which is substantial); by technical skills (we can hire people with the skills we need); or by the somewhat unpredictable whims of a research community or NSF panel. So, yes, there are limitations, but I think they are no more severe, and a lot more rational, than what you often experience as an academic.



Monday, April 8, 2013

Running a software team at Google

I'm often asked what my job is like at Google since I left academia. I guess going from tenured professor to software engineer sounds like a big step down. Job titles aside, I'm much happier and more productive in my new role than I was in my 8 years at Harvard, though there are actually a lot of similarities between being a professor and running a software team.

I lead a team at Google's Seattle office which is responsible for a range of projects in the mobile web performance area (for more background on my team's work see my earlier blog post on the topic). One of our projects is the recently-announced data compression proxy support in Chrome Mobile. We also work on the PageSpeed suite of technologies, specifically focusing on mobile web optimization, as well as a bunch of other cool stuff that I can't talk about just yet.

My official job title is just "software engineer," which is the most common (and coveted) role at Google. (I say "coveted" because engineers make most of the important decisions.) Unofficially, I'm what we call a "Tech Lead Manager," which means I am responsible both for the technical direction of the team as well as doing the people management stuff. (Some people use the alternate term "Über Tech Lead" but this has one too many umlauts for me.) A TLM is not a very common role at Google: most teams have separate people doing the TL and M jobs. I do both in part because, being based out of Seattle, it doesn't make sense to have my team report to a "regular" manager who would likely be in Mountain View. Besides I'm really happy to do both jobs and enjoy the variety.

There are four main aspects to my job: (1) defining the technical agenda for the team and making sure we're successful; (2) writing code of my own; (3) acting as the main liaison between our team and other groups at Google; and (4) doing the "people management" for the team in terms of hiring, performance reviews, promotion, and so forth.

Academics will immediately recognize the parallels with being a professor. In an academic research group, the professor defines the technical scope of the group as well as mentors and guides the graduate students. The big difference here is that I don't consider the folks on my team to be my "apprentices" as a professor would with graduate students. Indeed, most people on my team are much better software engineers than I am, and I lean on them heavily to do the really hard work of building solid, reliable software. My job is to shield the engineers on my team from distractions, and support them so they can be successful.

There are of course many differences from academic life. Unlike a professor, I don't have to constantly beg for funding to keep projects going. I have very few distractions in terms of committees, travel, recommendation letters, and pointless meetings. Of course, I also don't have to teach. (I loved teaching, but the amount of work required to do it well is gargantuan.) Most importantly, my team's success is no longer defined by an arbitrary and often broken peer-review process, which governs pretty much everything that matters in the academic world. This is the best part. If we can execute well and deliver products that have impact, we win. It no longer comes down to making three grumpy program committee members happy with the font spacing in your paper submissions. But I digress.

I do spend about 50% of my time writing code. I really need a few solid hours of hacking each day in order to stay sane. Since I have fewer coding cycles (and service more interrupts) than other people on my team, I tend to take on the more mundane tasks, such as writing MapReduce code to analyze service logs and generate reports on performance. I actually like this kind of work, since it means dealing with a huge amount of data and slicing and dicing it in various interesting ways. I also don't need to show off my heroic coding skills in order to get promoted at this point, so I let the folks who are better hackers implement the sexy new features.
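
For flavor, here is the shape of that kind of job as a toy sketch -- plain Python standing in for a real MapReduce, with an invented log format:

    # Toy map/reduce over request logs: count requests per status code.
    # Plain Python standing in for a real MapReduce; the log format is invented.
    from collections import defaultdict

    LOG_LINES = [
        "2013-04-08T12:00:01 GET /index.html 200 1234",
        "2013-04-08T12:00:02 GET /missing 404 512",
        "2013-04-08T12:00:03 GET /index.html 200 1180",
    ]

    def mapper(line):
        """Emit (status_code, 1) for each request line."""
        _, _, _, status, _ = line.split()
        yield status, 1

    def reducer(key, values):
        """Sum the counts for a single status code."""
        return key, sum(values)

    # The "shuffle" phase: group mapper output by key.
    groups = defaultdict(list)
    for line in LOG_LINES:
        for key, value in mapper(line):
            groups[key].append(value)

    for key in sorted(groups):
        print(reducer(key, groups[key]))  # ('200', 2) and ('404', 1)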

I do exert a lot of influence over the direction that our team's software takes, in terms of overall design and architecture. Largely this is because I have more experience thinking about systems design than some of the folks on my team, although it does mean that I need to defer to the people writing the actual code when there are hairy details with which I am unfamiliar. A big part of my job is setting priorities and making the call when we are forced to choose between several unappealing options to solve a particular problem. (It also means I am the one who takes the heat if I make the wrong decision.)

I reckon that the people management aspects of my job are pretty standard in industry: I do the periodic performance reviews for my direct reports, participate in compensation planning, work on hiring new people to the team (both internally and externally), and advocate for my team members when they go up for promotion. Of course I meet with each of my direct reports on a regular basis and help them with setting priorities, clearing obstacles, and career development.

The most varied part of my job is acting as the representative for our team and working with other teams at Google to make amazing things happen. My team is part of the larger Chrome project, but we have connections with many other teams from all over the world doing work across Google's technology stack. I am also frequently called into meetings to figure out how to coordinate my team's work with other things going on around the company. So it never gets boring. Fortunately we are pretty efficient at meetings (half an hour suffices for almost everything), and even with all of this, my meeting load is about half of what it was as an academic. (Besides, these meetings are almost always productive, compared to academic meetings, of which only about 10% have any tangible outcome.)

Despite the heavy load and lots of irons in the fire, my work at Google is largely a 9-to-5 job. I rarely work evenings and weekends, unless there's something I'm really itching to do, and the volume of email I get drops to near-zero outside of working hours. (Although I am on our team's pager rotation and recently spent a few hours in the middle of the night fixing a production bug.) This is a huge relief from the constant pressure to work, work, work that is endemic to being a professor. I also feel that I get much more done now, in less time, due to fewer distractions and being able to maintain a clear focus. The way I see it is this: if I'm being asked to do more than I can get done in a sane work week, we need to hire more people. Fortunately that is rarely a problem.

Disclaimer: Everything in this post is my personal opinion and does not represent the view of my employer.

Thursday, March 21, 2013

Looking back on 1 million pageviews

This blog just hit one million pageviews.

Seems like a pretty cool milestone to me. I never imagined I'd get so much traffic.

Just for fun, here are the top five most popular posts on this blog so far:

Why I'm Leaving Harvard (99,263 pageviews), in which I announce my departure from Harvard to Google. I guess this post became a kind of touchstone for a bunch of people considering an academic career, or those who also made the decision to leave academia. I'm often asked whether I still think I made the right decision after nearly 3 years at Google. The answer is a resounding yes: I'm extremely happy and my team is doing amazing things - some of which you can read about here.

So, you want to go to grad school? (43,314 pageviews), in which I try to give an honest assessment of why someone should (or should not) do a PhD in Computer Science. The main thing I try to dispel is this myth that you should "take a year off" and work in industry before going to grad school. Way too many students tell me that they plan to do this, and I think it is a really bad idea if you are serious about doing a PhD.

Day in the life of a Googler (33,885 pageviews), which was intended as a tongue-in-cheek look at the difference between a day at Google and a day as a professor. Somehow this got taken seriously by people, and someone sent me a link to a Chinese translation that was getting a lot of hits and comments (in Chinese). My guess is that the intended humor was lost in translation.

How I almost killed Facebook (28,367 pageviews), an early post about the time I tried to talk Mark Zuckerberg out of dropping out of Harvard to do a startup. Thankfully he did not listen to me.

Programming != Computer Science (25,794 pageviews), a little rant against grad students who seem to mix up writing software with doing research.

Of course, not all of my posts have been widely read. Going back over them, it looks like the ones with the smallest number of hits focus on specific research topics, like my trip report for SenSys 2009 (115 pageviews!) and an announcement for postdoc openings in my group (a whopping 68 pageviews). I guess I should stick to blogging about Mark Zuckerberg instead.



Tuesday, March 19, 2013

Moving my life to the cloud

[Image from http://www.flickr.com/photos/clspeace/2250488434/]
I'm in the process of moving my (computing) life entirely to the cloud -- no more laptop: just a phone, a tablet (which I rarely use), and a Chromebook Pixel. My three-year-old MacBook Pro is about to croak, and it seems like now is the time to migrate everything to the cloud, so I can free myself from having to maintain a bunch of files, music, photos, applications, and backups locally. I'd really like to be in a place where I could throw my laptop out of a moving vehicle and not care a bit about what happens to my data. Still, there are some challenges ahead.

The Chromebook Pixel itself is a sweet piece of kit. The keyboard and trackpad are nearly as good as my Mac, and the screen resolution is simply unreal: you CANNOT see the pixels (ironic choice of product name; as if the next version of a Mac would be the "MacBook Virus"). It boots in 10 seconds. Hell, the other day I did a complete OS upgrade (switching from the beta to the dev channel), which took no more than 10 seconds -- including the reboot. The Pixel comes with 1 TB (!) of Google Drive storage, so at this point there's no excuse for not storing all my stuff in the cloud -- this is more space than any laptop I've ever owned.

But you only get to use Chrome!?!? Working at Google, I spend about 70% of my time in Chrome already, so the environment is pretty much exactly what I need. The other 30% of my time is spent ssh'ing into a Linux machine to do software development. The Secure Shell Chrome extension provides a full-on terminal emulator within the browser. I pretty much only use the shell and vim when doing development, so this setup is fine for me.

Since I left academia, I don't have much need for writing papers in LaTeX and doing fancy PowerPoint slides anymore. If those were still major uses of my time, I'd have to find another solution. Google Docs works perfectly well for the kind of writing and presentations I do these days; in fact, the sharing capabilities turn out to be more important than fancy formatting.

What about working offline!?!?! Who the hell ever works offline anymore? I certainly don't. Even on airplanes, the majority of the time I have WiFi. I generally can't get any work done without an Internet connection, so optimizing for "offline" use seems silly to me. If I'm really offline, I'll read a book.

Music? Google Play Music and the Amazon Cloud Player work great. I have a huge music library (some 1,200 albums) which I keep in both places.

Movies and TV shows? It's true that iTunes has the best selection, but what's available on Google Play and Amazon Instant Video is pretty good. I mostly watch movies and TV on my actual TV (crazy, I know) but for "on the road" I think streaming content will work well enough. There's no real offline video playback on the Chromebook as far as I know; for that I can use my Android tablet though. Netflix apparently works fine on the Chromebook, although I unsubscribed from Netflix when they started screwing people over on their pricing.

Of course, it's not all roses. A few pain points, so far:

Migrating my photo library to the cloud was more painful than I had hoped. I have around 70 GB of pictures and videos taken over the years, and wanted to get it onto Google Drive so I'd have direct access to it from the Chromebook. This involved installing the Google Drive Mac app which allowed me to copy everything over, although the upload took a day or so, and it wasn't clear at first if everything was syncing correctly. (I also had to make sure not to sync the photo library on my other machines which had the Drive app installed.)

Managing photos in the cloud still kind of sucks. I'm not happy with any of the cloud-based photo library management solutions that I've found. I have a Flickr Pro account which I use for sharing select pictures with family and friends, but I don't feel comfortable uploading all of my photos to Flickr. I could use Google+, but it's focused more on sharing than on managing a large library. I am not sure what is going on with Picasa these days. Dropbox is another option, which I use for general files, but its photo management is pretty rudimentary as well. For now I'm going to make do with the bare-bones photo support in Google Drive and think about a better way to manage this. What's cool is that I already take all of my photos on my phone, which automatically syncs them to both Google Drive and Dropbox, so there's never a need to physically plug the phone into anything.

Editing plain text files is -- surprisingly -- kind of hard. About the only use I have for plain text files (apart from coding) anymore is writing paper reviews -- I read a PDF in one window and fill in the plain-ASCII review form for HotCRP in the other. There are a couple of Chrome extensions with bare-bones text editors, but they're a far cry from a full-fledged editor. I am experimenting with Neutron Drive, a pretty cool editor/IDE Chrome extension that uses Google Drive as its backend. Maybe I'll have to change my habits and just fill in my reviews in HotCRP directly (see above about not being able to get any work done offline).

Where to keep my really private stuff? By which I mean porn, of course. Or tax returns. Or anything I don't want to (or can't) store in any of the cloud services. This article from VentureBeat does a good job of summarizing the policies of the popular cloud storage providers, but the upshot is that all of them have some mechanism to either take down objectionable content or report it to law enforcement.

What I'd really like is to set up a "private cloud", perhaps running a server at home which I could then access (securely) over the web. There are several solutions for private encrypted cloud storage out there (like Arq and Duplicati), but most of them require some form of specialized client (which won't work on ChromeOS any time soon). I guess I could run a WebDAV server or something on a local box or even a machine in the cloud which I could access through the browser. Still, I'm not sure what to do about this yet. It seems insane to me that it's 2013 and we still don't know how to get file sync right.
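
For what it's worth, serving the files is the easy part -- even Python's built-in HTTP server will put a directory on the web, as in the sketch below. The hard parts are sync, authentication, and encryption, which this does nothing about (you would want TLS and a password in front of it before exposing it anywhere):

    # Minimal sketch of a home file server reachable from a browser.
    # Serves the current directory read-only over plain HTTP. A real
    # "private cloud" needs TLS and authentication in front of this
    # (e.g., a reverse proxy) -- do NOT expose it to the Internet as-is.
    from http.server import HTTPServer, SimpleHTTPRequestHandler

    ADDRESS = ("0.0.0.0", 8080)  # the port choice is arbitrary

    if __name__ == "__main__":
        httpd = HTTPServer(ADDRESS, SimpleHTTPRequestHandler)
        print("Serving the current directory on http://%s:%d/" % ADDRESS)
        httpd.serve_forever()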

Disclaimer: Everything in this post is my personal opinion and does not represent the view of my employer.

Wednesday, February 27, 2013

Grad students: Learn how to give a talk

I've been to roughly a hundred academic conferences and listened to thousands of talks, mostly by grad students. Over the years, the quality of the speaking and presentations has not gotten any better -- if anything, it's gotten worse. The typical grad student talk is horribly bad, and it's surprising how little effort goes into presentation and speaking skills, especially given how important those skills are for academics.

Grad students need to learn how to give good, clear, compelling presentations. Especially those who think they want to be professors one day.

It is difficult to overstate how important presentation skills are for academics. This is about much more than "being a good teacher" (which is a nice trait to have, but not actually that important for an academic's career in the long run). There is a huge division between the professors who are influential leaders, and those who are also-rans. In almost all cases that I can think of, the professors who are very successful are also good speakers, and good communicators overall. They can give good, clear, funny talks. They can engage in meaningful conversations at a technical level and at a personal level. They have a strong command of English and can use the language effectively to communicate complex ideas. So I claim that there is a strong correlation between good communication skills and overall research impact.

In some sense, a professor's job is to communicate the research ideas being done in their group. Although grad students often give the conference talks, professors give countless other talks at other universities, companies, workshops, and elsewhere. The professors write the grant proposals, and often the papers (or good chunks of them) as well. Once you're a professor, it matters a lot less how good of a hacker you are -- your job is to be the PR rep.

So it's surprising that grad students generally receive no formal training in presentation skills. A typical grad student might get three or four opportunities to give conference talks during their Ph.D., but this is hardly enough practice to hone their skills. Acting as a TA or giving "practice talks" isn't much help either. I honestly don't know how to fix this problem, short of running a course specifically on giving good presentations, which sounds like a drag -- but might be necessary.

The language barrier is a big part of the problem. Students who do not have English as their first language are almost invariably worse at giving talks than those who are native speakers, and students from Asia tend to be worse than those from Europe. (In academic Computer Science, English is the only language that matters.) But it's more than just command of the language -- it's about being expressive, funny, charismatic. The grad student who stands frozen in place and reads off their slides might speak English perfectly well, but that doesn't make them a good speaker.

It's also true that grad students are often "sized up" at conferences based on their speaking skills. If you can give a good talk at a conference, you'll get the attention of the professors who will be looking at your faculty job application later. Likewise, if your talk sucks, it's going to leave a bad impression (or, at best, you'll be forgettable).

So, please, grad students: If you're serious about pursuing an academic career, hone your presentation skills. This stuff matters more than you know.

Startup Life: Three Months In

I've posted a story to Medium on what it's been like to work at a startup, after years at Google. Check it out here.