The Sad Reality of Post-Mature Optimization

EDIT: Some graphics are missing, lost when the blog was abandoned and later revived, but the general idea is that there is seldom one big outlier that you can quickly fix up so that everything will be great. In real projects, and I’ve seen this countless times, significant performance issues infect the entire code base, and often its overall architecture, to the point that little can be done beyond throwing it out and rebuilding.

Donald Knuth famously said that Premature Optimization is the root of all evil. People incant it as a retort to pretty much any and all commentary on runtime performance, grossly misrepresenting or misinterpreting Knuth’s original intention.

When, though, is the mature time to start considering performance?

Most will say the time to optimize is once you’re fairly well along in development: hook a profiler up and find the low-hanging fruit, achieving most of what you would have achieved had you focused intently on performance all along. This is the common wisdom.

After many years and many projects, I must vigorously disagree. It seldom turns out like that.

Here are the function runtimes of an idealized hypothetical program, written with performance as a driving consideration.

ABCDEFG

When we fire up our profiler on our “no premature optimization!” project, we expect to be confronted by something like this.

AAAAAAAAAAAAABBCDEFG

Some quick optimizations of A and we have our ideal solution.

It never actually happens that way. Ever. When performance wasn’t a consideration from day one, inefficiency becomes endemic, corrupting every square mebibyte of the project.

Why not iterate a giant array instead of using hashes? Why not use LINQ everywhere, repeatedly and inefficiently filtering a massively excessive superset of the needed data?
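To make that concrete, here is a hypothetical C# sketch of the habit in question (the Customer type and the numbers are invented for this post, not taken from any real project): re-filtering a full list with LINQ on every lookup versus building a dictionary once.

    // Hypothetical illustration only; the types and data here are invented.
    using System;
    using System.Collections.Generic;
    using System.Linq;

    class Customer { public int Id; public string Name; }

    static class LookupDemo
    {
        // The endemic habit: re-filter the entire list with LINQ on every lookup.
        // Each call is an O(n) scan over data we already hold in memory.
        static Customer FindByScanning(List<Customer> all, int id)
        {
            return all.Where(c => c.Id == id).FirstOrDefault();
        }

        // The same lookup against a dictionary built once: O(1) per call.
        static Customer FindByHash(Dictionary<int, Customer> byId, int id)
        {
            Customer c;
            return byId.TryGetValue(id, out c) ? c : null;
        }

        static void Main()
        {
            List<Customer> all = Enumerable.Range(0, 100000)
                .Select(i => new Customer { Id = i, Name = "Customer " + i })
                .ToList();
            Dictionary<int, Customer> byId = all.ToDictionary(c => c.Id);

            // One call of either is imperceptible; thousands of the first,
            // scattered across the code base, dominate the profile.
            Console.WriteLine(FindByScanning(all, 99999).Name);
            Console.WriteLine(FindByHash(byId, 99999).Name);
        }
    }

Either call looks harmless in isolation. Multiplied across hundreds of call sites, the linear scans are what produce the profile below.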

Most of the time our profile report looks more like the following.

AAAAAAAAAAAAAAAAAAAABBBBBBBBBBBBBBBBBBBCCCCCCCCCCCCCCCCCCCCCCCCC…

It is devastatingly common: if performance doesn’t matter, it really won’t matter, and it is seldom something you can fix with a few updates to a single critical-path function. It infects everything. Moore’s law isn’t there to save you either: the speed of a single core has barely improved in years. Just as nine women can’t produce a baby in one month, unless you’ve invested in parallelization from the start, the latency of a single delivery is bounded by that serial sloth. The fastest render time of a SugarCRM PHP page has a lower bound in the >1s range, even if running on the fastest computer in the world.

In the web app world this traditionally hasn’t really mattered (people just became accustomed to every interaction taking seconds), but responsiveness has become a serious competitive advantage for a few offerings. I love Rdio*, yet if a competitor offered a solution that didn’t have the second-long pauses between every interaction (just navigating between pages), I would seriously consider switching. One of the biggest reasons I abandoned Flickr was its obnoxiously slow navigation and interactions.

Just as you can’t add quality to your code right before shipping, seldom can you significantly improve performance without dramatic rewriting. In the same way, you can’t easily parallelize an application that wasn’t written for it in the first place. These are foundational considerations. I have little more to add beyond what I said four years ago on the topic of premature optimization: people have broadened Knuth’s intention to the point of absurdity, and when they talk about premature optimization, seldom is the context the one Knuth intended.

Update: A torrent of readers has come through, and my meager server’s CPU usage sits at a constant <1%. Hardly miraculous (it should be the norm), but compare it to the countless sites that fall over when they receive any attention at all. When I built this trivial blog engine (EDIT: the wonderfully performant .NET blog engine was later abandoned, largely because I never had the time or motivation to make content management tools), largely for a laugh, efficiency was a principal consideration right at the outset, and it was an architectural concern with every decision. This is actually a mostly unrelated aside (caching happens to be the easiest thing to layer overtop of a terrible solution, so it doesn’t really follow from this post; it is probably more a counterpoint than a proof), but I had to take the opportunity to encourage everyone to be prepared to end up on the front page of various social news sites. It is bizarre to endlessly see “too many database connections” while browsing the top-links collections. That is an issue we’ve known the solution to for a decade and a half.
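For completeness, here is a minimal sketch of the kind of cache layer alluded to above. It is not the code behind this blog, just an assumed-for-illustration in-memory page cache in C#; the point is how little it takes, which is exactly why it is so easy to smear over an otherwise terrible solution, and why falling over with “too many database connections” is inexcusable.

    // Assumed-for-illustration only; not this blog engine's actual code.
    using System;
    using System.Collections.Concurrent;

    class PageCache
    {
        // url -> (time rendered, rendered HTML)
        private readonly ConcurrentDictionary<string, Tuple<DateTime, string>> _cache =
            new ConcurrentDictionary<string, Tuple<DateTime, string>>();
        private readonly TimeSpan _ttl = TimeSpan.FromSeconds(60);

        public string Get(string url, Func<string, string> render)
        {
            Tuple<DateTime, string> entry;
            if (_cache.TryGetValue(url, out entry) && DateTime.UtcNow - entry.Item1 < _ttl)
                return entry.Item2;                    // cache hit: no database touched

            string html = render(url);                 // slow path: hit the database, build the page
            _cache[url] = Tuple.Create(DateTime.UtcNow, html);
            return html;
        }
    }

Usage is a one-liner in front of whatever expensive render path already exists, e.g. cache.Get(requestUrl, RenderPage), where RenderPage is whatever slow page-building function the site already has.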

* – Rdio is a great service, but I had to take a moment to point and laugh at their “native” application that is now available. It, they say, frees you from the browser. It does this by giving you a ClickOnce .NET application that seemingly needs to be updated every time I launch it. In return for the constant updates, it offers an experience identical to the web application, only using far more resources (it regularly consumes 300MB+) while actually seeming slower. I’m not sure what drove that move, but it was very poorly conceived.