Lazy Surveys Enable A Lazy Press

On Monday I posted an entry titled Lies, Damn Lies, and Surveys, the focus being that surveys — especially online surveys — are often of dubious merit, and worse are often intentionally or ignorantly misinterpreted by the press.

Yet it works wonders for both the press and the market research companies: there is a whole industry of lazy tech writers who will run with whatever is sent to them. Any slant they provide merely globs extra Vaseline on the lens of accuracy.

In that case I was spurred by some just-released results of an online survey done by GfK.

Take a look at the Fortune summary of said survey. As is the case with most other retellings, the Fortune writer seemed to have simply rewritten the original Reuters story, which itself said:

The survey found that just 25 percent of smartphone owners planned to stay loyal to the operating system running their phone, with loyalty highest among Apple users at 59 percent, and lowest for Microsoft’s phone software, at 21 percent. Of users of Research in Motion’s BlackBerrys, 35 percent said they would stay loyal.

The figure was 28 percent for users of phones running Google’s Android software, and 24 percent for users of Nokia Symbian phones.

The Fortune story includes a big graph with the title “Plan to stick with your smartphone OS?”. The iPhone towers at 59%, with Android down at a miserable 28%.

Wow, looks pretty rough for Android! For anyone whose work relies upon smartphone trends, this is pretty big news.

So I emailed GfK, and they kindly responded with the same press release that they had sent out to the press. Unfortunately it is still only a summary, including no notes on methodology or the actual questions asked (which can often be very leading), yet it is far better than what most survey companies offer: they simply hide their summaries behind a paywall, enjoying the attention as the press mangles them into something more hit-magnetic.

Here’s the table that was the source for most of these stories. I have to guess (based upon the surrounding wording) that it shows the responses to a question asking current smartphone owners what they’re going to consider when they next upgrade their phone.

Smartphone Ecosystem

                                     Overall  Apple  Nokia      BlackBerry  Windows  Android
                                                     (Symbian)              Mobile
Will stay loyal to smartphone OS       25%     59%     24%         35%        21%      28%
Will stay loyal to smartphone OS,
  but switch handset make               7%                                     8%      16%
I will look at all options             56%     36%     60%         58%        61%      49%
Will switch smartphone OS               6%      1%      8%          *          5%       2%
Don't know                              7%      4%      7%          6%         4%       4%

There’s Android at 28% among these single-choice options. That’s the meat of a lot of easy “news” stories.

Note the second line, though: “Will stay loyal to smartphone OS but switch handset make”. Sum those two lines and for Android suddenly you’re at 44%. Sure, maybe someone with an HTC Evo 4G is looking longingly at a Samsung Galaxy S, or maybe they’re imagining getting that Android-powered PlayStation phone when it comes out, but we (meaning virtually every single press reference to this survey) are talking about the OS here.
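To make the table-reading explicit, here’s a minimal sketch of the arithmetic (values copied straight from the table above):

```python
# The press compared platforms using only the first row. For OS loyalty,
# the two "will stay loyal to smartphone OS" rows must be summed
# (Android values taken from the GfK table above).
loyal_same_handset = 28  # "Will stay loyal to smartphone OS"
loyal_new_handset = 16   # "...but switch handset make"

android_os_loyalty = loyal_same_handset + loyal_new_handset
print(f"Android OS loyalty: {android_os_loyalty}%")  # 44%, not the reported 28%
```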

So Reuters (and any that followed Reuters’ lead) couldn’t manage to achieve grade-school reading ability. I’m serious: my daughter had table-reading assignments in grade 1 that were just like this, a task that many in the tech press would apparently fail.

Extraordinary.

But it gets even better. Among those who answered “I will look at all options” (which ideally would be 100% of respondents), here is what each current platform’s users say they will consider when they upgrade at some point in the future (multiple choices obviously being allowed).

Current Smartphone Owners

                  Overall  Apple  BlackBerry  Nokia      Windows  Android
                                              (Symbian)  Mobile
Apple iOS           53%     85%      46%        47%        43%      38%
BlackBerry OS       33%     22%      74%        34%        41%      22%
Symbian             23%     14%      18%        40%        18%      13%
Windows Phone 7     41%     28%      40%        47%        65%      31%
Android             51%     40%      41%        48%        42%      84%

It paints a rather different narrative than the common, egregiously wrong interpretation. You end up with hilarious ignorance like this nonsense. With what you have learned here, check out the chart here and see if you can figure out what’s wrong.

The iPhone still takes the lead (with loyalty oddly dramatically higher in Germany — the home country of the survey organization — than in any other nation), but look back at the original Reuters statement: “The survey found that just 25 percent of smartphone owners planned to stay loyal to the operating system running their phone…the figure was 28 percent for users of phones running Google’s Android software”.

I suppose that sounds better than the more accurate “44% of Android users have pledged undying allegiance to the operating system, regardless of the endless and unpredictable changes among competing choices. Of the remainder, the vast majority still expect to hop back on the Android train.”

The Underappreciated Art of Duct Tape Programming

Three Kings

In an earlier, more naïve era of my career, I had three software development “heroes”: Jamie Zawinski, John Carmack and Joel Spolsky.

Jamie Zawinski grabbed my attention because he was at the center of the internet revolution, right in the bowels of Netscape from the outset, and had established a pattern of posting surprisingly pragmatic comments that defied convention.

It was extraordinary to read someone openly critical of their own organization, especially without it being retracted or redacted the next day after sobering up or calming down, and where the author didn’t hide behind anonymity.

Jamie let us commoners see the sometimes ugly mechanics behind the curtain. He also revealed a very interesting workplace that was foreign to the gray-walled cube world that most of us lived in.


This was at a time when Microsoft really had almost unthinkable dominance over the industry, so to hear Jamie discuss the travails of cross-platform development was like going out of bounds at a tourist resort. Seeing what the brochure didn’t show you.

An SGI box running IRIX. How exotic!

Another of the kings, John Carmack, was blessed with “F-you money” from the incredible success of some of his earlier projects, along with a proven abundance of intelligence and skills, so he too had the luxury of entertaining a surprisingly realistic and pragmatic perspective. He was a principal driver of the evolution of GPUs and gaming hardware, and you owe thanks for some of the capabilities of your console or dual-GPU rig to his desire to make shooting things in first-person shooters hyper-realistic.

Carmack was also one of the original “bloggers”, regularly posting lengthy “blog entries” by repurposing the UNIX finger facility.

Joel Spolsky is a bit of the odd man out in this trio. While he did have the requisite first initial, he wasn’t known for extraordinary technical acumen (beyond having worked on Excel in some earlier life), but hear me out, please.

Joel ascended to Kingship – at least in my personal hierarchy of industry royalty – just after the dot-com crash, when CMM factory-line initiatives started to become the mythical silver bullet: this was an era awash with articles gushing about the amazing adoption of CMM5 among offshoring firms.

Many organizations were striving to reduce software development to an assembly line of easily interchangeable cogs, both of code and people, achieving a utopia where the process would become perfectly predictable and repeatable if only you filed enough forms.

Joel spoke up for developers when most were absurdly blaming the .COM collapse on dual-monitors, Aeron chairs, and inflated developer egos, as if taking developers down a notch and having them sit on a cold rock would have made selling kitty litter online a good idea.

He was essentially an enlightened pointy-hair blogger, and while I wouldn’t look to his blog for technical advice (Wasabi!), he really understood developers and the process of getting software built. And he was willing to risk his own nest egg and put his money where his mouth was, having since built a reasonably successful company in Manhattan that most of us should be envious of.

Unbound by Convention

What made these three really stand apart in a sea of cheap advice-givers and pundits was that none of them were writing to get a job or even necessarily to keep one. Joel made his own bank while the other two were of such technical esteem that they had little to worry about professionally regardless of what they might say.

They weren’t coerced into rattling off the latest buzzwords and best practices, deferring to the latest silver-bullet, best-practice, pattern-based UML diagramming system or 3NF data warehousing factory built on an n-tier, service-oriented, aspect-oriented, polymorphic framework just to get approving nods from the nervous masses and clueless PHBs.

They didn’t worry about offending a boss with some sacred cow, certain that if only you did it the way she read about on some best-practices blog, everything would be fabulous, at least until that initiative failed and you moved on to the next cure-all.

The three kings were just saying it like they saw it, which was and still is rare.

Eventually Joel ran out of things to talk about and switched his blog to mechanically regurgitated repeats; Carmack got lost endlessly perfecting the noble quest of simulating head shots when he wasn’t reaching for the stars; and Zawinski decided to engage in endless battles with the city of San Francisco over his money sink of a late-night dance club (if you read his blog about DNA Lounge early on, you could almost smell the contractors taking this dotCOMinaire for a ride).

Maligning Metaphors

I was delighted to see Joel return from effective blogging retirement, and my enthusiasm exploded when I saw that it was a post about Zawinski!

A royal duet!

Okay, really Joel was selling a book – like his partner-in-crime Atwood, he seems to be motivated to post by Amazon Affiliate bucks these days, his credibility undermined by that kickback – however he chose the Zawinski chapter as his pitch, talking admiringly about how practical and “get ’er done” (paraphrased) jwz was about his craft, doing so in a present tense that betrays a certain blissful ignorance about Jamie’s career path since.

Joel labeled Jamie the “Duct Tape Programmer”, a description that Jamie took as “damning with faint praise”. Joel has long railed against architecture astronauts, so he seemed to excitedly hold up Jamie as the successful counterpoint.

Perhaps “duct tape” is a bit of a metaphorical overreach, causing many to envision some Tim the Toolman ‘Ar ar ar’ hack. Pragmatic or practical probably would have been more accurate, though it would have made for a less contentious entry.

Never mind that Jamie worked within extremely tight timelines, using technology far less advanced than what we have now.

Joel’s entry raged across the social news sites, with the regular suspects popping out of the woodwork to declare it a grievous offense to all that is good in the world of software development. Lots of blog replies parroting the standard best practices appeared, their authors clearly hoping that their bosses and any future employers would see what studious and diligent worker bees they are.

Who Decides on Best Practices?

The people who are the most certain about software development patterns, practices, and technologies are usually the people who have the least reason to have such certainty.

I’m going to be a bit trollish while I go to the extreme and say that many of the oft-quoted leaders of the field, responsible for much of the unquestioned wisdom-bites, have little to demonstrate why they’re in a position to preach.

The revered Fred Brooks, author of The Mythical Man-Month, came into a position of considerable influence largely by leading a project that was by most accounts a massive failure. That would be fine if there were but one way to fail and he found it for the benefit of all, but there are an infinite number of ways that a project can fail.

Of course you must learn from failures, but my experience has been that the explanations for failure are often a worse-than-useless distractionary tactic: When a team technically fails to accomplish what they set out to do, expect the post-mortem to be full of nonsensical misdirection about how everyone and everything else is to blame.

How many post-mortems include the statement “I grossly overestimated my own capabilities”? I suspect few.

Steve McConnell is another well-known author in the field, revered for his software development books (though many strangely overlook After the Gold Rush, where McConnell responded knee-jerk to the dot-com collapse by advocating an ill-considered licensing system for software developers), but his professional experience seems to be limited to working on TrueType at Microsoft, and some nebulous software development at Boeing, after which he took on the role of telling the world how software should be developed. Now he consults with pointy-hair bosses to unknown outcomes.

Don’t get me wrong, I have both of them on the bookshelf behind me, and I read and greatly enjoy their opinions (Brooks’ observation about second systems is more profound and important, I think, than the over-quoted man-month snippet), but really, let’s keep some perspective and stop quoting them like they’re the incontestable word of truth.

I read them critically and with an open mind, taking them not as the voice of an all-knowing deity, but as the perspectives of a couple of guys drawing from their experiences.


Of course, the esteemed Fred Brooks and Steve McConnell exist in a realm far above most silver-bullet cheerleaders in our industry. These successful authors actually dirtied their hands with actual software development, refactoring their opinions over the years into refined perspectives. I select them merely as “absolute best case” examples.

More commonly the people who most ardently advocate certain practices and approaches have achieved little, usually having nothing to back their conviction but self-interest and a desire to look like they know what they’re talking about, having associated their id with “correct” approaches.

They just clutch onto whatever they hear is proper and start repeating it like a novelty birthday card repeatedly opened. They’ll tell you that you should develop like an ecommerce site, despite not being an ecommerce site; like you’re NASA, despite not being NASA; like you make the software for a pacemaker, despite actually making an eBay auction-sniping tool.

Why do I hear the word “pattern” from mediocre or non-developers more than I hear it from experienced developers, always stated as some sort of conclusive declaration?

Why do we accept that a chimp-level of software development skill is acceptable for maintenance programmers, capable of understanding only the most infantile code that is carefully decorated with “Coding for Dummies” comments?

Why is “We should use UML” the desperate last-ditch fallback of failing teams everywhere?

Unit testing, or the more front-loaded TDD, can be great, but it isn’t a panacea and is an extremely poor substitute for actual craftsmanship.

Moving beyond the non-developers giving their unwanted opinion on how software should be built, the other class of destructive noise is the advocacy of silver-bullet methodologies during the honeymoon period.

Great, you built a sample app on RoR/Haskell/Scheme/Python or whatever else is the cure-all platform that profoundly changed your world view.

Here’s a nickel. Go build a real app then tell us how revolutionary it is now. I don’t discount the advantages, but advocacy based upon toying around is of little use to real projects. Extrapolating it up is foolhardy.

Oh look, another guy telling us how switching to the Dvorak keyboard layout made him regular and made his code smell like cinnamon. Here’s someone saying that they sleep 4 hours a night by taking 20-minute catnaps, proven out over their two-day sample period. This guy says that having a 400×200 single-app screen on a netbook made him a perfectly focused coder. Here’s a dieter who is certain that they’re onto an incredible, beats-the-laws-of-thermodynamics diet now that they’ve followed it for a whole six hours.

The Emperor Has No Clothes!

The fairy tale “The Emperor’s New Clothes” has significant relevance to the software development field. To quote the plot summary from Wikipedia:

An emperor of a prosperous city who cares more about clothes than military pursuits or entertainment hires two swindlers who promise him the finest suit of clothes from the most beautiful cloth. This cloth, they tell him, is invisible to anyone who was either stupid or unfit for his position. The Emperor cannot see the (non-existent) cloth, but pretends that he can for fear of appearing stupid; his ministers do the same. When the swindlers report that the suit is finished, they dress him in mime. The Emperor then goes on a procession through the capital showing off his new “clothes”. During the course of the procession, a small child cries out, “the emperor is naked!” The crowd realizes the child is telling the truth. The Emperor, however, holds his head high and continues the procession.

Too often the software development industry suffers for lack of someone crying out. We often just go along with it, listening to the declarations of non-developers and maintenance programmers as if they speak unquestionable truth, all while discarding any counterexamples as mere aberration (“Well not every team has superstars you know! We aren’t all John Carmack!”).

Everyone withholds the contrary, pragmatic “well, it isn’t quite so cut and dried…” opinion lest they look like a “hack” to a present or future employer or to nervous, cargo-cult-embracing peers, smiling politely while the never-coded, overconfident guy acronym-drops about things he doesn’t understand in the daily stand-up.

The more you know, and the more you’ve experienced, the less obvious the world becomes, and the more you hesitate before offering up opinions. It becomes that much harder to criticize the path of others when it has yielded obvious success.

Opinions come quickly to experts and morons. Few of us are experts.

Jamie Zawinski had unique conditions under which he unquestionably succeeded. Many, with the seeming clarity of hindsight and the ability to project whatever imaginary timeline one desires, will look back and comment on how the codebase got rewritten, purportedly twice, and how eventually the product was squashed, stomped out of relevance by Microsoft (before being reborn as the game-changing Firefox), using that to draw the absurd conclusion that if it were produced “properly” at the outset, today we’d all be using Netscape 9. Then again, maybe it would have followed the disastrous arc of Chandler.

The road that leads to most successful apps is often an ugly, brutish affair filled with compromise and folly, risks taken, detours followed, and shortcuts pursued. That isn’t to justify those compromises, or to diminish alternative approaches, but we should always keep our minds open, being less quick to defensively guard whatever we’re selling as the cure-all this week.

A Call Out For Success Stories

What I’d like to read more about are the success stories, and less about the professional pundits telling everyone how it ought to be done.

Of course here’s where we get into a common paradox that exists in most industries: the successes are usually off enjoying their success and wealth, less inclined to toil away their days writing blog entries extolling their “dart toss” method of architecture. We’re left with the conversation being dominated by the people who don’t actually make software at all, telling us how grand they could make software, if they ever actually did, by following their sure-win magic formula. The discussion boards are overrun with the people who have so little to do that they spend their time describing the ideal way that everyone else should write software.

Parallels can be drawn with the financial world, where the snake-oil salesmen pitching ways to make money are usually doing so because their only way of making money is pitching how to make money (Want to know the secret to making big money? Send me $5, and I’ll tell you that it’s to get people to send you $5 to learn the secret of making big money). The guys actually making the money are off making the money.

This brings us full circle back to Joel’s recommendation of the book. A book that serves as one of the few opportunities we have to really read how projects succeeded, straight from the source.

It’s good if only to let the successes have a voice in the conversation.

The Most Destructive Metric in Software Development

Software development is a difficult task to meter.

It’s not for lack of trying.

For decades consultants have been evangelizing methods which, they claim, would allow an unskilled, casual observer to easily measure and compare productivity in a contextually agnostic way.

Their ultimate goal: To allow a drop-in manager, with only a superficial knowledge of the activities, skills, and complexities of a task or project, to easily compute metrics by which to dole out the frequency and intensity of whippings and rewards.

[Aside: Before anyone incorrectly presumes any of this is critical of software development managers as a group or individually, realize that it is nothing of the sort: I start with a brief analysis of the goal of such simplistic measures — most organizations would like positions, including management, to be lower-skill and easier (cheaper) to fill, and such a simplification of the role is definitely in their interest, just as many dream of the panacea of no-skill, factory-type software development — and then actually question the fact that developers themselves are often guilty of quoting these metrics. 9 times out of 10, developers have only themselves to blame for a lot of the problems with the profession. This is not yet another boring, pandering, us-versus-them war cry like those that frequently top the meme charts.]

February ConsultaMark(SM) ProductoMatrix(TM) Results

Cog      Output   Proposed Action
Tom      117.6    2% Raise At Year End
Amy      111.2    1% Raise At Year End
Jacob     92.7    Forced Overtime
Serene    85.5    Replace LCD with the 14″ VGA monitor from the server room
Nellis    68.0    Creative Dismissal

The same methods — if they worked as promised — could be used to chart project progress (“We’re 7868.2 ConsultaMarks towards the 11273.9 estimated for the entire project!“).

Instead of relying upon the from-the-trenches observations of Randal the development group manager — a grizzled vet of software development who manages with a hands-on style by becoming intricately aware of the domain challenges and unique contributions of each team member — Lynn, the parachuted in middle manager, wants some simple numbers that can be sorted like her mutual fund returns, giving her some available sacrificial lambs when the next diversion-from-massive-executive-fumbles headcount reduction comes due.

Many proposed solutions have come and gone, with the most persistent being the infamous SLOC (Source Lines of Code)/LOC measure.

Source Lines Of Code


SLOC, if you haven’t been afflicted with it, is an easily computed count of the number of lines of code in a given project/component/object (although first you have to agree on the definition of a “line of code”, and this is a point of debate among SLOC champions). It’s often used to count the number of lines of tested, complete code added by a particular contributor (easily accomplished with many source code repositories), allowing for the easy creation of nice little charts like the one above.
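To illustrate just how blunt the instrument is, here’s a minimal sketch of one possible SLOC counter (one definition among many: non-blank lines that aren’t full-line comments), where every parameter shifts the resulting “productivity” number:

```python
import os

def count_sloc(root, extensions=(".py", ".cs", ".java"),
               comment_prefixes=("#", "//")):
    """Naive SLOC: non-blank lines that aren't full-line comments.
    Both the extension list and the comment rule are arbitrary choices,
    and each choice changes the totals being compared."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(extensions):
                continue
            with open(os.path.join(dirpath, name), errors="ignore") as f:
                for line in f:
                    stripped = line.strip()
                    if stripped and not stripped.startswith(comment_prefixes):
                        total += 1
    return total

print(count_sloc("."))
```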

SLOC does have some quasi-legitimate uses: Given a common programming language and domain complexity, SLOC magnitude differences have a moderate correlation with general project size, and at the method level it is a rough indicator of gross complexity (see the article FxCop & Cyclomatic Complexity for a discussion of a loosely related metric, which is the number of intermediate language instructions generated from a method).

Applied at the individual or group level, usually as a cheap substitute for good management and project awareness, SLOC measurements are likely to encourage very destructive behaviors: Copy/paste coding, limited reuse of existing code found elsewhere in the organization and the industry, little motivation to prune code where necessary, overly convoluted coding, motivation for employees to only take on trivial coding tasks, and so on.

The Lemon Slice Lemon Roast

Envision a system that ranked cooks by the number of lemons they use to provide a restaurant’s service each night: You’re going to end up with a lot of dishes featuring copious stacks of lemons, even if ultimately it compromises the quality and organizational health of the establishment. While in some situations you could conceivably roughly compare overall restaurant success by the number of lemons they go through in a period, the comparison only holds true if all else remains equal (e.g. if otherwise the restaurants are very comparable, such as two restaurants serving Thai food): A deli restaurant might use very few lemons despite a healthy customer turnover, where an equally successful Greek restaurant might go through hundreds.

Far more logical would be to measure the number of dishes served – while still imperfect, it would be much more useful than the LemonMetric. There is no measure in software development comparable, at a similar level of granularity, to “dishes served” (don’t even think of mentioning the highly ambiguous “function point” metric as an analogue).

Preaching To The Absentee Choir

“Geez…we all know that there are significant problems with the SLOC metric!” many will inevitably retort. “This is old news. You’re preaching to the choir!”

“…but having said that, I saw a recent article that claimed that the average developer produces {X} lines of vetted code a year. Are they really that slow? My team and I must generate at least 20{X} a month! I hear that some superstars are responsible for 200,000 SLOC a year. They must be awesome!”

Comments just like that are probably being typed into a TEXTAREA at this very moment.


Why do so many comments about productivity — even in the comfort of secret No-PHB hideouts — inevitably elicit gloating commentary about personal SLOC accomplishments? Why do we hear gushing superlatives about the “superstars who push out 100s of thousands of SLOC a year”?

Why do so many in this industry perpetuate this destructive myth?

~SLOC

Let me flip this metric on its head, and state that, if anything, for a certain domain of project, and a certain class of developer, a high rate of SLOC can actually indicate poor programming practices.

In the nascent days of software development, many teams had a compiler or an interpreter and that was pretty much it. They were responsible for building the majority of functionality from scratch. The pace of SLOC creation was tremendous (especially given that much of that implementation was trivial, allowing them to code as fast as they could type. Little time needed to be spent problem solving or planning: it doesn’t require a superstar to code yet another string copy function).

As time went on, organizations compiled volumes of reusable internal code for all of their domain specific problems.

From an individual developer perspective, no longer was it acceptable to simply “run and start coding”. Now you had to spend some of your time learning, assessing, and implementing shared internal code in your projects.

And it wasn’t just in house: The frameworks and libraries provided with our tools have been growing by leaps and bounds, immediately solving a huge range of traditional problems and tasks with well tested, robust, feature rich solutions.

In the industry as a whole, code sharing has become widespread, with excellent code being available for virtually all common (and even uncommon) tasks.

So many solutions are available in the industry and supplied within our libraries/frameworks, that even organizational code reuse can be indicative of a problem.

Yet somewhere out there someone is hand-writing an FTP client implementation. Somewhere developers are wasting a tremendous number of man-hours by poorly, and unintentionally, duplicating code that exists in the frameworks and libraries that they’re already using, or which can be easily found in license compatible open source projects.
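For instance, fetching a file over FTP is a solved problem in most modern stacks; a minimal sketch in Python (the host, credentials, and paths are placeholders):

```python
from ftplib import FTP

# The hand-rolled FTP client someone is writing right now, condensed to
# the standard-library calls it duplicates.
with FTP("ftp.example.com") as ftp:
    ftp.login("user", "password")
    ftp.cwd("/reports")
    with open("summary.csv", "wb") as out:
        ftp.retrbinary("RETR summary.csv", out.write)
```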

Not Invented Here

A part of the reason for this is laziness — it’s a real bother having to look through the documentation and among search engine results, and that’s hardly as much fun as just coding. Another part of the reason is a classic perception flaw that virtually all developers suffer from: Endless optimism about the capabilities and quality of the code we produce — which we always think we’ll finish much quicker than we really will — coupled with an unreasonable pessimism about the applicability or worth of code we could source from another group in the organization, or from an external source. How could it ever compete with our imaginary idealized solution?

I’m often guilty of these failures of perception, as are the overwhelming majority of developers.

Conclusion

Rarely does a developer actually tread across new ground (and I’m certainly not just talking about business back-end “CRUD” developers — even in signal processing, embedded development, game development, and other less common branches of software development, most of the “solution” is the integration of existing work in novel ways, adding an envelope and façade of customization).

For the rest of us, our job is partly to generate the generally small amount of niche-specific code, usually aiming to build it with the most concise — aka minimal — code necessary, with the bulk of our time being in the analysis and integration of the extraordinary volumes of available solutions.

Where niche, custom code is necessary, it will generally be for a non-trivial task, and the SLOC pace will be unavoidably glacial.

For the overwhelming majority of developers in the industry, the only value of SLOC measures is as a warning sign, not an indication of progress.

On [!Pre]Mature Optimization

Joel the Troll?

Joel Spolsky, the well-known blogger and ISV owner, kicked up quite a storm recently with his piece entitled Language Wars.

The article leads off with some pragmatic wisdom, advising enterprise-y, low-risk type shops to use well-known and well-proven technology stacks — solid advice that’s hard to argue with — yet he then ends the piece with a comment about an in-house, next-generation, super-duper language being used to develop FogCreek’s premiere product, FogBugz.

The discord was so great that most readers presumed that the Wasabi thing was a joke, or alternately that the rest of the article was the joke (which would have been an awesome revelation). Much confusion ensued, to the point that Joel had to put up a post clarifying that he was actually serious about the Wasabi thing.

Like Sharks, only with Ruby LASERs On Their Heads!

Aside from the seeming hypocrisy, what really instantiated some JoelCritic<T> instances (via the BlogCriticFactory) were Joel’s comments about Ruby, where he seemingly indicated that it wasn’t ready for prime time.

…but for Serious Business Stuff you really must recognize that there just isn’t a lot of experience in the world building big mission critical web systems in Ruby on Rails, and I’m really not sure that you won’t hit scaling problems, or problems interfacing with some old legacy thingamabob, or problems finding programmers who can understand the code, or whatnot…

…I for one am scared of Ruby because (1) it displays a stunning antipathy towards Unicode and (2) it’s known to be slow, so if you become The Next MySpace, you’ll be buying 5 times as many boxes as the .NET guy down the hall.

I’m sure Joel anticipated the backlash. Perhaps it was even the motivation behind the posting: The resulting torrent of discussion brought quite a few visitors to his blog, and earned him a lot of inbound links, both of which have definitely helped with his new business ventures. No publicity is bad publicity, they say, especially if it’s timed to coincide with the launch of a new job board.

Ruby is still new enough, and with a small enough community, that many of its users double as evangelists — think of the Amiga computer, the BeOS operating system, or any other contextually-superior alternative embraced by a small enough group that many feel an ego-intersection with the technology, motivated to defend and advocate it when the opportunity arises. Linux once had such an attack-dog core of rabid enthusiasts, though as the user base has grown, and it has become more pedestrian, you really have to target a Linux-niche (such as a little used distro) if you’re aiming to stir up a hornet’s nest.

That entire lead-up was just some context for the actual topic of this entry: So-called premature optimization.

On Premature Optimization

A common response to Joel’s complaint that Ruby is slow or resource inefficient is the frequently incanted declaration that such complaints are nothing but “premature optimization!”

I’ve seen the same deflection shield used to defend abhorrent database designs; convoluted, overly-abstracted class designs or message patterns; and virtually anything else where a realist might proactively ponder “but won’t performance be a problem doing it like this?”, only to yield the response “You know, premature optimization is a classic beginner’s mistake!”

If you don’t want to be lumped in with beginners, the lesson goes, it’s best to pretend that performance simply doesn’t matter. We’ll cross that bridge when we get to it.

Premature optimization is the root of all evil (or at least most of it) in programming.

I remember the early days of my software development career: I once spent about 16 work hours optimizing a date munging function, increasing its performance from something like 2 million iterations per second to 4 million. In the grand scheme of things the performance difference was completely negligible, but from the perspective of artificial benchmarks it seemed like tremendous progress was being made.
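A sketch of the trap, with hypothetical stand-ins for that date munger (the absolute numbers will vary by machine; the lesson won’t):

```python
import timeit

def munge_date_v1(s):
    # Straightforward version: split, then convert.
    y, m, d = s.split("-")
    return int(y), int(m), int(d)

def munge_date_v2(s):
    # "Optimized" version: slice fixed positions, skipping the split.
    return int(s[:4]), int(s[5:7]), int(s[8:10])

for fn in (munge_date_v1, munge_date_v2):
    n = 1_000_000
    secs = timeit.timeit(lambda: fn("2010-11-26"), number=n)
    print(f"{fn.__name__}: {n / secs:,.0f} iterations/sec")

# Even if v2 doubles the benchmark number, a function that already ran
# millions of times a second was never the bottleneck.
```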

That was premature optimization.

Indeed, anyone who’s done time in the software development industry can identify with what Mr. Knuth was saying, probably having been involved with (or responsible for) project plans gone awry when efforts focused on highly-complex caching infrastructures, or ultra-optimizing some seldom used edge function.

Yet what is arguable, and situation specific, is deciding what qualifies as premature, versus what is simply proactive, predictive, professional performance prognostications.

NOT ALL PERFORMANCE CONSIDERATIONS ARE PREMATURE OPTIMIZATION!

While there is no doubt that there is such a thing as premature optimization — it is an evil distraction that sidetracks many projects — there are critical decisions made early in a project that can cripple the performance potential (both resource efficiency, and resource maximum), making later optimizations enormously expensive, if not impossible without an entire rewrite.

Whether it’s heavily normalizing the database (or its nefarious doppelgänger, the classic database-within-the-database: “This single table can handle anything! Just put a comma-separated array of serialized objects in each of the 256 varbinary(max) columns! Look at the flexibility! Query it? Don’t you bother me with your premature optimizations!”), creating an application design that’s incongruent with caching, or choosing an inefficient platform.
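To make that doppelgänger concrete, here’s a minimal sketch (SQLite, with hypothetical table names) of the “flexible” catch-all table next to a conventional schema that the engine can actually index and plan against:

```python
import sqlite3

con = sqlite3.connect(":memory:")

# The database-within-the-database: one catch-all table whose opaque
# payload holds serialized, comma-separated objects. No types, no
# constraints, and nothing for an index or the query planner to use.
con.execute("CREATE TABLE stuff (kind TEXT, payload BLOB)")

# The boring alternative: real columns with real types, indexable and
# queryable without unpacking every row in application code.
con.execute("""
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL,
        placed_at   TEXT    NOT NULL,
        total_cents INTEGER NOT NULL
    )""")
con.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
```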

There are credible performance considerations that need to be addressed at the outset, and revisited as development proceeds. It is absolute insanity, and entirely irresponsible professionally, to simply stick one’s head in the sand and hope that some magical virtual machine improvements or subcolumn indexing decomposition and querying technology will occur before deployment, or before the economics of scaling come into play.

And speaking of scaling, the canard that the horizontal scalability intrinsic to most web apps (unless you really screwed up the design — as many people do — and made horizontal scalability impossible) makes the problem a nonissue is absurd: perhaps if your project has a high transaction value then you have the luxury of adding more servers to serve a small number of clients, yet for most real-world projects adding resources is a big, big deal. And it isn’t simply the cost of a low-end Dell 1850: whether you’re colocating or hosting in an expensively rigged corporate server room, the cost of each server is substantial.

You end up in the dilemma that you’re financially (or physically) limited to a set quantity of resources, having to limit or scale back the functionality provided to each user due to the inefficiencies caused by early decisions. “Sorry, we can’t implement those cool AJAX type-ahead lookups because the callbacks would kill our servers – we’re already saturating them with our stack of inefficiency, so there’s no overhead left.”

I think the lackadaisical attitude towards efficiency is a result of experience derived from countless unvisited or seldom-used web apps deployed across millions of PCs, colocated with equally spartanly used peers. When a site sees a dozen visitors in a day, it’s easy to declare that performance is a nonissue nowadays – that it’s only a concern for game programmers and nuclear modelling engineers. Then one day the page gets mentioned on Digg or Reddit or Slashdot or BobOnHardware, and in that potential moment of glory the app falls over and dies, again and again.

None of this really has anything to do with Ruby. Personally I haven’t used it beyond the tutorials, though I do know that it does very, very poorly on the standardized benchmarks. However, it is distressing to see so many people dismiss Joel’s comments (or comments about Python, or Erlang, or XML, or any other technology) as premature optimization.