SunSpider Results Don’t Mean What You Think They Do

Step 1: Get Lots of Links By The Enraged Faithful

Blaze, an Ottawa-based web optimization company, drew from the playbook of easy publicity by publishing the results of a benchmark pitting Android and iOS against each other in a webpage loading deathmatch.

Android was the victor, hitting the finish line (the onload event) an average of 34% faster (2.1 seconds versus 3.2 for iOS).

The test measured the elapsed time until the onload event fired while loading a selection of Fortune 1000 websites, which, as a generalization, tend to be simpler sites that make less use of recent advances in web technology.

More important still, the actual comparison was between the Nexus S and the iPhone 4 as proxies for their respective operating systems. The Nexus S is a Samsung device equipped with the 1GHz Hummingbird processor, and it was pitted against the Apple iPhone 4, which runs a very similar processor at a slower 800MHz. The A4 and the Hummingbird are twins separated at birth, both designed by Intrinsity and Samsung.

All else being equal, I would expect that in this matchup the iPhone 4 would lag behind by a good 25%.
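The percentages above can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted in this post (2.1 and 3.2 seconds, 1GHz and 800MHz). The variable and function names are made up for illustration, and the idea that load time scales inversely with clock speed is a deliberately crude assumption, not something the benchmark measured.

```python
# Figures quoted in the post.
ANDROID_LOAD_S = 2.1   # average time to onload on the Nexus S
IOS_LOAD_S = 3.2       # average time to onload on the iPhone 4
NEXUS_S_MHZ = 1000
IPHONE_4_MHZ = 800


def percent_less_time(fast: float, slow: float) -> float:
    """How much less time the faster device takes, relative to the slower one."""
    return (slow - fast) / slow * 100


def percent_more_time(fast: float, slow: float) -> float:
    """How much more time the slower device takes, relative to the faster one."""
    return (slow - fast) / fast * 100


# The headline number: Android finishes in about a third less time.
observed_less = percent_less_time(ANDROID_LOAD_S, IOS_LOAD_S)

# The same gap seen from the other side: iOS takes roughly half again as long.
observed_more = percent_more_time(ANDROID_LOAD_S, IOS_LOAD_S)

# Clock speed advantage of the Nexus S over the iPhone 4.
clock_advantage = (NEXUS_S_MHZ / IPHONE_4_MHZ - 1) * 100

# ASSUMPTION: load time scales inversely with clock speed and nothing else differs.
clock_only_ios_s = ANDROID_LOAD_S * NEXUS_S_MHZ / IPHONE_4_MHZ

if __name__ == "__main__":
    print(f"Android takes {observed_less:.1f}% less time than iOS")
    print(f"iOS takes {observed_more:.1f}% more time than Android")
    print(f"Nexus S clock advantage: {clock_advantage:.0f}%")
    print(f"iOS load time predicted from clock speed alone: {clock_only_ios_s:.3f}s")
```

Note that "34% faster" and "lag behind by 25%" are measured against different baselines, which is why the script computes the gap both ways before comparing it with the clock difference.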

Of course all else isn’t equal: the operating systems are completely different, and the browsers have greatly diverged from their shared WebKit roots. Yet a number of comparisons of recent incarnations of Android and iOS on similar devices have found their browsing performance converging: in most cases iOS gives the sense of progress quicker (showing partial pages earlier), while Android tends to complete rendering first. Android has made big gains from 2.1 to 2.2, and again from 2.2 to 2.3.

The Android 2.3 browser is fantastic, and is an absurdly responsive platform. Indeed, the whole operating system has gone from being an endlessly halting stop-the-world garbage collection demonstration to a smooth-as-butter, refined UI.

So in this case an Android device with the latest OS on a faster processor comes out on top. No big surprise, so no big controversy, right?

Hardly. Turns out it was a pretty big controversy.

The iPhone Loses? This Shall Not Stand!

These results stirred up a hornets’ nest.

And on, and on, and on. It is pretty extraordinary how reactionary the defensive hordes are whenever Apple or the iPhone comes out the loser. Look at some of the enraged comments below the Blaze pages: this is an absolute embarrassment to the technology world, and these raging defenders are the reason online comments are seldom of value or insight. There have been countless ridiculously flawed surveys covering the Android and iPhone ecosystems, and they are trusted and repeated verbatim so long as the iPhone is the beneficiary.

I think my favourite ridiculous defense came from Michael Gartenberg (the go-to guy if you’re looking to fluff up some empty tech story):

“Consumers don’t buy smartphones by going into the store and timing web-browsing experiences with their stopwatch,” Gartenberg said. “I don’t think this has any impact on the buying public, per se. The headline becomes a lot more misleading than the actual performance because performance gains of less than a second just don’t matter.”

What a silly statement: the desktop browser wars have been waged over milliseconds here and there. Android’s biggest deficiency prior to Gingerbread (2.3) was interface halts that generally lasted on the order of 60ms or so, and this was something that virtually every user rightly complained about. If these results held, and one device imposed a tax of more than one second on each and every page load, that would be very significant.

Benchmarks Are Always Flawed

The benchmark is far from perfect. Immediately I would question the choice of target websites: the top websites that people actually visit, particularly on smartphones, have little correlation with the Fortune 1000, and the former will tend to be far more dynamic and leading-edge than the latter. Alas, most benchmarks are deeply flawed.

The common complaint about this benchmark, however, has been that the study is invalid because it used the embedded browser control. That control obviously overlaps heavily with the full browser code base, if not directly using the same binaries (assuming Apple knows how to develop software properly, which evidence seems to show they do better than most), but it has some features toggled off for various justifiable reasons. This apparently cost it a couple of optimizations, namely Nitro JIT JavaScript compilation and HTML5 async attribute behavior. The latter is something that no more than a minute fraction of those Fortune 1000 sites use, if any; I would place odds on zero of them making use of it.

Says Daring Fireball:

It’s easy to see that Mobile Safari is faster than UIWebView — just run something like the SunSpider benchmark twice, once in Mobile Safari and once in any app from the App Store with a web content view. On my iPhone 4, Mobile Safari runs SunSpider almost three times as fast as an app using UIWebView.

I think there’s a general misunderstanding of the benefits of JIT compilation, especially in the context of a test such as this.

The study also found that despite significant JavaScript performance gains in the latest Apple iOS 4.3 and Google Android 2.3 releases, they made no measurable improvements on page-load times at the sites tested.

Of course they didn’t! I would expect JIT compilation and tracing profilers to be a minor net negative for the performance of simple webpages of the sort you’re going to find on the corporate websites of the Fortune 1000. Only dynamically rendered AJAXy sites like Rdio.com are going to see a “render” benefit from a JIT JavaScript engine. If you’re simply measuring up to the firing of the onload event, on pages where it’s rare for the same piece of JavaScript to ever get executed twice, the return just isn’t there.
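To make that trade-off concrete, here is a toy cost model. Every number in it is hypothetical and in arbitrary units, chosen only to show the shape of the argument. Real engines are more subtle (they typically interpret first and only compile code that proves to be hot), so treat this as a sketch of the break-even idea rather than a description of Nitro or any other engine.

```python
def interpreted_cost(runs: int, interp_per_run: int) -> int:
    """Total cost of interpreting a piece of script `runs` times."""
    return runs * interp_per_run


def jit_cost(runs: int, compile_once: int, native_per_run: int) -> int:
    """Total cost of compiling once up front, then running native code."""
    return compile_once + runs * native_per_run


def break_even_runs(compile_once: int, interp_per_run: int, native_per_run: int) -> int:
    """Smallest number of runs at which compiling is strictly cheaper."""
    saving_per_run = interp_per_run - native_per_run
    if saving_per_run <= 0:
        raise ValueError("native code must be cheaper per run than interpreting")
    return compile_once // saving_per_run + 1


# HYPOTHETICAL costs in arbitrary units, not measurements of any real engine.
INTERP_PER_RUN = 10
NATIVE_PER_RUN = 1
COMPILE_ONCE = 50

if __name__ == "__main__":
    for runs in (1, 2, 6, 100):
        interp = interpreted_cost(runs, INTERP_PER_RUN)
        jit = jit_cost(runs, COMPILE_ONCE, NATIVE_PER_RUN)
        winner = "JIT" if jit < interp else "interpreter"
        print(f"{runs:>3} runs: interpreter={interp:>4} jit={jit:>4} -> {winner}")
    print("break-even at", break_even_runs(COMPILE_ONCE, INTERP_PER_RUN, NATIVE_PER_RUN), "runs")
```

With these made-up numbers, script that runs once costs five times as much under the compile-first path, while script in a hot loop wins overwhelmingly. A page-load test that stops at onload lives almost entirely in the first regime, and a benchmark like SunSpider almost entirely in the second.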

JavaScript Speed Is a Rough Approximation of Overall Performance Focus

Firefox 4, released today and highly recommended, is a vastly improved browser in almost every way over its predecessors. The one thing earning the most attention, however, has been the performance of the JavaScript engine. For computationally intensive JavaScript uses it is a dramatic step forward, representing a huge improvement of an already optimized platform.

Sites that demonstrate the benefit of these JavaScript improvements are very few and far between, however, and are far less prevalent than benchmarks that show dramatic gains.

Go ahead, completely turn off JIT compilation (in about:config, filter on jit and toggle all of the .content items to false) and see what the impact really is. You’ll find that it’s still a shockingly speedy browser, even though it is suddenly a SunSpider weakling.

Addons gain from JIT, though of course that is completely immaterial to an embedded browser control on a mobile device.

Everything in the browser is faster and smoother and slicker, for reasons that have nothing at all to do with the JavaScript engine. Pages render quicker because of a greatly improved, native rendering engine (written in C++). Hardware acceleration, retained layers, native property representations … these are the nothing-to-do-with-JavaScript advances that make it a much better, smoother browser than it was before.

JavaScript performance is, however, often an important indicator of where team attention lies. Chrome really shot to the lead in the JavaScript war while also delivering a slick, speedy, responsive browser: their focus was on speed, and it shone through in both aspects of the browser. It’s a loose correlation, however. Safari on Windows, as a counterpoint, is an atrociously slow pig of a browser that feels like someone toggled your PC’s turbo button off, yet it has an amazingly fast JavaScript engine.

SunSpider results are hugely overemphasized. They will matter more in the future as JavaScript on the sites that we use grows more complex, especially once JavaScript starts getting delivered in a pre-compiled intermediate language form. Still, it’s surprising to see how many intelligent people expected so much from Nitro’s very impressive JavaScript speedups in a simple test measuring basic page load times. Your browser is not written in JavaScript (yet…), so some perspective correction is in order.

Blaze Wins Regardless Of The Criticism

I commend Blaze, however. They played the media and community like a fiddle, and their PageRank will benefit heavily as a result. They do offer an interesting service which many of us were completely unaware of before.