Web Workers and You – A Faster, More Powerful JavaScript World

Web Worker Benchmark – Moonbat

If you’re running Firefox 3.5 or Safari 4 [EDIT: or Chrome 3.0], take a look at the “benchmark”/technology demo I just put up. [Safari 4 compatibility added based upon the great comment submitted by Oliver.]

It’s a modified variant of the SunSpider benchmark that I’ve written about before (in less than flattering terms), which I heavily altered to utilize the remarkable new Web Worker functionality you can now explore in Firefox 3.5. If you’re really analyzing performance, be sure to disable Firebug, as it significantly impacts the results.

Web Workers, a standardization of a feature of Google Gears, are a remarkably simple method of multi-threading JavaScript. The point is not just to get scripts out of the UI thread, where they can be very detrimental to the user experience as the interface freezes while a script runs, but also to scale across multiple CPUs and cores on modern PCs. That may seem a ridiculous notion (“but it’s just JavaScript! Multithreading?”), yet it is becoming a real concern as the JavaScript engines continue to advance and the usage and scope of the language and related technologies continue to expand.

Through a simple, asynchronous message-passing system and a minimalist API, the Web Workers model lends itself to robust, elegant code that isn’t as prone to classic multi-threading pitfalls. While it isn’t part of the current implementations, there is no theoretical reason why web workers couldn’t be located on entirely different machines, given that each worker is essentially an isolated runtime that shares very little (the navigator properties and some basic security info for things like enforcing XMLHttpRequest restrictions) and communicates via serialized messages.
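As a rough sketch of the message-passing model described above (this is not the benchmark’s actual code; the file name worker.js, the message shape, and the function names handleMessage and startWorker are all made up for illustration), the worker-side logic can be kept in a plain function and the page-side wiring reduced to a postMessage call and an onmessage handler:

```javascript
// Worker-side logic, written as a plain function so it can be exercised
// anywhere. Inside a real worker file it would be wired up roughly as:
//   onmessage = function (event) { postMessage(handleMessage(event.data)); };
// The message shape ({ id, values }) is an assumption for this sketch.
function handleMessage(data) {
  // `data` arrives as a copy of what the page sent, never a shared reference.
  var sum = 0;
  for (var i = 0; i < data.values.length; i++) {
    sum += data.values[i];
  }
  return { id: data.id, sum: sum };
}

// Page-side wiring. "worker.js" is a hypothetical file name.
function startWorker(onResult) {
  if (typeof Worker === 'undefined') {
    return null; // this environment has no Web Worker support
  }
  var worker = new Worker('worker.js');
  // Replies are delivered asynchronously, so the UI thread never blocks here.
  worker.onmessage = function (event) {
    onResult(event.data);
  };
  worker.postMessage({ id: 1, values: [1, 2, 3] });
  return worker;
}
```

Because the only contact between page and worker is the message itself, there is no shared state to lock, which is where most of the classic threading pitfalls come from.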

Understanding the Benchmark

The benchmark/technology demo is operational in Chrome, Opera, and Internet Explorer, but only if you change Web Workers to 0. In that case it runs the set of tests sequentially in the main thread, as JavaScript has traditionally been run. I didn’t intend for this to be used for cross-browser comparisons, even if I resort to presenting just such a comparison at the end of this entry. The focus is really on the technology, so the real “power” is seen once you start to turn up the web worker dial, all the way to 11.
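For reference, the zero-worker case amounts to something like the following sketch. It is my own minimal illustration, not the demo’s code: runSequentially, the { name, run } test shape, and the cycles parameter are assumptions.

```javascript
// Runs every test in order on the calling thread and records elapsed time.
// In a browser this blocks the UI until the whole loop has finished.
// Each test is assumed to look like { name: string, run: function }.
function runSequentially(tests, cycles) {
  var results = [];
  for (var i = 0; i < tests.length; i++) {
    var start = Date.now();
    for (var c = 0; c < cycles; c++) {
      tests[i].run();
    }
    results.push({ name: tests[i].name, elapsedMs: Date.now() - start });
  }
  return results;
}
```

Nothing else on the page gets a turn until the function returns, which is exactly the lock-up you see at 0.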

Web Worker multithreading isn’t limited to Firefox 3.5. Oliver left a comment pointing to a Safari-ready variant he put up, so I modified the test accordingly (the difference being that when Safari implemented it, it didn’t intrinsically include JSON encoding, so your caller and receiver had to do that themselves). I didn’t realize that Safari had covered this ground, though it isn’t shocking given how rapidly that browser has been advancing.
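A minimal sketch of that manual encoding step, assuming the channel only carries strings as described above (encodeMessage and decodeMessage are hypothetical helper names, not anything from the demo or from Oliver’s variant):

```javascript
// Sender side: always turn the payload into a JSON string before posting.
function encodeMessage(payload) {
  return JSON.stringify(payload);
}

// Receiver side: parse strings back into objects. Data that already arrived
// structured (from a browser that encodes for you) is passed through as-is.
function decodeMessage(raw) {
  return typeof raw === 'string' ? JSON.parse(raw) : raw;
}

// Usage on either end of the channel would look roughly like:
//   worker.postMessage(encodeMessage({ test: 'bitwise-and', cycles: 10 }));
//   worker.onmessage = function (event) { var msg = decodeMessage(event.data); };
```

Encoding on both ends costs a little time per message, but it lets the same code run whether or not the browser does the encoding itself.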

With one web worker, the UI remains fully responsive to user interaction, which is an experience quite unlike what was seen at 0 (where the browser essentially locks up during the run), and the actual run itself suffers little for the isolation. On a quad-core CPU, the CPU usage over the test cycle hovers around 25%, or one core’s worth.

At two web workers, the individual tests take slightly longer to run, but the overall pace and completion time of the tests as a whole is greatly improved. Not quite a halving of the runtime, but not too far off. Two cores are saturated for the duration of the test.

At three web workers, three of the cores are filled with work, and the total elapsed time improves somewhat, albeit not in proportion to the 50% increase in computation power.

At four web workers, we’ve tapped out the parallelism, and despite all four cores being saturated for most of the duration, the total runtime actually suffers slightly. Going above four doesn’t cost much, but it also brings no real gain (beyond possible algorithmic gains from isolating various parts of the application).
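To make the worker dial concrete, here is one way a fixed list of tests could be spread across N workers. This is an illustrative assumption on my part, not a description of how the demo actually schedules its tests: runWithWorkers is a hypothetical function, and runOne(name, done) is a hypothetical callback standing in for “post this test to a worker and call done when its reply arrives”.

```javascript
// Hands tests out from a shared list. Each worker slot takes the next unrun
// test as soon as it finishes its current one, so no slot sits idle while
// work remains.
function runWithWorkers(tests, workerCount, runOne, onAllDone) {
  var next = 0;          // index of the next test to hand out
  var finished = 0;      // number of tests that have completed
  var assignments = [];  // which slot ran which test, in dispatch order

  if (tests.length === 0) {
    onAllDone(assignments);
    return;
  }

  function dispatch(slot) {
    if (next >= tests.length) {
      return; // nothing left for this slot to pick up
    }
    var name = tests[next++];
    assignments.push({ slot: slot, test: name });
    runOne(name, function () {
      finished++;
      if (finished === tests.length) {
        onAllDone(assignments);
      } else {
        dispatch(slot);
      }
    });
  }

  // A count of 0 is treated as a single lane here.
  var slots = Math.max(1, workerCount);
  for (var s = 0; s < slots && s < tests.length; s++) {
    dispatch(s);
  }
}
```

With a scheme like this, any slots beyond the number of cores just take turns on the same hardware, which fits with the lack of gain past four.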

You can also run a mode where, instead of running a modified js file directly in the worker thread, the code is passed as a string parameter, eval’d into a function reference, and the function is run. There are some interesting observations to be made from this test, such as the lack of TraceMonkey loop optimizations on eval’d code (see bitwise-and in particular, which suffers dramatically when run as an eval’d function relative to running as literal JavaScript). This surprised me, as the eval merely instantiates a function in the current context but doesn’t run it, yet the performance penalty remains because the function was sourced from an eval.
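The string-to-function step looks roughly like this. It is a sketch of the general technique only; functionFromSource and the sample source string are hypothetical and are not taken from the benchmark.

```javascript
// Turns the source text of a function expression into a callable function.
// The wrapping parentheses make eval parse the text as an expression rather
// than as a function declaration statement.
function functionFromSource(source) {
  return eval('(' + source + ')');
}

// Hypothetical test body, sent to the worker as a plain string.
var bitwiseAndSource = 'function (a, b) { return a & b; }';

// eval only creates the function; nothing runs until it is called.
var bitwiseAnd = functionFromSource(bitwiseAndSource);
var answer = bitwiseAnd(6, 3); // 6 & 3 is 2
```

Note that the eval and the call are two separate steps, which is why I expected the function to be optimized like any other once it existed.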

Here are some results for 1-8 threads, running 10 cycles of each test and gathering the total elapsed time in Safari 4 and Firefox 3.5 RC2. This was run on a quad-core Q9400 machine, and of course your mileage will vary. It is evident that Firefox 3.5 uses more of the available processing power as you move past 1 thread, with CPU usage rising through 25%, 50%, 75%, and 100% at 1, 2, 3, and 4 threads respectively, but it doesn’t fully benefit from the additional resources, yielding a greatly diminished rate of return. Safari, on the other hand, started with a considerable lead and pulled further away with each thread up to the optimal 4, really hitting its stride.

Multiple threads in Safari and Firefox 3.5

I’ll add some charts and the like to this entry later, but I just thought I’d drop a line on that demo of a very promising technology that will soon see fairly robust deployment (one huge benefit of Firefox, shared by Chrome and Opera, is that the uptake rate for new versions is extremely high).