Pet Store 2011 – Metrics Instead of Evangelism

A Blueprint-Driven Implementation

Sun’s “Pet Store” eCommerce application, released back in the
early 2000s, was intended to be a blueprint of a best-practice
conforming J2EE application. Microsoft took that blueprint and
created a .NET equivalent, using it to highlight
purported performance advantages of .NET
(http://msdn.microsoft.com/en-us/library/ms954626.aspx).

Much ink was wasted arguing the merits of the comparison, with
various players optimizing and re-engineering the entrants.
Eventually a more equal comparison was created by a J2EE consulting
company (The Middleware Company, which has since been discontinued and
folded into other operations), which published an optimized Java
implementation, while Microsoft did the same with the .NET variant, the two facing off in the cage.
Microsoft’s platform still came up the winner, though many
argued that the Middleware Company acted as a stooge for
Microsoft (http://www2.sys-con.com/main/TMC%20Study%20Rebuttal.pdf).

There were still a lot of complaints, since no one is ever
satisfied with a benchmark unless their favorite wins
(http://blog.yafla.com/Sunspider_Results_Dont_Mean_What_You_Think_They_Do/),
but it was a somewhat fairer fight.

We need to revisit that model with the currently available stack
of technologies. A Pet Store in Ruby on Rails atop MongoDB, versus
one in .NET over SQL Server and one in PHP over MySQL, and so on, all
featuring the same public RESTful interface and APIs, each accommodating
its data needs however its authors see fit.

The point is to empirically measure actual efficiency and performance
rather than simply taking a lot of hot air and evangelism as a
surrogate, which sadly is where we are today.

PHP Is Fast? Since When?

A CEO recently exposed their ignorance to the world in an essay
explaining why they don’t hire .NET programmers
(http://blog.expensify.com/2011/03/25/ceo-friday-why-we-dont-hire-net-programmers/). To say
that it was universally condemned is a fair assessment. Even
amongst anti-Microsoft camps there is a lot of appreciation that
.NET is actually a pretty good platform, definitely holding its own
among the top contenders, and that the author was simply a misled
bigot (who, I suspect, thought that such bigotry would play to the
crowd; maybe it would have in 2005, when Microsoft was the boogeyman
and everything they touched seemed pervaded by evil, but not in 2011,
when they’re the underdogs).

I was pleased with the response he got, but some of the
comments surprised me. Several people remarked in passing on .NET’s
supposedly poor performance relative to PHP, more than a few calling PHP
fast.

PHP is fast? Since when? I’ve been dealing with PHP and .NET
code (among others) in the stack for years, and the one adjective that I would
never use for PHP is fast. With various accelerators and hacks it
can be made workable with a big enough scale-out, but it is not an
efficient platform. Choosing PHP as a general rule means slower
page generation times, and with growth a larger and larger
scale-out need.

It still powers many of the top sites today (and enabled their
growth), so clearly it has a lot going for it, but speed is not one
of those things.

The problem with PHP is that it suffers from a Schlemiel
the Painter inefficiency
(http://en.wikipedia.org/wiki/Schlemiel_the_Painter%27s_algorithm),
with a “start the world” processing
model that, in a reasonably sized application, does an astounding
amount of work for even trivial requests as your application base
grows (see SugarCRM).
It’s for this reason that a “Hello World” PHP demonstration is so
terribly misleading and has little applicability to real-world
use.

PHP offers a platform that is difficult to optimize because of
fundamental implementation decisions early on, where any include of
an include can, at any point in the execution flow, change the
state of the world, making it difficult to devise any strategy that
shares the pre-executed state among requests.
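
To make the shape of that cost concrete, here is a toy sketch in Python (not PHP, and not a benchmark) that simply counts units of setup work under the two models. Every name and number in it (BOOTSTRAP_UNITS, REQUEST_UNITS, the function names) is a made-up assumption for illustration, not a measurement of any real application.

```python
# Toy model only: counts abstract "units of work", not real time.
# BOOTSTRAP_UNITS and REQUEST_UNITS are hypothetical values.

BOOTSTRAP_UNITS = 200   # assumed number of includes/config loads in the app
REQUEST_UNITS = 1       # assumed work for the trivial request itself


def bootstrap():
    """Simulate loading every include; returns (state, work_done)."""
    state = {f"module_{i}": True for i in range(BOOTSTRAP_UNITS)}
    return state, BOOTSTRAP_UNITS


def start_the_world(num_requests):
    """Rebuild the whole application state for every single request."""
    total = 0
    for _ in range(num_requests):
        _state, work = bootstrap()
        total += work + REQUEST_UNITS
    return total


def persistent_process(num_requests):
    """Bootstrap once, then reuse that state across all requests."""
    _state, total = bootstrap()
    total += num_requests * REQUEST_UNITS
    return total


if __name__ == "__main__":
    print("start the world:", start_the_world(1000))
    print("persistent:     ", persistent_process(1000))
```

Under these assumed numbers the first model's work grows with requests multiplied by application size, while the second pays the bootstrap once. That gap is why a “Hello World” page, where the bootstrap is nearly empty, says so little.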

PHP is slow.

Of course there is Facebook’s “HipHop” initiative
(http://developers.facebook.com/blog/post/358/),
which is a code generation utility that uses a subset of PHP as an input, the output being C++ (Wasabi!), which is then compiled into native code. To
do this Facebook had to follow a number of practices that limited
the ability of PHP to sabotage its own performance, making it
lifecycle compatible with such a transformation. The end result is
not PHP, however, and it does not practically carry over to any
other site.

It’s worth noting that several very large sites used C++-authored
ISAPI modules for the bulk of their processing, so it can be,
and has been, done directly many times before.

We Need Some Model Benchmark

When I engaged in the whole NoSQL debate previously
([1] http://blog.yafla.com/Getting_Real_about_NoSQL_and_the_SQL_Performance_Lie/,
[2] http://blog.yafla.com/Getting_Real_about_NoSQL_and_the_SQL_Isnt_Scalable_Lie/,
[3] http://blog.yafla.com/The_Impact_of_SSDs_on_Database_Performance_and_the_Performance_Paradox_of_Data_Explodification/),
one of the primary complaints I had about the NoSQL
movement was the shocking lack of actual empirical metrics. Lots of
broad claims were being made, but remarkably little was actually
demonstrated. Just lots of unsupported claims about performance
that didn’t pass basic skepticism.

Of course it isn’t all about per-request performance or runtime
efficiency. Development efficiency is critically important, and a
product like MongoDB can be a hugely efficient development target.
But isn’t it better to work with the real numbers, making decisions on fact instead of emotion?

It would be ideal to agree to some common interface,
functionality and API patterns of some model applications
(eCommerce Pet Store 5.0, a social news Reddit clone, a Twitter
clone), and then let the evangelists loose to create test-passing implementations in their platform of choice.
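
As a rough sketch of what “common interface plus test-passing implementations” could look like, the Python below defines a tiny interface, one shared conformance check, and a timing loop. Everything in it is hypothetical: the PetStore interface, its two methods, and the in-memory entrant are placeholders I made up, and a real harness would drive each entrant over its public RESTful API rather than in-process.

```python
import time
from abc import ABC, abstractmethod


class PetStore(ABC):
    """Hypothetical common interface every entrant would implement."""

    @abstractmethod
    def add_pet(self, name: str, price: float) -> int: ...

    @abstractmethod
    def get_pet(self, pet_id: int) -> dict: ...


class InMemoryPetStore(PetStore):
    """Stand-in entrant; a real one would wrap an HTTP client for its stack."""

    def __init__(self):
        self._pets = {}

    def add_pet(self, name, price):
        pet_id = len(self._pets) + 1
        self._pets[pet_id] = {"id": pet_id, "name": name, "price": price}
        return pet_id

    def get_pet(self, pet_id):
        return self._pets[pet_id]


def passes_conformance(store: PetStore) -> bool:
    """The shared test an implementation must pass before it is timed."""
    pet_id = store.add_pet("Rex", 49.99)
    pet = store.get_pet(pet_id)
    return pet["name"] == "Rex" and pet["price"] == 49.99


def measure(store: PetStore, requests: int = 1000) -> dict:
    """Time individual calls; report approximate median and 95th percentile."""
    pet_id = store.add_pet("Bench", 1.0)
    samples = []
    for _ in range(requests):
        start = time.perf_counter()
        store.get_pet(pet_id)
        samples.append(time.perf_counter() - start)
    samples.sort()
    return {
        "requests": requests,
        "median_s": samples[len(samples) // 2],
        "p95_s": samples[int(len(samples) * 0.95)],
    }


if __name__ == "__main__":
    entrant = InMemoryPetStore()
    if passes_conformance(entrant):
        print(measure(entrant))
```

The ordering is the design choice that matters: an entrant only gets timed after it passes the same functional check as everyone else, so nobody wins by quietly doing less work.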

The benchmark platforms can then be evaluated for performance,
for efficiency, and for operational resiliency. (Reddit runs a monthly
cloud server charge of something like $35,000, and the result is a
terribly unreliable, often unresponsive site. Efficiency is *hugely*
important even if it seems cheap at the outset to throw hardware at the
problem.) Most importantly, they could be evaluated from a security
perspective, a grossly ignored aspect that has reared its head time and
time again, in some cases destroying organizations because they eschewed
basic security practices.

Demonstrate that your advocated solution serves up pages quickly and efficiently, on a resilient, secure platform, and win the argument.