“Web Apps Suck Because HTTP is Stateless…”

“That design might work for a stateful desktop app, but it
isn’t appropriate for the stateless web.”

“O/RM isn’t appropriate for stateless environments like
HTTP!”

“This component wasn’t made for the stateless environment of
HTTP!”

“…but HTTP is stateless!”

If you’ve done any sort of web development, you’ve probably
heard proclamations like these. You may have even made them
yourself.

But what do they really mean? Do they add any value to the
conversation?

So What Does Stateless Mean Anyways?

Stateless refers to an architecture where each HTTP
request is fundamentally detached from requests that came
before, and unrelated to requests that will follow.

In a stateless world, the browser initiates a TCP connection
(traditionally on port 80, or port 443 if it’s a secure
connection) and then sends some basic commands, such as
the desired document (e.g. /images/coolpicture.jpg),
along with per-request preferences like the user’s desired
language.

With no prior information about the caller – acting
only on the newly generated information in the request (e.g. the
document requested, along with user submitted form values)
– the server sends the results.

> GET /images/coolpicture.jpg

< the binary data for /images/coolpicture.jpg..

After the single request is serviced, the connection is torn
down in this stateless scenario. The goal was to service
each request as quickly as possible, freeing the resource-heavy,
finite-quantity connection to service other callers.

Maximum output with minimum resources.
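
To make that concrete, here is a minimal sketch of a stateless handler in Python. It is only an illustration: the `handle_request` function and the in-memory `documents` mapping are invented for this example and are not how any particular web server is built. The point is that the answer depends on nothing except the request itself.

```python
def handle_request(request_line, documents):
    """Serve one request using only what the request itself contains.

    `documents` is a hypothetical path -> bytes mapping standing in for
    the files on disk. Nothing is remembered between calls.
    """
    method, _, path = request_line.partition(" ")
    if method != "GET":
        return 405, b"method not allowed"
    body = documents.get(path)
    if body is None:
        return 404, b"not found"
    return 200, body


documents = {"/images/coolpicture.jpg": b"...binary data..."}
status, body = handle_request("GET /images/coolpicture.jpg", documents)
```

Call it a thousand times with the same request and you get the same response a thousand times, no matter who is asking or what they asked for earlier.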

This served the early web very well. Mirsky’s Worst of the Web
(http://en.wikipedia.org/wiki/Mirsky%27s_Worst_of_the_Web)
could be served out to thousands of
anonymous consumers with gusto on minimal hardware, fulfilling the
liberal information sharing origins of HTTP.

Stateless In The Non-Internet World

For a historic analogy, think of the 411 telephone service – you
dial the number and establish the connection. You tell the
operator the person whose number you require, and they provide a
number in response. The call is disconnected, freeing the line and
the operator for the next caller.

This is stateless in that the service relies upon no contextual
information preceding the call to provide the service, allowing a
small number of operators and connections to handle a large number
of lookup requests, needing no resources beyond a simple phone
book.

A stateful 411, on the other hand, would be one where you called
411 and left the phone off the hook, maintaining the connection for
perhaps days at a time. With each number lookup request, they would
try to interpret what you really mean based upon the
requests that came before.

“Earlier you asked for a bait store on Main street, and now
you’re looking for a tackle store. I’m going to guess that you
probably want one on or near Main street. The number
is…”

Such a stateful connection wouldn’t even require you to maintain
the call – they could just pull up your records based upon the
calling phone number, immediately having the history of your
interactions to draw from in a stateful manner, regardless of the
transience of the individual call.

Stateful Back In The Internet World

The stateless definition of HTTP was used to contrast
with existing services like telnet and FTP, where a connection
was made over TCP (itself a stateful protocol), after
which a state was maintained and modified from command to
command: whether you were logged in, what directory you were in,
what application was running, and so on.

The state was alive and changing until the connection was
dropped, with a block of server resources dedicated to keeping
alive a world just for you.
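
As a rough sketch of what that per-connection world looks like, consider the toy Python class below. The command verbs are loosely modeled on FTP, but this is not the real FTP protocol (there is no password step and no reply codes), and the `ConnectionState` class is invented purely for illustration.

```python
import posixpath


class ConnectionState:
    """State the server keeps alive for exactly one connection."""

    def __init__(self):
        self.user = None  # nobody is logged in yet
        self.cwd = "/"    # every connection starts at the root

    def command(self, line):
        """Handle one command; the result depends on earlier commands."""
        verb, _, arg = line.partition(" ")
        if verb == "USER":
            self.user = arg
            return "logged in as " + arg
        if self.user is None:
            return "please log in first"
        if verb == "CWD":
            self.cwd = posixpath.normpath(posixpath.join(self.cwd, arg))
            return "directory is now " + self.cwd
        if verb == "PWD":
            return self.cwd
        return "unknown command"


session = ConnectionState()
session.command("USER alice")
session.command("CWD pub")
session.command("CWD images")
current_directory = session.command("PWD")  # "/pub/images"
```

The same "PWD" command gives different answers depending on what came before it, and the object holding that history has to live on the server for as long as the connection does.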

That design worked for those services because connections
were generally “higher value” per request – a long running file
transfer that couldn’t serve many clients anyways, as a function of
the large number of bytes per request; a professor running some
batch jobs; etc.

Bridging the Gap

Most readers will know that almost all websites these days
appear to be stateful.

You log on. It presents data that is specific to you, using
preferences that are individual to you. As you do things, the
environment changes and adapts, incorporating your interactions
into following requests.

This isn’t just an illusion, or a bastardization of the
web: THESE WEBSITES ARE STATEFUL.

So how did the web sneak up and become stateful on everyone?
Well, generally via the magic of cookies (alternately via
URL-appended session identifiers to simulate cookies), an addition
to the HTTP protocol that was first implemented by Netscape
(http://wp.netscape.com/newsref/std/cookie_spec.html)
back in 1994.

A session cookie is often nothing more than a unique identifier
(preferably with enough entropy that users can’t guess each other’s
session identifiers, for instance a randomly generated GUID),
passed to the server on each request, allowing the web server to
tie requests together, building a set of session data to provide
state for a given client. The logon form changes
the home page render, which changes the topic listing,
which changes the calendar selector, which changes the news
view, and so on, with each page having available a set of
stateful information about the client, forming a sort of virtual
“persistent connection” over many individual, seemingly isolated
HTTP requests.
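
Here is a minimal sketch of that mechanism in Python. The `sessions` dictionary, the `handle_with_session` function and the "visited" list are all made up for this example; a real framework would also deal with expiry, cookie attributes and persistent storage. The identifier comes from Python's `secrets` module rather than a GUID, which is one reasonable way to get an unguessable value.

```python
import secrets

# Hypothetical in-memory session store: session id -> per-client state.
sessions = {}


def handle_with_session(path, cookie=None):
    """Tie this request to earlier ones through the session cookie.

    Returns the session id (to be sent back to the client as a cookie)
    together with the state accumulated so far for that client.
    """
    if cookie is None or cookie not in sessions:
        cookie = secrets.token_urlsafe(32)  # hard-to-guess identifier
        sessions[cookie] = {"visited": []}
    state = sessions[cookie]
    state["visited"].append(path)
    return cookie, state


# First request carries no cookie; the second presents the one it was given.
session_id, _ = handle_with_session("/logon")
_, state = handle_with_session("/home", cookie=session_id)
# state["visited"] is now ["/logon", "/home"]
```

Each call is still an isolated request, yet the server sees one continuous conversation because the identifier travels along every time.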

“Ha! Got You! There Isn’t A Constant Connection! So It’s
Stateless!?”

Set aside the fact that in the modern world HTTP connections
are reused: given that a client will often request dozens or more
documents to build a single page – or, in the case of Digg
(http://www.digg.com/), about 37,528 – it was found to be
cheaper to just let the client reuse an established connection for
multiple requests. Even so, people often argue that HTTP isn’t
“stateful” because it doesn’t maintain a constant connection for
the entire session.
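
Connection reuse is easy to observe with nothing but the Python standard library. The sketch below starts a throwaway server on the loopback interface and makes two requests over one `http.client.HTTPConnection`; the `Handler` class, the `fetch_twice_on_one_connection` function and the requested paths are invented for the demonstration. The handler records the client's TCP source port for each request, so two identical ports mean both requests arrived over the same connection.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

seen_client_ports = []  # one entry per request: the TCP source port it used


class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets the server keep connections open

    def do_GET(self):
        seen_client_ports.append(self.client_address[1])
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_twice_on_one_connection():
    """Make two requests over one connection and return the ports seen."""
    seen_client_ports.clear()
    server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
    try:
        for path in ("/page.html", "/images/coolpicture.jpg"):
            conn.request("GET", path)
            conn.getresponse().read()  # drain the body before reusing
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
    return list(seen_client_ports)
```

If the two recorded ports match, the second request rode on the connection the first one built, which is exactly the saving described above.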

Yet what is a connection? In this case it would be TCP, a
“stateful” protocol. TCP is stateful in that it changes based upon
what has happened before, and each packet for the duration of a
connection relies upon those before them getting through okay.

You can establish a connection, let it sit for a while, and
occasionally pass data back and forth.

TCP is stateful in contrast to IP (or its very
light encapsulation, UDP), which deals in individual packets that live or
die by themselves, with no consciousness of packets that came
before, or those that will follow.

But wait, isn’t it TCP/IP? TCP on top of
IP?

Why yes, it is. TCP is fundamentally “IP with cookies”, allowing
it to maintain session state, tying many stateless packets together
into a nice, clean stateful correspondence. This differs little
from HTTP with cookies, which is a fundamentally stateful protocol
in virtually any post-1996 implementation, where sessions and
statefulness are the norm.
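
The “IP with cookies” idea can be sketched as a toy in Python. This is emphatically not real TCP, which identifies a connection by its address and port pairs, numbers bytes rather than packets, and handles acknowledgement and retransmission. The `conn_id` and `seq` fields and the `reassemble` function are simplifications invented here to show how an identifier carried on every packet turns independent packets into ordered conversations.

```python
from collections import defaultdict


def reassemble(packets):
    """Group independent packets by connection id, ordered by sequence.

    Each packet is a (conn_id, seq, payload) tuple. Packets may arrive
    out of order, interleaved with other connections, or duplicated.
    """
    streams = defaultdict(dict)
    for conn_id, seq, payload in packets:
        streams[conn_id][seq] = payload  # a duplicate simply overwrites
    return {
        conn_id: b"".join(chunks[seq] for seq in sorted(chunks))
        for conn_id, chunks in streams.items()
    }


packets = [
    ("a", 1, b"lo"),
    ("b", 0, b"wor"),
    ("a", 0, b"hel"),
    ("b", 1, b"ld"),
    ("a", 1, b"lo"),  # duplicate delivery
]
streams = reassemble(packets)
# streams == {"a": b"hello", "b": b"world"}
```

Swap `conn_id` for a session cookie and the packets for HTTP requests and you have the same trick one layer up.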

The Web Isn’t Stateless!

So why does everyone keep yabbering nonsense about HTTP being
stateless (pedantically true, but practically irrelevant
and entirely misleading)? Why do so many people talk about
the web being stateless in the face of endless contradictory
evidence?

I think it’s just a cop-out: people want to excuse their
crappy web apps – possibly due to laziness or a desire to
migrate back to fat apps – so they cling to the
justification that a fundamental limitation of the platform
restricts their abilities, constrains their design or forces them
into hackish implementations.

In reality, the web that we’ve been developing against for the
past 10 years has allowed tremendous statefulness, including
building up and maintaining enormous quantities of
server-side state for every session (just like a fat app or a DCOM
component). The fact that this isn’t appropriate for a very high
volume, low value-per-transaction anonymous user website should in
no way guide your implementation of a low user
count, very high value-per-transaction vertical market web app.

You have the ability, and the mandate, to do what’s
right for the problem, and no one solution or dogma fits all
web needs.