When doing work that affects the web presentation of projects I’m involved with, I occasionally hop down to the menu item “Validate Local HTML” in Firefox, a function that is available when you have the web development tools installed (you can also access it via Ctrl-Shift-A, and of course you can always run the validator directly, but that seemingly tiny improvement in convenience can dramatically increase how often you use it). In a weak sort of TDD, it is a constant sanity test of at least the fundamental HTML validity of the generated presentation, and I always strive to get it to the rewarding green no-errors-no-warnings state.
Does it really matter, though? Ultimately what really matters is whether the site renders as close to expected as possible in the major browsers, and most of them happily overlook even egregious errors. (Internet Explorer was criticized early on for being so forgiving, but given its dominance the other browsers really had no choice but to allow the same sloppiness. Most web publishers weren’t about to re-engineer their sites just to ensure that they displayed correctly in Opera, for instance.)
Out of curiosity I decided to check some other sites to see how many ensure that their (X)HTML is clean. The following are the results as they stand at this moment, though of course the state will change as content is added or removed (though a clean site is usually clean by intention, with new content automatically filtered to ensure that it stays valid).
- FAIL – http://www.reddit.com – 36 errors as XHTML 1.0 Transitional. EDIT: Rechecked Reddit, and now it’s a PASS
- FAIL – http://www.slashdot.org – 167 errors as HTML 4.01 Strict
- FAIL – http://www.digg.com – 32 errors as XHTML 1.0 Transitional
- FAIL – http://www.cnn.com – 40 errors as HTML 4.01 Transitional (inferred, as no doctype was specified)
- FAIL – http://www.microsoft.com – 193 errors as XHTML 1.0 Transitional
- FAIL – http://www.google.com – 58 errors as HTML 4.01 Transitional
- FAIL – http://www.flickr.com – 34 errors as HTML 4.01 Transitional
- FAIL – http://ca.yahoo.com – 276 errors as HTML 4.01 Strict
- FAIL – http://www.sourceforge.net – 65 errors as XHTML 1.0 Transitional
- FAIL – http://www.joelonsoftware.com – 33 errors as XHTML 1.0 Strict
- FAIL – http://www.stackoverflow.com – 58 errors as HTML 4.01 Strict. EDIT: Rechecked, and now it’s a PASS
- FAIL – http://www.dzone.com – 165 errors as XHTML 1.0 Transitional
- FAIL – http://www.codinghorror.com/blog/ – 51 errors as HTML 4.01 Transitional
- PASS – http://www.w3c.org – no errors as XHTML 1.0 Strict
- PASS – http://www.linux.com – no errors as XHTML 1.0 Strict
- PASS – http://www.wordpress.com – no errors as XHTML 1.0 Transitional
(I searched around for more good examples to put in the PASS category, but sadly they are few and far between.)
Should this be normal?
No, it shouldn’t.
Some of the errors in the mechanically generated HTML are simply inexcusable, and testify to the general level of sloppiness in the web industry.
Check your HTML. Ensure it conforms to the specs it purports to obey, or accept defeat and step back to a less demanding level. With tools like one-keystroke validation and auto-cleanup via HTML Tidy (which is available in module form, allowing you to clean up content mechanically inline in your site code; see this entry for an example of using Tidy from .NET code), there’s simply no excuse.
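As a rough illustration of how cheap an automated check can be, here is a minimal sketch in Python using only the standard library's `html.parser`. It is not a validator and not a substitute for the W3C service or HTML Tidy: it only flags unbalanced tags, and it assumes XHTML-style markup where every non-void element is explicitly closed (HTML 4.01 legitimately lets you omit some end tags, which this would report). The names `TagBalanceChecker` and `check_markup` are made up for this example.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag in HTML 4.01 / XHTML 1.0.
VOID_ELEMENTS = {"area", "base", "br", "col", "hr", "img",
                 "input", "link", "meta", "param"}


class TagBalanceChecker(HTMLParser):
    """Collects unbalanced tags. A sanity check only, not a validator."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of (tag name, line where it was opened)
        self.problems = []   # human-readable messages

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_ELEMENTS:
            self.open_tags.append((tag, self.getpos()[0]))

    def handle_endtag(self, tag):
        if tag in VOID_ELEMENTS:
            return
        if tag not in [name for name, _ in self.open_tags]:
            line = self.getpos()[0]
            self.problems.append(
                f"line {line}: </{tag}> has no matching open tag")
            return
        # Pop back to the matching open tag, reporting anything left open.
        while self.open_tags:
            name, opened_at = self.open_tags.pop()
            if name == tag:
                break
            self.problems.append(f"line {opened_at}: <{name}> never closed")

    def close(self):
        super().close()
        # Anything still on the stack at end of input was never closed.
        for name, opened_at in self.open_tags:
            self.problems.append(f"line {opened_at}: <{name}> never closed")
        self.open_tags = []


def check_markup(markup):
    """Return a list of tag-balance problems found in the markup string."""
    checker = TagBalanceChecker()
    checker.feed(markup)
    checker.close()
    return checker.problems


if __name__ == "__main__":
    for message in check_markup("<div><p>Hello <b>world</p></div>"):
        print(message)
```

Something like this could run over generated pages as part of a build, failing it when the returned list is non-empty, with a real validator still doing the authoritative check against the declared doctype.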
Many will wave off such criticism, declaring that if it renders fine, that’s what really matters. Yet the concern about purity has more to do with the code maintenance process, and with ensuring that an appropriate amount of care is put into the product, in much the same way that you should strive to have 0 warnings in your projects, even if the compiled output works fine regardless. It is the same reason I try (albeit with failures at times) to avoid misspellings and typos, even if the message could be successfully conveyed with them.