As someone who has been teaching HTML for over a decade, I have recollections of students (authoring in Notepad) in total panic because they forgot to close their </table>s, and thus Netscape 3 would deliver a blank page (Draconian error handling has a history too) – I also remember that they only made that mistake once, and never forgot it after that.
Let’s put aside un-escaped ampersands. It’s really too bad that this particular rule exists, but remember that the requirement to escape ampersands has been part of every flavor of HTML going back to HTML 2, so blame the conflict in part on later scripting languages rather than on HTML. Outside of that particular conflict, however, why should we, in 2009, accept second best? (We could relax that particular rule in HTML5 and still demand validation.)
In the early days of the web, all we were doing was marking up text, but today (especially with the advent of HTML5) we are building web applications: sophisticated and complex interactive designs and tools that leverage all sorts of content and content negotiation. How can we on one hand aspire to a “One Web” ideal and yet at the same time accept whatever effluent comes washing down the pipes and hope that it will work? Why should HTML5 (and compliant browsers) need to spend so much time on silly error handling (as opposed to critical failures), when the simpler answer would be: get it right or it simply won’t work? That this statement is heretical to many is a complete mystery to me. Why shouldn’t we expect that, for it to work, you must get it right? That, friends, is the way of the world.
I would venture to guess that the majority of pedestrian user ‘web content’ today is not being hand coded in Notepad, but rather produced via authoring systems/CMSes such as Drupal or WordPress (like this blog) or WYSIWYG tools that handle much if not most of the back-end code beyond the reach of the uninitiated. Why shouldn’t we expect these tools to get it right? And more importantly, why can’t these tools help authors correct their mistakes before ‘publishing’? We have no problem integrating spell checkers into virtually every authoring tool imaginable, yet somehow the notion that we could do the same for syntax validation in those same tools seems unachievable or untenable. Why? I don’t know. Frankly, the rules really aren’t that hard to learn or apply.
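To make the spell-checker comparison concrete, here is a rough sketch of the kind of pre-publish check an authoring tool could run. It is written in Python using only the standard library’s html.parser module; the names TagBalanceChecker and check_markup are my own inventions for illustration, not part of any CMS. It also assumes a deliberately strict rule, namely that every non-void element needs an explicit closing tag, which is tighter than HTML 4.01 itself (where, for example, </p> and </li> are optional). Treat it as a toy, not a validator.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag in HTML.
VOID_ELEMENTS = {
    "area", "base", "br", "col", "embed", "hr", "img", "input",
    "link", "meta", "param", "source", "track", "wbr",
}


class TagBalanceChecker(HTMLParser):
    """Hypothetical helper: records unclosed and stray tags while parsing."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of (tag, line) for elements still open
        self.problems = []   # human-readable messages for the author

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_ELEMENTS:
            self.open_tags.append((tag, self.getpos()[0]))

    def handle_endtag(self, tag):
        if tag in VOID_ELEMENTS:
            return
        if tag not in [name for name, _ in self.open_tags]:
            self.problems.append(f"line {self.getpos()[0]}: stray </{tag}>")
            return
        # Pop back to the matching opener; anything above it was never closed.
        while self.open_tags:
            name, line = self.open_tags.pop()
            if name == tag:
                break
            self.problems.append(f"line {line}: <{name}> is never closed")


def check_markup(source):
    """Return a list of problems; an empty list means the tags balance."""
    checker = TagBalanceChecker()
    checker.feed(source)
    checker.close()
    # Whatever is still on the stack at end of input was never closed.
    for name, line in checker.open_tags:
        checker.problems.append(f"line {line}: <{name}> is never closed")
    return checker.problems


if __name__ == "__main__":
    # The classic mistake: a forgotten </table>.
    for message in check_markup("<table><tr><td>x</td></tr>"):
        print(message)
```

A tool could run something like this when the author hits ‘publish’ and underline the offending line, exactly the way a spell checker underlines a typo.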
As for ‘professional’ web developers – are you really PROUD that you can skate by generating sub-optimal work and can get away with it? How would you feel if your PROFESSIONAL car mechanic did sub-optimal work? Or your PROFESSIONAL plumber? Or your PROFESSIONAL payroll clerk? I’m truly confused – why the resistance to being the best that you can be? Why hide behind “…if it renders on screen that’s good enough?” Is it really?
I think that we need to find a middle ground, one that gets beyond un-escaped ampersands (which, BTW, the W3C validator flags as an error and not a failure), but one that insists that if you want to be an author then good grammar and proper spelling are a baseline requirement, and that valid code is a minimal expectation. After all, every other related ‘language’ used in today’s modern web environment requires some level of validation: CSS rules must be properly declared, and the cascade is important; PHP and ASP syntax is critical for your app to work (“error_reporting = E_ALL & ~E_NOTICE” anyone?); and more esoteric development platforms (Ruby on Rails, Java Struts, etc.) also have relatively strict authoring requirements if you want something to render on screen. The notion that tag soup in HTML is acceptable goes against the precedent of all other CS languages out there today.
Why? Because that’s the way it used to be? That’s hardly a good excuse in my opinion: smoking on airplanes, optional seatbelts in cars, and corporal punishment in primary schools (complete with nuns with rulers whacking knuckles) were all once considered ‘the norm’. Today, we know better, and have modified our society to reflect this knowledge. I posit that, with the move to HTML5, now is a great time to up our game as far as validation is concerned: it will help deliver the “One Web” faster, more efficiently, more predictably and with a lower user-agent overhead. You want tag soup? Here, use this DTD:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">