If you are still using HTML4 for your new websites, you should probably stop. And I’m not saying this because of standards compliance, or some sort of web snobbery. I’m saying this because it is the pragmatic thing to do. HTML5 is just a better standard, and not just because of the fun features you see in the online demos.
While the canvas stuff, local storage and native video embedding are really cool, I think the biggest improvement HTML5 has over its predecessors is the set of new semantic tags you can use to organize your content.
W3C actually studied how modern websites are designed, and saw that most HTML4 content follows a similar pattern. Here is an example of a very basic HTML4 Strict web page skeleton I threw together in five minutes. It could be the start of a blog template, for example:
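A rough sketch of what such a div-based skeleton might look like is below. The #post, .heading and .metadata names come from the discussion in this post; everything else (the #header, #content, #footer and .post-content names, the title, the stylesheet file name and the placeholder text) is an assumption made for illustration.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
    <title>My Blog</title>
    <!-- style.css is a hypothetical file name -->
    <link rel="stylesheet" type="text/css" href="style.css">
</head>
<body>
    <!-- Document header (id is an assumed name) -->
    <div id="header">
        <h1>Welcome</h1>
    </div>

    <!-- Big content div in the middle -->
    <div id="content">
        <!-- A single post; a page listing many posts would need a class here,
             because an id must be unique within the document -->
        <div id="post">
            <h2 class="heading">Post Title</h2>
            <div class="post-content">
                <p>Lorem ipsum dolor sit amet...</p>
            </div>
            <div class="metadata">Posted by Author on some date</div>
        </div>
    </div>

    <!-- Page footer (id is an assumed name) -->
    <div id="footer">
        <p>Footer text goes here</p>
    </div>
</body>
</html>
```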
Note the structural elements in the body of the page. We have a document header on top, a page footer on bottom and a big content div in the middle. Inside said content div we will commonly have “post” or “article” divs, which contain more structural elements: post heading, post content, post metadata.
All of this is accomplished via the magic of the all-purpose div and span tags. This works well, and is actually what you are supposed to do in HTML4, but it gives you messy code. You are mostly using id and class attributes to give shape to your structure, and unfortunately they are only visible on the opening tag. It is not uncommon to see websites which include snippets like this:
Or rather, it is common to see something like this if you wrote the code yourself, or are working with a good-guy web designer. Chances are you will see something more like this in the code you inherit from other people:
Because, you know, someone broke the layout and just started closing div tags at the bottom until things stopped being broken. This is less a functional problem than an aesthetic mess.
Not to mention that while everyone seems to follow this header/footer, content, articles type of layout, there are no rules on how you should implement it. We all just make up our own id and class names. For example, the navigation section can be called “nav”, “navigation”, “site-links”, “sidebar”, “menu”, “buttons” or just about anything. The main content area may have an id that says “main”, “content” or maybe even “content-area”.
When you are reading a CSS file for an HTML4 page you have to guess whether a rule using the #foot selector refers to the page footer, a post footer or maybe something else entirely.
HTML5 more or less makes that sort of messiness go away. Observe the same skeleton written in proper HTML5. Can you spot the difference?
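As an illustration of the kind of snippet meant here, this is a sketch of nested divs whose closing tags are annotated with comments, because a bare closing tag gives no hint about which block it ends. The id and class names are hypothetical.

```html
<div id="content">
    <div id="post">
        <div class="post-content">
            <p>Lorem ipsum dolor sit amet...</p>
        </div><!-- end .post-content -->
    </div><!-- end #post -->
</div><!-- end #content -->
```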
[Listing: HTML5 page skeleton with a “Welcome” page heading and one article titled “Post Title” whose body reads “Lorem ipsum dolor sit amet...”]
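One way such an HTML5 skeleton might be written is sketched below. It is a reconstruction under assumptions, not necessarily the exact markup this post was written around: the title, stylesheet file name and footer text are placeholders.

```html
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="utf-8">
    <title>My Blog</title>
    <!-- style.css is a hypothetical file name -->
    <link rel="stylesheet" href="style.css">
</head>
<body>
    <!-- Page header: no id needed, the tag itself says what it is -->
    <header>
        <h1>Welcome</h1>
    </header>

    <!-- Content area holding one or more articles -->
    <section>
        <article>
            <header>
                <h2>Post Title</h2>
            </header>
            <p>Lorem ipsum dolor sit amet...</p>
            <!-- Post metadata lives in the article's own footer -->
            <footer>Posted by Author on some date</footer>
        </article>
    </section>

    <!-- Page footer -->
    <footer>
        <p>Footer text goes here</p>
    </footer>
</body>
</html>
```

Every closing tag now names the block it ends, so the "end of post" style comments are no longer needed.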
Note how this example drastically cuts down the number of id and class attributes, instead relying on semantically meaningful tags. Nearly every page has headers and footers, and now we have tags designed specifically for them. Blogs and news pages always need to box blocks of info into visually separate units, and now they can do that using the <article> and <section> tags. Navigation sidebars and pull-down menus are a web staple, and navigation blocks are now standardized as <nav>. There is also a <menu> tag, though it is meant for lists of commands such as toolbars rather than for site navigation.
Hell, HTML5 even includes a <main> tag to replace the ubiquitous <div id="main">.
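A minimal sketch of the <main> tag standing in for the old main div is shown below. The surrounding page content is placeholder text chosen for illustration.

```html
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="utf-8">
    <title>Main element sketch</title>
</head>
<body>
    <header>
        <h1>Welcome</h1>
    </header>

    <!-- Takes the place of <div id="main">; a page should have only one visible <main> -->
    <main>
        <article>
            <h2>Post Title</h2>
            <p>Lorem ipsum dolor sit amet...</p>
        </article>
    </main>

    <footer>
        <p>Footer text goes here</p>
    </footer>
</body>
</html>
```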
The CSS you write will actually benefit from these changes. Let me give you an example of how I would probably write the stylesheet for my HTML4 example:
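A sketch of the sort of flat, id-and-class-driven stylesheet this describes follows, written as a <style> element for a div-based page. The selectors other than #post, .heading and .metadata, and all of the property values, are assumptions chosen for illustration.

```html
<style type="text/css">
    /* Flat selectors: nothing here says which block contains which */
    #header   { border-bottom: 1px solid #ccc; }
    #content  { width: 600px; margin: 0 auto; }
    #post     { margin-bottom: 2em; color: #333; }
    .heading  { font-size: 1.5em; }
    .metadata { font-size: 0.8em; }
    #footer   { border-top: 1px solid #ccc; }
</style>
```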
Essentially it's a mess of selectors on id and class attributes. Note how I didn’t even bother to try narrowing down my selectors. For example, the smart thing to do would be to use a #post .heading selector for my post headings, as it would both improve readability of the CSS and ensure that this particular rule only gets applied within the post body. But… I’m lazy, and I know that I won’t use the .heading class anywhere else, so I won’t bother. This will likely create some headaches down the road, but who the fuck cares, right?
Compare this to the selectors I would write for my HTML5 example. Note how, now that I got rid of the id and class attributes, I actually have to explicitly specify parent-child relationships, and how this improves readability:
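The tag-based equivalent might look something like the sketch below, again as a <style> element. The property values are illustrative assumptions; the point is that each selector spells out where in the page structure the rule applies.

```html
<style>
    /* Page-level header and footer are direct children of body */
    body > header     { border-bottom: 1px solid #ccc; }
    body > footer     { border-top: 1px solid #ccc; }

    section           { width: 600px; margin: 0 auto; }

    /* color set on article is inherited by its header and footer */
    article           { margin-bottom: 2em; color: #333; }
    article header h2 { font-size: 1.5em; }
    article footer    { font-size: 0.8em; }
</style>
```

The child combinator (>) keeps the page header and footer rules from leaking into the header and footer nested inside each article.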
I can glance at this and know that the styles from article are going to cascade down to article footer. There was no easy way of establishing that sort of relationship between #post and .metadata in the previous stylesheet because I was lazy and I didn’t bother.
This is what HTML5 brings to the table: improved readability of code and CSS, meaningful semantic tags and better, more intuitive ways of structuring your pages.
Nice post! Coming from a LaTeX world I was always amazed by the absence of more useful tags in HTML, and did not really want to learn proper webdesign. But with this, I’ll have a look at it again.
@ Joscha:
Well, there was always this idea that HTML should be mostly for structure and not for styling. So technically you should be able to lay out a basic page with just heading tags, divs and paragraphs and then use CSS to make all of that look presentable.
I guess the lack of more semantic tags was in part due to the fact that when the original HTML specs were drafted, we did not know how intricate websites would become in the future. That, and IE6 sort of thwarted any progress on the web for over a decade. HTML5 is a big step forward, toward a cleaner and more semantic web.