HTML5: New HTML5 semantic tags

Note: A German version of this article is available at amazingweb

In the early days of web development, HTML was used to define the content, structure and appearance of a website.
With the introduction of CSS, there was a migration to a kind of Model-View pattern, where CSS was responsible for the view (the appearance) and HTML only for the model (the content and structure).
With JavaScript (and especially with the many JavaScript libraries currently available), this has gradually evolved into a Model-View-Controller pattern, in which JavaScript has assumed the Controller role.
But the blending of content and structure remained. The problem is not only that both content and structure are defined in HTML, but that there was no real way to separate or distinguish them.

The structural aspect of HTML, before HTML5, was usually reduced to a tree of div and span tags. This is nothing more than a grouping of parts of the site, which mostly results from the need to address these parts separately in CSS or JavaScript. So it was an organization of content driven by presentation or controller needs, not a semantic structuring.

HTML5 now addresses this very issue. HTML becomes semantic. Specifically, it means that in HTML5 the different logical parts of a web page can now also be defined with dedicated HTML tags. Most websites have such a structure (or a relatively similar structure):

web site structure

Such a structure is typically encoded as:

<body>
    <div id="header">
        <h1>Title</h1>
        <h2>Subtitle</h2>
    </div>
    <div id="navigation">
        <div id="menu_item1" class="menu_item">...</div>
        <div id="menu_item2" class="menu_item">...</div>
    </div>
    <div id="content">
        <div id="post1" class="post">...</div>
        <div id="post2" class="post">...</div>
        <div id="post3" class="post">...</div>
        <div id="post4" class="post">...</div>
    </div>
    <div id="sidebar">
        <div id="widget1" class="widget">...</div>
        <div id="widget2" class="widget">...</div>
        <div id="widget3" class="widget">...</div>
        <div id="widget4" class="widget">...</div>
    </div>
    <div id="footer">
    </div>
</body>

So there was already a kind of semantic structuring of web pages before HTML5. But it was not standardized: everyone could do it in a different way, so it was not possible to parse it universally.
The only truly structural tags in HTML4 were the heading tags (h1 to h6). Unfortunately, they were often treated as formatting tags, even if their interpretation by search engines slowly made them semantic again.

New tags were defined in HTML5 to allow replacing most of the div container structure and encode web pages using a standardized page structure. In HTML5, the same page would look like this:

<body>
    <header>
        <hgroup>
            <h1>Title</h1>
            <h2>Subtitle</h2>
        </hgroup>
    </header>
    <nav>
        <div id="menu_item1">...</div>
        <div id="menu_item2">...</div>
    </nav>
    <section>
        <article>...</article>
        <article>...</article>
        <article>...</article>
        <article>...</article>
    </section>
    <aside>
        <section>...</section>
        <section>...</section>
        <section>...</section>
        <section>...</section>
    </aside>
    <footer>
    </footer>
</body>

The nav tag can either be positioned within the header tag or outside. It depends on how exactly one understands the header tag.
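
As a minimal sketch of the first option, the navigation could sit inside the header like this. The link targets and labels are placeholders chosen for illustration, not taken from the page above.

```html
<body>
    <header>
        <h1>Title</h1>
        <!-- Navigation treated as part of the page header -->
        <nav>
            <a href="#home">Home</a>
            <a href="#archive">Archive</a>
        </nav>
    </header>
</body>
```

Keeping nav outside header, as in the listing above, is equally valid; the choice only expresses whether the navigation is considered part of the introductory content.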

The section tag is relatively similar to the div tag. The difference is that a section also signals that the contents it groups are thematically related, which was not necessarily the case with a div.
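
A small illustration of the distinction, with made-up content: the section groups a heading with the text that belongs to it, whereas the div exists only so that a stylesheet or script can address two unrelated blocks together. The class name two_columns is a hypothetical styling hook.

```html
<!-- Thematic grouping: the heading and the paragraphs form one topic -->
<section>
    <h1>Opening hours</h1>
    <p>Monday to Friday: 9:00 to 18:00</p>
    <p>Saturday: 10:00 to 14:00</p>
</section>

<!-- Pure grouping for layout: the children share no common theme -->
<div class="two_columns">
    <section>...</section>
    <aside>...</aside>
</div>
```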

The header tag can of course also be used inside an article or section tag to define the title of an article or the name of a widget. It is also interesting that in modern browsers the appearance of h1 to h6 tags differs depending on the nesting level at which they are defined. The following:

<body>
    <section>
        <header>
            <h1>Articles</h1>
        </header>
        <article>
            <header>
                <h1>Article</h1>
            </header>
            <section>
                <h1>Section</h1>
            </section>
        </article>
    </section>
</body>

Looks like this:

Different H1 sizes

These heading tags once again carry semantic meaning and are no longer pure formatting elements.
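
The size differences come from the browser's default stylesheet, which has typically included rules keyed on how deeply an h1 is nested in sectioning elements. The following is a simplified sketch of such rules for the three levels in the example above. The exact selectors and values differ between browsers and versions, so treat the numbers as approximate.

```html
<style>
    /* Simplified approximation of browser default rules, not an exact copy */
    h1 { font-size: 2em; }

    /* One sectioning element deep: "Articles" in the example */
    section h1, article h1, aside h1, nav h1 { font-size: 1.5em; }

    /* Two levels deep: "Article" */
    section article h1 { font-size: 1.17em; }

    /* Three levels deep: "Section" */
    section article section h1 { font-size: 1em; }
</style>
```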

This new HTML5 structure is nice and semantically meaningful. But it has one drawback: older browsers do not understand it. So the two structures are often combined. Older browsers simply ignore the new tags they do not know and see only the old structure. Newer browsers see both structures, which is ugly (but if you must support older browsers, you’re used to compromises anyway). Such a combined structure would look like this:

<body>
    <header>
        <hgroup>
            <div id="header">
                <h1>Title</h1>
                <h2>Subtitle</h2>
            </div>
        </hgroup>
    </header>
    <nav>
        <div id="navigation">
            <div id="menu_item1">...</div>
            <div id="menu_item2">...</div>
        </div>
    </nav>
    <section>
        <div id="content">
            <article>
                <div id="post1">...</div>
            </article>
            <article>
                <div id="post2">...</div>
            </article>
            <article>
                <div id="post3">...</div>
            </article>
            <article>
                <div id="post4">...</div>
            </article>
        </div>
    </section>
    <aside>
        <div id="sidebar">
            <section>
                <div id="widget1">...</div>
            </section>
            <section>
                <div id="widget2">...</div>
            </section>
            <section>
                <div id="widget3">...</div>
            </section>
            <section>
                <div id="widget4">...</div>
            </section>
        </div>
    </aside>
    <footer>
        <div id="footer">
        </div>
    </footer>
</body>

Of course, with such a structure, it is fair to ask whether it even makes sense to use the HTML5 semantic tags. There are basically two cases:

  • If you do not have to support older browsers (lucky you!), you can simply replace the old div tags with the new semantic tags.
  • If you do, the combined structure is more work and legibility suffers. From a functional perspective, however, you lose nothing compared to the old structure, and the new HTML5 tags will slowly play an ever-increasing role for search engines. With them, search engines could ensure that visitors arrive mostly because of the content on the page and not because of something in the sidebar. The bounce rate would thus decrease.
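
For the first case, replacing the div containers also means updating any CSS or JavaScript that addressed them by id. A minimal sketch, assuming the old stylesheet selected the ids used in the div-based listing (#header, #sidebar, #footer); the property values are invented for illustration.

```html
<style>
    /* Before: selectors tied to the ids of the div containers
       #header, #footer { background: #eee; }
       #sidebar         { float: right; width: 30%; } */

    /* After: the same rules addressed through the semantic elements */
    body > header, body > footer { background: #eee; }
    body > aside                 { float: right; width: 30%; }
</style>
```

The child combinator keeps these rules from also matching a header or footer nested inside an article or section.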

Moving to the new HTML5 tags is recommended in the long term in any case, and is not a huge effort on the technical side.

6 thoughts on “HTML5: New HTML5 semantic tags”

      1. Of course you are right. What I meant is:
        HTML is the DSL for content presentation, but the semantic tags you list are specific to building a page with content, navigation, etc.

        1. Since most of these tags basically replace some of the DIV jungle present in many web sites, on one side it can be seen as a simple enrichment of the language: imagine a language (which might sound familiar to you) that has no dedicated word for a plane, so people just call it a flying thing. With no word for a lighter, they’d call it a fire thing. So you’d just keep talking about different classes of things. This is HTML 4, with DIV tags having different classes or IDs to differentiate them. It gets worse when some people say a fire thing, others a firing thing and others a fiery thing. Then one day some smart guys decide it’s time to define a proper word for that thing and call it a lighter. Here comes HTML5.

          Of course, on the other side, this semantic extension in HTML5 targets sites that have things like a header, navigation, articles… This applies to many sites but probably not all. So in that respect it could be seen as a DSL for a subdomain of the domain addressed by HTML.
