
Semantic HTML for Web Content

With a little bit of effort, we can make our markup more meaningful.

But why put in extra time and resources to implement semantic HTML? Most users don’t read our HTML. And they only care about what’s on the screen.

Semantic HTML is really just for machines. They aren’t as smart as you and me, so we need to help them out.

Search engines are one example of machines that benefit from semantic HTML. When search engines index our site, they interpret the content of our web pages based on our markup.

This is what Google says about using semantic HTML (emphasis is mine):

“Google (and other search engines) can use that data to index your content better, present it more prominently in search results, and surface it in new experiences like voice answers, maps, and Google Now.” (Source: Promote Your Content with Structured Data Markup)

Social media web services like Facebook, Pinterest, and Twitter love semantic markup, especially when our users share our content on them. These services pull parts of our articles to display on their platforms. If we use semantic HTML, they’ll be able to do a better job.

Language-translation tools examine our markup so they can convert our articles to another language. Good HTML markup can result in more accurate translations. For example, there are subtle distinctions between American English and British English. People can usually handle dialectal and idiomatic differences with ease, but machines might not.

Semantic HTML also enhances web accessibility. Assistive technologies like screen-reading software parse and interpret our HTML. With semantic HTML, people who rely on these technologies will be able to read and navigate our articles more easily.
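
To make that concrete, here is a rough side-by-side sketch. The class names in the first block are hypothetical, and how much benefit a given screen reader gets will vary, but the general idea is that the second block exposes an article and a heading that assistive software can announce and jump between, while the first exposes only anonymous containers.

```html
<!-- Generic markup: nothing tells a machine which part is the headline -->
<div class="post">
  <div class="post-title">The Article's Headline</div>
  <div class="post-body">The main body of the article goes here.</div>
</div>

<!-- Semantic markup: the article and its heading are part of the structure -->
<article>
  <header>
    <h1>The Article's Headline</h1>
  </header>
  <p>The main body of the article goes here.</p>
</article>
```

Both blocks can be styled to look identical on screen; the difference is only in what the markup says about the content.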

That’s just the tip of the iceberg. There are a gazillion other machines that look at our HTML and try to understand it. Heck, the Internet’s made up of a bunch of machines. They’re a big part of the Web. We should try our best to feed them more meaningful data.

OK so, by now, I’m hoping you’re on board. Now you want to use semantic HTML. Maybe on your blog. Or in a CMS development project.

Check out the boilerplate below.

HTML Template

Here’s a semantic HTML template for web content. It’s a good starting point/boilerplate. Just fill in the blanks. It’s general enough so that it can work on many types of textual content. Blog posts, news articles, essays, and so on.

Update: This template was revised because an earlier version used the main and summary HTML elements incorrectly. See the comment below.

<!DOCTYPE html>
<html itemscope itemtype="https://schema.org/Article" lang="" dir="">
  <head>
    <title itemprop="name"></title>
    <meta itemprop="description" name="description" content="">
    <meta itemprop="author" name="author" content="">
  </head>
  <body>
    <article>
      <header>
        <h1 itemprop="headline"></h1>
        <time itemprop="datePublished" datetime=""></time>
        <p><a itemprop="author" href=""></a></p>
      </header>
      <div itemprop="about"></div>
      <div itemprop="articleBody">
        <!-- The main body of the article goes here -->
      </div>
      <footer>
        <time itemprop="dateModified" datetime=""></time>
        <section itemscope itemtype="https://schema.org/WebPage">
          <!-- Section heading -->
          <h2></h2>
          <p><a itemprop="relatedLink" href=""></a></p>
        </section>
      </footer>
    </article>
  </body>
</html>

The HTML markup template uses semantic HTML elements (i.e. article, header, and footer).

It also uses structured data markup from Schema.org, particularly the Article and WebPage schemas. Schema.org is a joint project launched by Google, Microsoft (Bing), and Yahoo!, with Yandex joining later. A goal of the project is to give search engines a way to better understand our content.

Example

Here’s a filled-out example:

<!DOCTYPE html>
<html itemscope itemtype="https://schema.org/Article" lang="en" dir="ltr">
  <head>
    <title itemprop="name">Article's Web Page Title</title>
    <meta itemprop="description" name="description" content="Short description of the article.">
    <meta itemprop="author" name="author" content="Author Name">
  </head>
  <body>
    <article>
      <header>
        <h1 itemprop="headline">The Article's Headline</h1>
        <time itemprop="datePublished" datetime="1990-11-12">November 12, 1990</time>
        <p>By <a itemprop="author" href="#author-profile.html">Author Name</a></p>
      </header>
      <div itemprop="about">Summary of the article. This could be the lead, excerpt, abstract, or introductory paragraph.</div>
      <div itemprop="articleBody">
        <p>The main body of the article goes here.</p>
      </div>
      <footer>
        <p>This article was updated on
          <time itemprop="dateModified" datetime="2015-03-30">March 30, 2015</time></p>
        <section itemscope itemtype="https://schema.org/WebPage">
          <h2>Related Articles</h2>
          <p><a itemprop="relatedLink" href="#related-article.html">A Related Article</a></p>
          <p><a itemprop="relatedLink" href="#related-article-02.html">Another Related Article</a></p>
        </section>
      </footer>
    </article>
  </body>
</html>


Details

Let’s talk about the various parts of the HTML template.

Specifying Content Type, Language and Text Direction

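The html element has four attributes:

itemscope: Declares that the element and its descendants describe a single item.
itemtype: The URL of the schema that defines the item. In the template, that's https://schema.org/Article.
lang: The language of the content, such as en for English.
dir: The text direction, such as ltr (left to right) or rtl (right to left).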
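
For a concrete picture of the lang and dir values, here is a minimal sketch of a British English page that quotes one French phrase. The title and sentence are made up for illustration, and the charset line is an addition that the template above doesn't include.

```html
<!DOCTYPE html>
<!-- lang="en-GB" declares British English for the whole document -->
<html itemscope itemtype="https://schema.org/Article" lang="en-GB" dir="ltr">
  <head>
    <meta charset="utf-8">
    <title itemprop="name">Colour Theory Basics</title>
  </head>
  <body>
    <article>
      <h1 itemprop="headline">Colour Theory Basics</h1>
      <!-- An inline lang attribute overrides the document language for one phrase -->
      <p>The effect is known as <i lang="fr">trompe-l'œil</i>.</p>
    </article>
  </body>
</html>
```

A page in a right-to-left language would use the same attributes with different values, for example lang="ar" and dir="rtl".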

Semantic HTML Structure

To structure our content meaningfully, we use the following HTML elements according to their W3C specs.

| Element | Description |
| --- | --- |
| article | The article element represents a discrete piece of content that can stand on its own. In the boilerplate, it houses all of our visible content. |
| header | Introductory content is best structured as child elements of a header element. In the context of articles, introductory content can be the article’s headline, date of publication, and the author’s name. |
| footer | “A footer typically contains information about its section such as who wrote it, links to related documents, copyright data, and the like.” (4.3.8 The footer element) |

Figure: BBC lead summary. The BBC uses lead sentences on all of its articles.

Structured Data

The boilerplate uses microdata to reinforce our semantic HTML structure.

And, if you’re concerned about using the new HTML5 elements, then you can replace them with well-supported elements like divs and spans and still be able to provide semantic information with microdata.
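
As a rough sketch of that fallback, here is the filled-out example's header and body rewritten with only div and span elements. One assumption to flag: without the time element there is no datetime attribute, so this version carries the machine-readable date in a meta element with a content attribute. That's a common microdata pattern, not something the template itself does.

```html
<!DOCTYPE html>
<html itemscope itemtype="https://schema.org/Article" lang="en" dir="ltr">
  <head>
    <title itemprop="name">Article's Web Page Title</title>
  </head>
  <body>
    <!-- This div stands in for the article element -->
    <div>
      <!-- This div stands in for the header element -->
      <div>
        <h1 itemprop="headline">The Article's Headline</h1>
        <!-- The span shows the date to readers; the meta element gives machines the ISO date -->
        <span>November 12, 1990</span>
        <meta itemprop="datePublished" content="1990-11-12">
        <p>By <a itemprop="author" href="#author-profile.html">Author Name</a></p>
      </div>
      <div itemprop="articleBody">
        <p>The main body of the article goes here.</p>
      </div>
    </div>
  </body>
</html>
```

The itemprop values are unchanged from the template, so the structured data stays the same even though the element names carry no meaning of their own.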

Here are short descriptions of the microdata used in the HTML template.

| Microdata | Description |
| --- | --- |
| name | This property points to the name of an item. In our case, the item is the article. The name of our article is the web page title, which is represented by the HTML title element. It’s common practice for web page titles to be unique (because of SEO), so the title element’s value is a good name for our articles in most cases. |
| headline | The article’s human-readable title. Some sites use a short, keyword-rich value for the <title> element because of SEO, and then a longer headline that describes the subject of the article. |
| description | A brief explanation of what the article is about. Assigning this property to the <meta name="description"> tag works well in most cases. |
| author | The content creator’s name. In the HTML template, this is used in the <meta name="author"> tag and in the article’s visible content. |
| datePublished | This property lets us explicitly state that the <time> element in the header contains the date on which the article was posted. |
| about | This should be used on text that describes the subject of the article. It’s great for lead sentences/paragraphs. |
| articleBody | This property represents the main body of the article. |
| dateModified | You may want to let people know when the article was last reviewed and updated. If we want to give machines the same courtesy, we’ll need to use the dateModified property. This also gives web services a hint that they should update their index because the content has changed. |
| relatedLink | This property is used for links related to the article. The relatedLink property is part of the WebPage schema, so we have to state that item type in a parent element. |

Using More Meaningful Markup

Like I said earlier, the HTML template is just a general starting point. Consider using additional microdata (and other semantic HTML elements) that will make your content more meaningful.

Schema.org has a ton of schemas for a wide variety of content types. Here are some examples:

See the full list of available schemas here.
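
Here is one way that could look, as a sketch only. It swaps the Article type for BlogPosting (a more specific Schema.org type), describes the author as a nested Person item instead of a bare link, and adds the keywords property. The choice of these particular types and properties is illustrative and not part of the template above.

```html
<!DOCTYPE html>
<html itemscope itemtype="https://schema.org/BlogPosting" lang="en" dir="ltr">
  <head>
    <title itemprop="name">Article's Web Page Title</title>
  </head>
  <body>
    <article>
      <header>
        <h1 itemprop="headline">The Article's Headline</h1>
        <time itemprop="datePublished" datetime="1990-11-12">November 12, 1990</time>
        <!-- The author becomes its own item with a name and a profile URL -->
        <p>By
          <span itemprop="author" itemscope itemtype="https://schema.org/Person">
            <a itemprop="url" href="#author-profile.html"><span itemprop="name">Author Name</span></a>
          </span>
        </p>
      </header>
      <div itemprop="articleBody">
        <p>The main body of the article goes here.</p>
      </div>
      <footer>
        <!-- keywords is defined on CreativeWork, which BlogPosting inherits from -->
        <p>Filed under: <span itemprop="keywords">HTML, microdata</span></p>
      </footer>
    </article>
  </body>
</html>
```

Because BlogPosting is a subtype of Article, the headline, datePublished, and articleBody properties from the template carry over without changes.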

Tips

Once you’re happy with your semantic HTML structure, test and validate it using Google’s Structured Data Testing Tool.

Structured data testing tool by Google.

Also, the HTML template uses HTML5 elements. If you support a lot of users who are on outdated browsers, you’ll need to use a shiv such as Modernizr or HTML5 Shiv. Or, you can replace the HTML5 elements in the template with generic elements such as divs. Keep the microdata though.

Jacob Gube is the founder of Six Revisions. He’s a front-end developer. Connect with him on Twitter and Facebook.

This was published on Apr 8, 2015

10 Comments

Steve Faulkner Apr 08 2015

“The HTML markup template uses semantic HTML elements such as article, header, and main.”

It does not use summary or main correctly:
main
Contexts in which this element can be used:
Where flow content is expected, but with no article, aside, footer, header or nav element ancestors.
http://www.w3.org/TR/html51/semantics.html#the-main-element
summary
Contexts in which this element can be used:
As the first child of a details element.
http://www.w3.org/TR/html51/semantics.html#the-summary-element

    Jacob Gube Apr 08 2015

    Thank you for pointing out those egregious errors Steve.

    I have changed the main and summary elements to divs.

    Also, I changed the aside element to a footer because it’s more representative of related links and the date when the article was last updated.

    I also placed an empty h2 element to remind people to include a heading element within the section element.

    Thoughts? Anything else I can do to improve this template?

Steve Faulkner Apr 09 2015

Hi Jacob, for the single article code example I would suggest http://codepen.io/stevef/pen/KwjdWB
There is no need to use an article element in this case; it is redundant, because the whole document is the article, as you have designated by using itemtype="https://schema.org/Article" on the html element. Instead, use the main element, as intended, to mark up the main content of the document. In your current example, due to the (theoretical) outline algorithm, the document itself has no heading (the h1 is scoped to the article). Note I have also suggested you place the summary div in the header so all content is contained within a structural landmark (for AT users to navigate).

Steve Faulkner Apr 09 2015

Also suggest taking a look at HTML5 bones http://www.html5bones.com/

Teelah Apr 09 2015

This is a great example of semantic HTML. I mean, just add some styling and you will be good, perhaps a sidebar would be a nice add on.

Alex Moschopoulos Apr 10 2015

Great stuff, and I will definitely make use of it.

I have to ask though, wasn’t the time tag not put into the HTML spec? I remember learning it at the beginnings of HTML5 but then later saw it not accepted by the W3C.

    Jacob Gube Apr 10 2015

    Short answer: The time element is a part of W3C HTML5 specs. Use it.

    Longer answer: The time element is part of the W3C Recommendation of HTML5 standards and specifications. That’s the final and complete phase of the standardization process. The time element’s specifications can be found in section 4.5 Text-level semantics of the HTML5 vocabulary and associated APIs for HTML and XHTML.

    Concerned about future-proofing? The time element is also part of the HTML5.1 W3C Working Draft listed in: 4.5.16 The time element.

Cathy Mayhue Apr 11 2015

It’s obvious: a web page built with more informative semantic tags would help search engine bots better understand the context and content of the website, and hence it would be indexed and ranked better.

    Jacob Gube Apr 13 2015

    Honestly, I’m not current with the state of RDFa. So thanks for sharing that link Marius. I’ll have to catch myself up ASAP. It’s a great structured data alternative to Schema.org. And RDFa is native to W3C, which might possibly mean a better symbiotic relationship.
