The Semantic Web

Friday, December 19, 2003 at 2:24 am | Comments off

I've come to the conclusion that very few people write semantic markup, or even know how to. Sadly, this is often true even when the designer is using a CSS-based design. This is, I am sure, a result of the shoddy markup that legacy browsers required. Many books claiming to teach web design or HTML do not even cover semantic markup, and instead fall back on the ugly void that is presentational markup. Markup has one purpose, to structure one's documents, and XHTML makes it much more important to realize this.

So, what is semantic markup? Semantic markup is markup that correctly conveys the document's structure. Heading tags (<h1> - <h6>) should be used to mark up headings, not to create "large, bold text." Paragraphs should be marked with <p> tags. The <b> and <i> tags have little use, as they convey nothing structurally; they merely create bold or italic text, respectively, a task best left to CSS. Better alternatives are the <strong> and <em> tags, which do convey structural meaning (strong emphasis and emphasis). If you have an <h2> without an <h1>, you probably did something wrong. Likewise, if you have multiple <br />s in a row, you probably did something wrong. Every tag you use should have a semantic meaning; keep this in mind when structuring your documents.
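
The rules of thumb above are mechanical enough to check with a few lines of code. What follows is only a rough sketch in Python, not a real validator: the names StructureChecker and check are made up for this example, the list of "presentational" tags (<b>, <i>, <font>) is my own assumption, and the heading rule is generalized from "an <h2> without an <h1>" to "any skipped heading level."

```python
from html.parser import HTMLParser


class StructureChecker(HTMLParser):
    """Collects a few structural warnings. Hypothetical helper, not a validator."""

    def __init__(self):
        super().__init__()
        self.warnings = []
        self.last_heading = 0  # level of the most recent heading (0 = none yet)
        self.br_run = 0        # number of consecutive <br> tags seen so far

    def handle_starttag(self, tag, attrs):
        # <br /> also arrives here: the default handle_startendtag forwards it.
        if tag == "br":
            self.br_run += 1
            if self.br_run == 2:
                self.warnings.append("multiple <br> tags in a row")
            return
        self.br_run = 0
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            level = int(tag[1])
            # Assumption: jumping more than one level deeper counts as a skip.
            if level > self.last_heading + 1:
                self.warnings.append(f"<{tag}> skips a heading level")
            self.last_heading = level
        elif tag in ("b", "i", "font"):
            # Assumption: these three stand in for "presentational" markup.
            self.warnings.append(f"presentational <{tag}> tag")

    def handle_data(self, data):
        # Real text between two <br> tags ends the run; whitespace does not.
        if data.strip():
            self.br_run = 0


def check(markup):
    """Return the list of warnings for a fragment of HTML."""
    checker = StructureChecker()
    checker.feed(markup)
    checker.close()
    return checker.warnings


presentational = '<font size="5"><b>Title</b></font><br /><br />Some text'
semantic = "<h1>Title</h1><p>Some <em>text</em></p>"

print(check(presentational))
print(check(semantic))
```

The first fragment draws a warning for each presentational tag and for the doubled <br />, while the second, which says the same thing with a heading and a paragraph, draws none.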

Semantic markup has many benefits, including better accessibility, better search engine optimization, and lower page weight. Take a look at the source of a well-structured site and you'll probably notice its clean, concise source code immediately. Add to that the ability to parse XHTML as an application of XML, and you have yet another reason to use semantic markup.
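
To make the XML point concrete, here is a small sketch, again in Python, that feeds a well-formed XHTML document to an ordinary XML parser from the standard library and pulls out the heading outline. The sample document and the outline function are invented for this illustration; only the XHTML namespace URI is the real one.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "{http://www.w3.org/1999/xhtml}"

# Made-up sample document, used only for this illustration.
DOCUMENT = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Example</title></head>
  <body>
    <h1>The Semantic Web</h1>
    <p>Markup has one purpose.</p>
    <h2>Benefits</h2>
    <p>Accessibility, search engines, page weight.</p>
  </body>
</html>"""


def outline(xhtml_source):
    """Return (level, text) pairs for every heading, in document order."""
    root = ET.fromstring(xhtml_source)
    headings = []
    for element in root.iter():
        # Parsed tags look like "{namespace}h1"; strip the namespace part.
        name = element.tag.replace(XHTML_NS, "")
        if name in ("h1", "h2", "h3", "h4", "h5", "h6"):
            text = "".join(element.itertext()).strip()
            headings.append((int(name[1]), text))
    return headings


print(outline(DOCUMENT))

# Tag soup is rejected outright: the unclosed <br> makes this ill-formed XML.
try:
    ET.fromstring("<p>unclosed<br></p>")
except ET.ParseError as error:
    print("not well-formed:", error)
```

Because the headings are real headings, a generic XML tool can recover the outline of the page without knowing anything about how it is styled.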

Comments

Robert Wellock
December 19th, 2003
3:51 PM | #

Apart from the point about skipping heading levels (which is a separate ISO HTML issue), you are more or less spot on that many people don't understand how to use (X)HTML as specified in the Technical Recommendations.

(When will this system get a post preview option?)

Ryan
December 19th, 2003
4:03 PM | #

I'm currently working on xBlogPro, which will allow users to preview their comments, as well as subscribe to email notifications when new comments are added to entries that they have responded to. Shortly after implementing this system, I realized that those were two very big weaknesses.

Chris
December 23rd, 2003
4:26 PM | #

Can't wait for the pro version. I did a Google search for xBlog, and lots of people call their blog software xBlog. Is there a reason for that?

Maybe you should call it BrillBlog.

Ryan
December 23rd, 2003
4:48 PM | #

Yes, I may try to come up with something unique. I did, of course, search the registered trademarks to be sure it was not trademarked, and it was not. But at the same time, it would certainly help my search engine ranking if it were the only thing with a certain name. :)

Niket
January 13th, 2004
3:41 AM | #

I don't believe HTML is a semantic language. Too much is being made of semantic propriety. A lot has to improve before we get semantics right.

Another thing is that the web is a very different medium from others. For example, you create PDF files for printing or publishing. If you were to do posters, you'd probably use Quark or something else. If you were to design, you'd use Photoshop; for animation, Flash; and so on.

With the internet, we are trying to do all of that: publish, embed, print, etc. There will be a number of cases where one semantic structure, if there is one, will not fit all.

Probably, what we need is something like the following example. In this example, the subheading for the body isn't necessarily an <h2>. Is it more important than the section headings?
- Yes => <h2>
- No => <h3>???
------
<body>
<heading>Heading level 1</heading>
<subheading>Subheading level 1</subheading>

<section>
<heading>Heading level 2</heading>

<subsection>
<heading>Heading level 2</heading>
</subsection>
</section>
</body>
------

Images should strictly be objects. Tables should also be objects (note that tables never run-in semantically... they are separate floats/objects in the document). That way, tables *cannot* be abused.

M. Jansen
March 23rd, 2005
7:09 PM | #

HTML is NOT a semantic language, and neither is XML. Their only semantics is the tree structure of the document. Attempts to describe semantics are well under way, and several standards have already been defined by the W3C (http://www.w3c.org).

Languages like RDF(S) and OWL are based on XML but define a more rigorous layer that offers you real semantics. Check out the W3C links if you are interested.
