Why the Semantic Web?

July 20th, 2007

Tim Berners-Lee, the guy who invented the Web, said that its power lay in its universality. The Semantic ideal tries to achieve that universality.

When the web finally got big in the late 90s there was an explosion of personal “Look at Me” web sites, all using HTML, or HyperText Markup Language.

Actually, HTML wasn’t a programming language like FORTRAN or BASIC, which get a computer to do things like add up a series of numbers or divide a constant by the square root of pi. It was just a series of instructions about how text should look on screen: be bold, be REALLY BIG, etc. There were also commands to arrange things in rows and columns to tabulate data and so forth.

As a result, the list of codes was rather small: they included H1 to H6 to denote some sort of “headline” thingy (they got smaller as they went down), a paragraph mark (P) to split the text into blocks and a line break code (BR) to start a new line. The page also had to have several “behind the scenes” HTML codes to tell the browser (probably Netscape) that this was a web page: there was HTML itself which identified the language, HEAD which contained all the document information, TITLE which showed the name of the page in the browser top bar and BODY which contained all the stuff the visitor was meant to see.
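For the curious, a bare-bones page of the time might have looked something like the sketch below. The page title and the wording are made up for illustration; only the tag names come from the list above.

```html
<HTML>
<HEAD>
<!-- Document information; TITLE shows up in the browser's top bar -->
<TITLE>Look at Me</TITLE>
</HEAD>
<BODY>
<!-- Everything the visitor is meant to see goes in BODY -->
<H1>Welcome to my home page</H1>
<P>This is the first block of text.<BR>
This sentence starts on a new line.</P>
<H2>A smaller headline</H2>
<P>And here is another paragraph.</P>
</BODY>
</HTML>
```

Note that nothing in there says which font, colour or size to use: the browser decided all of that for itself.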

Many ignored this back-end code and just stuck up text with a few P’s and H’s and hoped for the best: it was eye-catching stuff. But it all worked, after a fashion.

Yet others demanded control, order, uniformity. And so along came HTML 3.2 and its successors, complete with FONT tags and SPAN tags and the almighty DIV. Now really dedicated geeks positioned their content to the nearest pixel using nested table after nested table. One design program (NetObjects Fusion) used tabulation so complex, it refused to allow the user access to the HTML for fear of screwing it up.

Microsoft’s new Internet Explorer brought another complication in custom HTML: what worked one way in IE often didn’t work in anything else. And then there were new ways of surfing, like Web TV. Pretty soon, you needed a whole heap of versions of every page to make sure it could be seen by all. To get round this some clever dickies even used bits of new-fangled JavaScript to stop their pages showing up in one browser or another.

So, while the web began to look slicker, it became more partisan and fragmented. Clever people realised that if it went on like this the future of the internet would be one of hundreds of different versions of every page and all sorts of bodges, kludges and script workarounds. The answer was to separate content from presentation.

The one constant was the “content” of the page: the text and images. The problem was how to display that content. With a Cascading Style Sheet (CSS) you could keep the two apart: create a document full of words with basic instructions like: “That bit is the headline” and “That’s the text”, and then get something external to tell the browser HOW each should be treated.
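Here is a rough sketch of the idea. The class names (headline, text) and the fonts and colours are invented for illustration, and the style rules are embedded in the page only to keep the example in one piece; in practice they would normally sit in a separate file.

```html
<html>
<head>
<title>Content and presentation</title>
<!-- Presentation: in practice these rules would usually live in an external
     file, e.g. <link rel="stylesheet" type="text/css" href="style.css">
     (the file name is hypothetical). -->
<style type="text/css">
  h1.headline { font-family: Georgia, serif; font-size: 2em; color: #333366; }
  p.text      { font-family: Verdana, sans-serif; line-height: 1.5; }
</style>
</head>
<body>
<!-- Content: says WHAT each bit is, not how it should look. -->
<h1 class="headline">That bit is the headline</h1>
<p class="text">That's the text.</p>
</body>
</html>
```

Swap the style rules for a different set and the same words turn up looking completely different, with no change to the content.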

The strict flavour of HTML 4 went back to a rather small list of codes: they included h1 to h6 to denote some sort of “headline” thingy (they got smaller as they went down), a paragraph mark (p) to split the text into blocks and a line break code (br) to start a new line. The page also had to have several “behind the scenes” HTML codes to tell the browser (probably Netscape) that this was a web page: there was html itself which identified the language, head which contained all the document information, title which showed the name of the page in the browser top bar and body which contained all the stuff the visitor was meant to see.

Yet it also took on board some of the other tags which had come along in the intervening years and discarded — or deprecated — some others.

The style sheet’s job was made harder by the fact that the browser now used by most people seemed to be based on an HTML standard known only to Microsoft, and so it was common to find the words: “Works best with Internet Explorer”. How different to our experience today, eh?

And that is how the Semantic Web came to pass. Today, all websites are based on the most recent evolution of the HyperText Markup Language: XHTML 5.0, and they all use the latest version of Cascading Style Sheets (CSS3), which is set up to show the content of the page in the best way for the platform, be it Internet Explorer or Firefox or PlayStation 6 or cell phone or laser egg cup.
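As a hedged illustration of one style sheet serving several platforms, the sketch below uses CSS media rules. The 480px breakpoint and the "nav" id are arbitrary assumptions for the example, not anything taken from a real site.

```html
<html>
<head>
<title>One page, several platforms</title>
<style type="text/css">
  /* Default rules for every device */
  body { font-family: sans-serif; margin: 2em; }

  /* Small screens, e.g. a cell phone (480px is an assumed breakpoint) */
  @media screen and (max-width: 480px) {
    body { margin: 0.5em; font-size: 90%; }
  }

  /* Printed pages: hide the navigation (the "nav" id is hypothetical) */
  @media print {
    #nav { display: none; }
  }
</style>
</head>
<body>
<div id="nav">Home | About</div>
<h1>Same content everywhere</h1>
<p>Only the presentation changes from platform to platform.</p>
</body>
</html>
```

The content in the body is written once; each platform simply picks up whichever rules apply to it.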

The Semantic Web finally realises Berners-Lee’s dream: a truly portable internet. It’s important because it means that your content can be seen by the greatest number of people. Naturally, there will be some differences between platforms, if only because of relative screen sizes; however, the designer and the content manager can be assured that their work will at least be intelligible to all, including those using special aids for disability, such as screen readers or custom style sheets. And that means you have the largest possible audience and the largest possible revenue potential.

Go semantic! You know it makes sense.


