Archive for July, 2007

If It’s Broke, Fix It!

Monday, July 30th, 2007

Sometimes you could almost believe in gremlins, the evil little creatures Second World War airmen used to blame for errors which cropped up inexplicably from time to time.

Every so often something breaks on a web site like JWC. It happened again “recently”: I can’t be more specific because I only noticed it this morning when I went to use one of the site content tools, the Unicode Converter. Type in characters and it converts them to their Unicode, hexadecimal NCR or decimal NCR (numeric character reference) equivalents.

Only it didn’t. Type in something and the only conversion it made was to Unicode UTF-8. There were no error numbers, no warning messages. It just didn’t work. I spent an hour poring over the PHP but couldn’t find the error. Eventually, I went back to square one and copied the code afresh from sceneonthe.net, the web partnership where we originally developed it.

The content tool is back but I’m still none the wiser as to how it went wrong.
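For anyone curious about what such a converter does under the bonnet, here is a minimal sketch in Python. It is not the PHP that runs on JWC: the function names are made up for illustration, and it assumes the tool simply maps each character to its Unicode code point.

```python
def to_code_points(text: str) -> list[str]:
    """Return each character's Unicode code point in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


def to_hex_ncr(text: str) -> str:
    """Return the text as hexadecimal numeric character references."""
    return "".join(f"&#x{ord(ch):X};" for ch in text)


def to_decimal_ncr(text: str) -> str:
    """Return the text as decimal numeric character references."""
    return "".join(f"&#{ord(ch)};" for ch in text)


if __name__ == "__main__":
    sample = "\u00e9"  # e with an acute accent
    print(to_code_points(sample), to_hex_ncr(sample), to_decimal_ncr(sample))
```

A real tool would probably also decide which characters to leave alone (plain ASCII letters, say); this sketch converts everything it is given.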

Why You Should

Broken pages are a content manager’s nightmare. They may be the result of a programming error or code innocently altered by authorised site admins, or they may be the result of something more sinister. Then there are dead hot links caused by unexpected updates and the rest.

Errors are bad for SEO: they imply a lack of attention to detail or even a lack of updates, and the least updated sites are dead ones. So part of your routine as content manager must be finding and fixing broken stuff, perhaps with a link-check program or browser plug-in or indeed some outside assistance.
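If you fancy rolling your own link check, the idea can be sketched in a few lines of Python using only the standard library. Treat this as a rough illustration rather than a finished tool: the names LinkCollector, http_status and find_broken_links are invented here, only absolute http and https links are checked, and some servers refuse the HEAD requests it sends.

```python
from html.parser import HTMLParser
from typing import Callable
import urllib.error
import urllib.request


class LinkCollector(HTMLParser):
    """Collect the absolute href of every <a> tag in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            # Skip anchors, mailto: links and relative paths in this sketch.
            if name == "href" and value and value.startswith(("http://", "https://")):
                self.links.append(value)


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status for url, or 0 if the request fails outright."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code
    except (urllib.error.URLError, OSError, ValueError):
        return 0


def find_broken_links(html: str, status_of: Callable[[str], int] = http_status) -> list[str]:
    """Return the links whose status falls outside the 200 to 399 range."""
    collector = LinkCollector()
    collector.feed(html)
    return [url for url in collector.links if not 200 <= status_of(url) < 400]
```

Passing the status function in as a parameter is a deliberate choice: it lets you try the logic against a canned set of responses before pointing it at a live site.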

Users of TIME’s web site were always pointing out broken links, or worse, broken code (there is a kind of elation when pointing out errors to organisations who should know better).

So don’t shy away from such feedback; encourage it. Ask your visitors to report any errors they see. And remember that someone who has never made a mistake has never made anything.

Turning the Tables

Sunday, July 29th, 2007

Repeat after me: “Tables are evil, and must be destroyed”. Now, lest you think I’ve got a vendetta going here against IKEA, let me assure you that the tables I’m talking about are those introduced to HTML in the mid-1990s as a means of displaying data content in an ordered and helpful manner.

Of course it was all downhill from there. Some clever dicky worked out that tables could be used to display complete pages in an ordered manner. (Note I did not say helpful.)

By the birth of HTML 3.2 almost all websites were infested with tables, nesting content to within a pixel of its existence. I’m just as guilty as anyone of thinking, in a completely obsessive-compulsive way, that tables were the only way to fly.

Today, in the era of XHTML and CSS and the accessible Semantic ideal for content, we all know better. Tables are evil, and must be destroyed!

But hang on. The introduction of tables in 1994 was for a reason. There are times when we need to display data content in an ordered and helpful manner. So, as a content expert, how do you do tables in an accessible way?


Getting Started

Wednesday, July 25th, 2007

In Britain, newspapers will design whole pages around a killer headline: indeed, a very thin story can be sold by a few well-placed words, especially if a big “sexy” picture is thrown into the mix. On the internet, which is generally no friend to big “sexy” pictures, a killer headline is even more important for good SEO-friendly content, backed up of course by a homicidal first paragraph.

You’d be wise to spend at least half your production time on this important word partnership. It is, after all, a fine juggling act. On one hand, you must grab your reader’s attention (and factor in lots of keywords to grab the search engines’ attention, too); on the other, you don’t want to overload anyone’s brain.

In reality, even the most interesting words on the web seldom hold people’s attention beyond two or three paragraphs, and therein lies the killer intro’s importance. Get it right and you’ll have passed on the content you need to before the reader drops off; you might even have grabbed their attention enough to make them read on.

So how do you grab that attention? Try these …

  • Start with a quote, best of all some funny words from somebody famous. It gets straight to the point and establishes a common frame of reference with the reader.
  • Quote a relevant, surprising statistic, and make sure it’s as clear as possible. New information is naturally interesting and will draw the reader in.
  • Draw a word picture in the reader’s mind. Invoking their imagination gets them involved in the content from the get-go.
  • Ask a question (preferably one which your content will answer). Questions — even ones that need no answer — make people think and lead them to seek solutions.
  • Tell them a story. Storytelling is something humans do instinctively, and listening to stories is just as natural.

Yes, You Can Write!

Tuesday, July 24th, 2007

People often say: “There’s no point in me writing content, I can’t write.” Yet, paradoxically, these same individuals are often the most fluent people I’ve ever met, especially when it comes to their specialist subject.

I’ve recently been advising a mate on ways to increase his site traffic and suggested a blog. “I’m not sure how interested people will be,” he said. “I’ve never read a blog nor has anyone mentioned one to me but I do often see a small surge in sales when the site gets talked about on forums.”

I told him the best blogs were those with real opinions and real information. And it didn’t need to be groundbreaking content either: one of the web’s biggest problems is that most of us don’t have the time (or the inclination) to wade through the waffle to find the nuggets of fact.

So here is some handy advice for would-be content writers:

Write From The Heart

Use the sort of words you’d say to a friend who shared your passion, but beware of jargon in content.

Keep It Short

Write it in around 250 words, and no more: any shorter and your reader will think: “Why did he bother?” Any longer and he’ll ask himself: “Can I be bothered?”

Keep It Relevant

If the density of your chosen keyword is less than 3%, consider writing another posting.
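The two numbers in this advice (250 words, 3% density) are easy to check before you hit publish. The Python sketch below is one way to do it, on the assumption that keyword density means occurrences of a single-word keyword divided by the total word count, times 100. Other people define it differently, and the function names are made up for this example.

```python
import re

# Letters, digits and apostrophes (straight or curly) count as part of a word.
WORD_PATTERN = re.compile(r"[A-Za-z0-9'\u2019]+")


def split_words(text: str) -> list[str]:
    """Return the words in text, lower-cased."""
    return WORD_PATTERN.findall(text.lower())


def word_count(text: str) -> int:
    """Return the number of words in text."""
    return len(split_words(text))


def keyword_density(text: str, keyword: str) -> float:
    """Return how often a single-word keyword appears, as a percentage of all words."""
    tokens = split_words(text)
    if not tokens:
        return 0.0
    return 100.0 * tokens.count(keyword.lower()) / len(tokens)


if __name__ == "__main__":
    draft = "Good content sells. Bad content does not."
    print(word_count(draft), keyword_density(draft, "content"))
```

Multi-word key phrases would need a different count (matching the phrase rather than single tokens), which this sketch leaves out.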

Don’t Be Afraid to be an Explainer

If content can be better put, put it. Stephen Hawking wrote a guide to his A Brief History of Time because he realised it went over the heads of most people.

Don’t Be An Impulsive Publisher

After you’ve written what you’ve written, read it again, TWICE. And give yourself a break in between: you’ll be surprised how many content errors will reveal themselves after a short rest.

Practice Really Does Make Perfect

Even Shakespeare started somewhere. Writing, like sex, gets better the more you practise IF you’re willing to be self-critical. Allow your writing to be less than perfect from the start.

Get Active in Content

Monday, July 23rd, 2007

Internet content’s primary use is as a sales tool and when selling something it’s always best to use an active voice.

Compare these two sentences:

“The cost per square foot was said to be uneconomic by one in three of businesses questioned in the survey”

and

“One in three businesses questioned in the survey said the cost per square foot was uneconomic”

The second is written in the active voice: it sells the content in a more positive way. That content could be sold more effectively still:

“A third of businesses say the cost per square foot is uneconomic, a survey shows”

As well as being more readable, the active-voice content now has the benefit of being much shorter, especially as the brain subliminally discards the last clause (“a survey shows”).

The comparative brevity of the active voice is attractive to anyone trying to produce SEO-friendly content, the Golden Rule of SEO-friendly content being: “Keep it to 250 words or less!”

When writing sentences of content in the active voice, the subject (“a third of businesses”) performs the action expressed in the verb (“say”) — the subject acts. In the passive voice, the subject receives the action expressed in the verb — the subject is acted upon.

You can spot passive voice in content by the overuse of “be”, or one of its forms (“am”, “are”, “been”, “is”, “was” or “were”), or the phrase “by the…” after the verb. Passive-voiced content is usually stuffed with prepositions (so-called Lazy Words), which can make it dull and lifeless: not the impression you’d want a salesman to give.
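Those two tell-tale signs can be counted mechanically. The Python sketch below does exactly that and no more: it is a crude heuristic, not a grammar checker, and the function name passive_hints is invented for this example. It tallies forms of “be” and occurrences of “by the” so that two versions of a sentence can be compared.

```python
import re

# The forms of "be" listed in the paragraph above.
BE_FORMS = ("be", "am", "are", "been", "is", "was", "were")
BE_PATTERN = re.compile(r"\b(?:" + "|".join(BE_FORMS) + r")\b", re.IGNORECASE)
BY_THE_PATTERN = re.compile(r"\bby\s+the\b", re.IGNORECASE)


def passive_hints(sentence: str) -> dict[str, int]:
    """Count forms of 'be' and 'by the' phrases in a sentence."""
    return {
        "be_forms": len(BE_PATTERN.findall(sentence)),
        "by_the": len(BY_THE_PATTERN.findall(sentence)),
    }


if __name__ == "__main__":
    passive = ("The cost per square foot was said to be uneconomic by one in "
               "three of businesses questioned in the survey")
    active = ("A third of businesses say the cost per square foot is "
              "uneconomic, a survey shows")
    print(passive_hints(passive))
    print(passive_hints(active))
```

Run against the survey sentences quoted earlier, the passive version scores higher on “be” forms than the tightened active one. A high count is only a prompt to reread the sentence, since plenty of perfectly active sentences use “is” or “was”.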

Remember too that most content never gets read. Any stats package will show most page views last seconds, which equates to a glance before moving on. The active voice gets content across from the start and may even help extend those seconds into minutes … or a bookmark.

Why the Semantic Web?

Friday, July 20th, 2007

Tim Berners-Lee, the guy who invented the Web, said that its power lay in its universality. The Semantic ideal tries to achieve that universality.

When the web finally got big in the late 90s there was an explosion of personal “Look at Me” web sites all using HTML, or HyperText Mark-up Language.

Actually HTML wasn’t a programming language like FORTRAN or Basic which get a computer to do things like add up a series of numbers or divide a constant by the square root of Pi. It was just a series of instructions about how a character should look on screen: be bold, be REALLY BIG, etc. There were also commands to arrange things in rows and columns to tabulate data and so forth.

As a result, the list of codes was rather small. They included H1 to H6 to denote some sort of “headline” thingy (they got smaller as they went down), a paragraph mark (P) to split the text into blocks and a line break code (BR) to break a line. The page also had to have several “behind the scenes” HTML codes to tell the browser (probably Netscape) that this was a web page: there was HTML itself which identified the language, HEAD which contained all the document information, TITLE which showed the name of the page in the browser top bar and BODY which contained all the stuff the visitor was meant to see.

Many ignored this back-end code and just stuck up text with a few P’s and H’s and hoped for the best: it was eye-catching stuff. But it all worked, after a fashion.

Yet others demanded control, order, uniformity. And so along came HTML 3.2 and its successors, complete with FONT tags and SPAN tags and the almighty DIV. Now really dedicated geeks positioned their content to the nearest pixel using nested table after nested table. One design program (NetObjects Fusion) used tabulation so complex, it refused to allow the user access to the HTML for fear of screwing it up.

Microsoft’s new Internet Explorer brought another complication in custom HTML: what worked one way in IE often didn’t work in anything else. And then there were new ways of surfing, like Web TV. Pretty soon, you needed a whole heap of versions of every page to make sure it could be seen by all. To get round this some clever dickies even used bits of new-fangled JavaScript to stop their pages showing up in one browser or another.

So, while the web began to look slicker, it became more partisan and fragmented. Clever people realised that if it went on like this the future of the internet would be one of hundreds of different versions of every page and all sorts of bodges, kludges and script workarounds. The answer was to separate content from presentation.

The one constant was “content” of the page — the text and images. The problem was how to display that content. With a Cascading Style Sheet (CSS) you could do just that: create a document full of words with basic instructions like: “That bit is the headline” and “That’s the text”, and then get something external to tell the browser HOW each should be treated.

HTML 3 consisted of a rather small list of codes. They included h1 to h6 to denote some sort of “headline” thingy (they got smaller as they went down), a paragraph mark (p) to split the text into blocks and a line break code (br) to break a line. The page also had to have several “behind the scenes” HTML codes to tell the browser (probably Netscape) that this was a web page: there was html itself which identified the language, head which contained all the document information, title which showed the name of the page in the browser top bar and body which contained all the stuff the visitor was meant to see.

Yet it also took on board some of the other tags which had come along in the intervening years and discarded — or deprecated — some others.

The style sheet’s job was made harder by the fact that the browser now used by most people seemed to be based on an HTML standard known only to Microsoft and so it was common to find the words: “Works best with Internet Explorer”. How different to our experience today, eh?

And that is how the Semantic Web came to pass. Today, all websites are based on the most recent evolution of the HyperText Mark-up Language: XHTML 5.0, and they all use the latest version of Cascading Style Sheets (CSS3) which are set up to show the content of the page in the best way for the platform, be it Internet Explorer or Firefox or PlayStation 6 or cell phone or laser egg cup.

The Semantic web finally realises Berners-Lee’s dream: a truly portable internet. It’s important because it means that your content can be seen by the greatest number of people. Naturally, there will be some obvious differences between platforms, if only because of relative screen sizes; however, the designer and the content manager can be assured that their work will at least be intelligible to all, including those using special aids for disability, such as screen readers or custom style sheets. And that means you have the largest possible audience and the largest possible revenue potential.

Go semantic! You know it makes sense.

Accessibility v Usability

Wednesday, July 18th, 2007

Some might argue that to have both accessibility and usability is a case of having your cake and eating it too. Look at useit.com, usability guru Jakob Nielsen’s site, and you might agree.

Nielsen seems to have content licked: aside from permanent content in the form of usability articles, he also carries links to outside articles on usability news. There is real expertise here: there are virtually no images because, he says, “since most users have access speeds on the order of 28.8 kbps, Web pages can be no more than 3 Kb if they are to download in one second which is the required response time for hypertext navigation”; and his use of fonts (sans-serif) and text sizes (large) ticks all usability boxes.
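The sums behind that quoted figure are easy to rough out. The Python sketch below assumes 1 kbps means 1,000 bits per second and 8 bits to the byte, and it ignores latency and protocol overhead, which is presumably why the quoted limit is rounded down. The function name is made up for illustration.

```python
def max_page_bytes(link_kbps: float, seconds: float = 1.0) -> float:
    """Return the raw number of bytes a link can carry in the given time."""
    bits = link_kbps * 1000 * seconds  # kilobits per second to bits
    return bits / 8                    # bits to bytes


if __name__ == "__main__":
    # A 28.8 kbps modem moves roughly 3,600 bytes in one second,
    # before any overhead is taken off.
    print(max_page_bytes(28.8))
```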

But (and you knew a “but” was coming, didn’t you?) the bad news for content accessibility and usability isn’t hard to find.

Aside from the fluid layout — making content stretch across the browser in a most unreadable way — it is built with incomplete tables and deprecated tags. And although it does include a doctype, it’s the dead TRANSITIONAL.

Discard the CSS and you won’t notice much change because of the tables, which make the content almost unusable for disabled surfers. It seems that while Nielsen has optimised (not brilliantly) for usability of content, he’s given scant thought to accessibility of content.

And, just a personal comment here, isn’t it just plain UGLY? How long would the casual surfer stick around to read the content?

So are usability and accessibility like chalk and cheese? Not really. To see content optimised for usability and accessibility just go to wordpress.org: semantic XHTML, big fonts, few images, obvious links, good colors, tight CSS, etc.

A content manager cannot afford to lose sight of either usability OR accessibility: to do so alienates potential content users. And, as WordPress demonstrates, you can have your cake and eat it too.

The End of Transitional XHTML

Monday, July 16th, 2007

You may not realise this, but if you’re still putting …

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

in the header of your HTML files then you’re bang out of order.

The Code Recommendations laid down by the World Wide Web Consortium (or W3C) in Dec 1999 declared an end to the constricting HTML 3 standard — the one that used all those <font> tags and stuff — in favour of HTML 4.01 which was based on the Semantic Mark-Up principle. In other words, HTML would now be portable across different environments and ready for the environments still to come.

Of course, the folks at the W3C are realists. They knew that there would be obstacles to overcome before the fully Semantic web became a reality: old equipment, die-hard users, Microsoft, etc. And so they made allowances in 4.01, deprecating the old HTML 3 Visual Formatting code but allowing HTML 4.01 to have two main doctypes: Strict (the semantic ideal) and Transitional (the buggy, half-way house, darn ugly reality).

And they introduced XHTML, similar to HTML 4, but refined to operate with XML or eXtensible Mark-up Language, with a stipulation that XHTML 1.1 — the accessible coder’s Nirvana — would be in place as the only way to fly by April 2001.

So that’s it. Any webpage created since April 2001 (and in fact any webpage around today) should be in XHTML 1.1 STRICT! The transitional type died more than SIX YEARS AGO!

Private hobby coders may be forgiven for having let this pass them by: most hobby coding software, if it includes doctypes at all, usually opts for TRANSITIONAL as standard.

But professional web designers are a different case and yet new websites appear every day coded by would-be professionals either using the TRANSITIONAL doctype or no doctype at all.
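Checking which camp a page falls into is simple enough to automate. The Python sketch below looks for a DOCTYPE declaration and sorts it into rough buckets. The function name and the labels are my own, and the matching is deliberately naive: it just looks for the words “Transitional”, “Frameset”, “Strict” or “XHTML 1.1” inside the declaration, so anything else is reported as “other”.

```python
import re

# Match the first DOCTYPE declaration, whatever its case.
DOCTYPE_PATTERN = re.compile(r"<!DOCTYPE\s+[^>]*>", re.IGNORECASE)


def classify_doctype(html: str) -> str:
    """Return 'none', 'transitional', 'frameset', 'strict' or 'other'."""
    match = DOCTYPE_PATTERN.search(html)
    if match is None:
        return "none"
    declaration = match.group(0).lower()
    if "transitional" in declaration:
        return "transitional"
    if "frameset" in declaration:
        return "frameset"
    # XHTML 1.1 has no separate Strict variant, so treat it as strict here.
    if "strict" in declaration or "xhtml 1.1" in declaration:
        return "strict"
    return "other"


if __name__ == "__main__":
    page = ('<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" '
            '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html></html>')
    print(classify_doctype(page))
```

Run it over the files in a site and you have a quick list of the pages still living in the half-way house.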

“Big deal!” I hear you cry. “Get a life, loser!” you continue.

I think you’re missing the point. For the internet and its applications to work seamlessly (and that means so that YOUR pages can be seen by the biggest number of people), XHTML 1.1 Strict (and its successors) must be used. Not only does that mean that it can be understood by the biggest number of browsers across the biggest number of platforms, it also means it is accessible to members of one of the biggest audiences around and the one whose numbers can only grow: the less-able and the less-young. That’s not just important from a moral standpoint; in markets like Europe and elsewhere it’s a legal REQUIREMENT!

And a final word for those claiming that the STRICT doctype locks out users of other non-standard, aging browsers. I made an interesting discovery while recoding the TIME magazine international websites a while back. We had to have the sites run on Internet Explorer 5 on a Mac running OS9 because that was the set-up that most of our offices used. Try as I might using the Transitional doctype, I could not get this arcane botch job to work. Then I tried Strict and to my surprise, everything clicked into place. It seems that the Microsoft Mac boys (and girls) had the Strict doctype in mind when they wrote their crazy piece of s**tware!

By the way. I may sound like a total geek but really I’m not. I’ve even seen a naked lady!

Plagiarism and how to avoid it

Thursday, July 12th, 2007

Writing this is like walking through a minefield. In talking about plagiarism and quoting other people, I too run the risk of plagiarism.

Students can screw up their whole academic career if found guilty of plagiarism, and the dangers for online content are just as real.

The simplest way to avoid plagiarism is: let your reader know you’ve used other people’s words!

Always credit

  • Quotations
  • Facts that aren’t common knowledge
  • Ideas
  • Opinions

Strictly speaking, it’s also plagiarism if you don’t credit paraphrased and summarized text, but there’s an obvious paradox: for most of us, all we say and all we write is a result of things said to us and read by us. On that basis, all writing is paraphrase and summary. So does this mean you can copy text, correct grammar and spelling, and call it your own? Nope, that’s plagiarism.

Re-writing content to avoid plagiarism means changing both text and structure.

In Britain, tabloid newspapers employ “re-write subs” to take raw content and produce stories which match house style, readership, and the space available. The late lamented tabloid editor Mike Gabbert (who I worked under in the 1980s) used to say that, compared to broadsheets which had space to print everything, tabloid content was the best-written. Incidentally, he said this meant the broadsheet Telegraph had room to print more titillating and grisly court case details than the tabloid Sun or Mirror. And did.

Most re-write content retains the author’s by-line. All that’s left for the sub-editor to do is dodge reporter complaints that their story has been “ruined”. Subs call this ruination “improvement”.

Much plagiarism “creeps” in during research, so I offer three ways to avoid becoming a “dirty copycat”. (Incidentally, you’ll find this ALL OVER the internet and I learned it before the web existed so it’s common knowledge.)

When copying a source:

  • Put text in quote marks to make it clear it’s not YOUR words
  • Record details of the content’s origin: URL, book title, etc.
  • If you have an original thought about the text, make a note of it. Otherwise, credit the author.

Plagiarism aside, citing other people’s observations benefits your content by turning opinion into argument.

If you’re concerned that your work may itself have been plagiarised, you can check it at copyscape.com.

Reviewing Content

Tuesday, July 10th, 2007

Everyone likes good publicity: just look at the film posters quoting reviews of the movie. Of course, these can be misleading. A recent blurb for the TV movie The Girl in the Café quoted The Oregonian newspaper:

“An endearing romantic comedy.”

However, what The Oregonian actually said was:

“This new offering from HBO Films is at its heart a bit of political propaganda wrapped into an endearing romantic comedy that starts losing its laughs when it gets to Reykjavik and decides its teachable moment has arrived.”

You can see more of these at Gelf Magazine.

A company like the one I work for doesn’t need to be so economical with the actualité: these days Regus is well used to good news stories so one of the content ideas I’ve been examining is a gallery of what professional news organisations say about Regus, a sort of online scrapbook.

Of course content usually comes with a price, and as Regus discovered recently with a CNBC video clip of our CEO, Mark Dixon, simply linking to a video or article is not enough to guarantee page views and good-linking SEO. Within hours of our linking to the clip, CNBC imposed a subscription-only tag. There’s nothing suspicious here: in a charged-content model, content is usually free for a set period before the curtain comes down.

News providers like magazines, newspapers and broadcast media have been arguing for years about charging for content (news agencies like Reuters and the Press Association survive by charging for content, but their main customers are the aforesaid magazines, newspapers and broadcast media). Yet because the enduring ethos of the Internet is still “everything is free”, it’s been very difficult to get drive-by surfers to pay for anything. Some have tried. Most have failed. My former employers, TIME magazine, charged for content from their printed magazine for four years and made a profit — the only part of the operation that did!! — but when their sister site AOL decided to open up their content to everyone, TIME decided to drop the “curtain” — and still made a profit!

Incidentally, TIME used to charge for magazine content seven days after it hit the newsstands on the basis that people would pay for a magazine online which they couldn’t buy on the streets. The exception to this rule was religious stories, which they charged for immediately: religious feeling being what it is in the U.S., readers would pay for that content at any time. Religious stories were the big money earners!

The experience at Regus is that a “What They Say About Us” page runs into the buffers because some of the content linked to will be off limits to non-payers (or at least non-subscribers), and to ensure that dead links are kept to a minimum — an absolute must for SEO — requires a high-maintenance solution: checking all links all the time.