Google and the semantic web

Worth reading: an article on Google’s Peter Norvig and Tim Berners-Lee discussing the Semantic Web.

Google is obviously right that “millions of web masters” will have trouble adapting to these new standards, but that’s why we’re writing software.

The days of hand-writing HTML code will be over some day. Even now, as Google acknowledged, many people fail to write proper HTML. But HTML isn’t becoming easier to write. In fact, it is getting much more complex, with different character encodings, new CSS versions, higher expectations for layout, more dynamic web pages, Ajax, …

More and more people won’t write HTML themselves anymore, but will use some software instead. People used to write bad HTML; now they use tools such as WordPress, which are at least expected to produce valid markup. They start using visual editors, which will eventually stop emitting presentational tags and use the “more semantic” ones instead, without the users even being aware of it. And their blog software also generates RSS for them. How many people have ever written RSS by hand?

So it’s mostly a matter of the tools we offer them; with better tools we can push the use of better (“semantic”) formats, which then make data reusable for others as well.

For example, tons of people hope that Friendster, openBC, LinkedIn and the like will help them, in one way or another, to keep in contact with people, sell products, or find new jobs. What these web sites basically collect is FOAF data. (And, btw, if any of these web sites were truly Web 2.0, they would actually export FOAF files via some API!) They have a UI people understand; now all they would need to do is share their data, and we’d have a large body of “semantic web ready” FOAF data.
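
To give an idea of how little such an export would take, here is a minimal sketch in Python. The profile record, its field names and the function name are made up for illustration; none of the sites mentioned is known to store or expose its data this way. Only the FOAF and RDF namespace URIs and the foaf:Person, foaf:name, foaf:homepage and foaf:knows terms come from the FOAF vocabulary itself.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF = "http://xmlns.com/foaf/0.1/"
ET.register_namespace("rdf", RDF)
ET.register_namespace("foaf", FOAF)


def profile_to_foaf(profile):
    """Serialize a (hypothetical) profile record as FOAF in RDF/XML."""
    root = ET.Element(f"{{{RDF}}}RDF")
    person = ET.SubElement(root, f"{{{FOAF}}}Person")
    ET.SubElement(person, f"{{{FOAF}}}name").text = profile["name"]

    # The homepage is a resource (a URL), not a literal string.
    if profile.get("homepage"):
        ET.SubElement(
            person,
            f"{{{FOAF}}}homepage",
            {f"{{{RDF}}}resource": profile["homepage"]},
        )

    # Each contact becomes a nested foaf:Person linked via foaf:knows.
    for contact_name in profile.get("contacts", []):
        knows = ET.SubElement(person, f"{{{FOAF}}}knows")
        friend = ET.SubElement(knows, f"{{{FOAF}}}Person")
        ET.SubElement(friend, f"{{{FOAF}}}name").text = contact_name

    return ET.tostring(root, encoding="unicode")


# Made-up example data, standing in for what a networking site already has.
example_profile = {
    "name": "Alice Example",
    "homepage": "http://example.org/alice",
    "contacts": ["Bob Example", "Carol Example"],
}
foaf_xml = profile_to_foaf(example_profile)
print(foaf_xml)
```

The user never sees any of this; they just fill in the same profile form as before.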

Similar things apply to other “semantic” formats. Think calendars. iCal is pretty much the standard and is widely used. Almost no one uses calendar web pages that are only readable by humans.
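
For the curious, this is roughly what a calendar tool writes out behind the scenes. It is a rough sketch of a single event in the iCalendar format (what most people just call iCal), with a made-up function name, UID and PRODID. It assumes timezone-aware datetimes and, to stay short, skips the folding of long lines that the specification recommends.

```python
from datetime import datetime, timezone


def escape_text(value):
    """Escape characters that have special meaning in iCalendar text values."""
    return (
        value.replace("\\", "\\\\")
        .replace(";", "\\;")
        .replace(",", "\\,")
        .replace("\n", "\\n")
    )


def event_to_ical(uid, summary, start, end, stamp):
    """Render one event as a minimal iCalendar document (hypothetical helper)."""
    fmt = "%Y%m%dT%H%M%SZ"  # UTC date-time form

    def utc(moment):
        return moment.astimezone(timezone.utc).strftime(fmt)

    lines = [
        "BEGIN:VCALENDAR",
        "VERSION:2.0",
        "PRODID:-//example//blog-sketch//EN",  # made-up product identifier
        "BEGIN:VEVENT",
        f"UID:{uid}",
        f"DTSTAMP:{utc(stamp)}",
        f"DTSTART:{utc(start)}",
        f"DTEND:{utc(end)}",
        f"SUMMARY:{escape_text(summary)}",
        "END:VEVENT",
        "END:VCALENDAR",
    ]
    # iCalendar content lines are terminated by CRLF.
    return "\r\n".join(lines) + "\r\n"


ical = event_to_ical(
    uid="event-1@example.org",
    summary="Lunch, then demo",
    start=datetime(2024, 3, 5, 18, 0, tzinfo=timezone.utc),
    end=datetime(2024, 3, 5, 19, 0, tzinfo=timezone.utc),
    stamp=datetime(2024, 3, 1, 12, 0, tzinfo=timezone.utc),
)
print(ical)
```

Nobody types this by hand, which is exactly the point: the tool produces the machine-readable format as a side effect of a UI people already understand.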

The semantic web isn’t dead or anything. It just takes some time to be widely adopted, but that was to be expected. The key to success is having tools for generating semantic data that are maybe even easier to use than non-semantic tools; after all, the computer should be able to assist you more with semantic data.

I’m also looking forward to semantic wikis such as IkeWiki, which try hard to make entering semantic data as easy as editing a non-semantic wiki page. In large wikis such as Wikipedia, making useful links is not as easy as typing [MagicWord], because you first have to look up the magic word. A semantic wiki can assist you by suggesting appropriate links based on the information you’ve already entered (e.g. if you mark a page as biology, it won’t suggest that it might be a computer part).
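
The biology example can be sketched in a few lines. This is only a toy illustration of the idea, not how IkeWiki actually works: the page index, the category names and the function are all invented here, and a real semantic wiki would reason over much richer metadata than a set of category labels.

```python
# Invented mini index: each page has a title and a set of categories.
WIKI_INDEX = [
    {"title": "Mouse (animal)", "categories": {"biology", "mammals"}},
    {"title": "Mouse (computing)", "categories": {"computing", "hardware"}},
    {"title": "Mouse embryo", "categories": {"biology"}},
]


def suggest_links(term, page_categories, index):
    """Suggest link targets for `term` that share a category with the current page.

    Pages whose titles contain the term are ranked by the number of
    categories they share with the page being edited; pages with no
    category in common are dropped.
    """
    matches = [page for page in index if term.lower() in page["title"].lower()]
    scored = [
        (len(page_categories & page["categories"]), page["title"])
        for page in matches
    ]
    ranked = sorted(scored, key=lambda item: (-item[0], item[1]))
    return [title for score, title in ranked if score > 0]


# Editing a page already marked as biology: the computer part is not offered.
print(suggest_links("mouse", {"biology"}, WIKI_INDEX))
```

The same lookup on a page marked as computing would offer only the hardware article, so the author never has to look up the exact magic word.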