The eXtreme Markup Languages conference got me thinking about the upsides and downsides of XML, XSLT, CSS and related standards.

When I try to take a step back and get an “outside” view, all of these standards have problems, some bigger and some smaller, when it comes to widespread adoption.

Let’s start with XML. XML is widely adopted, yet it still runs into many of the compatibility problems it was designed to prevent. Let me give you an example: I was writing a Java application for a mobile phone. To transmit data to the phone I decided to use XML, hoping to run into fewer problems than with a data format of my own. Still, this was rather painful: the J2ME implementations on the phones I tested had very different capabilities - and most lacked character set support. So I actually had to remove the charset specification from the XML file to get it working on all phones: one wouldn’t read the file when the encoding was UTF-8, the other failed when I used Latin-1.
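One possible way around this kind of encoding trouble is to keep the file pure ASCII and write every other character as a numeric character reference, so the bytes are the same whether a parser assumes UTF-8 or Latin-1. A minimal sketch, assuming Python's standard xml.etree.ElementTree; the element names and the text are made up for illustration, and whether a given phone parser resolves character references would still have to be tested:

```python
import xml.etree.ElementTree as ET

# Hypothetical payload; the element names are made up for this sketch.
root = ET.Element("message")
ET.SubElement(root, "text").text = "Grüße"

# With encoding="us-ascii", ElementTree writes no XML declaration and
# replaces every non-ASCII character with a numeric character reference.
data = ET.tostring(root, encoding="us-ascii")

# The bytes are plain ASCII, so they decode identically as UTF-8 or Latin-1.
assert data.decode("utf-8") == data.decode("latin-1")

# A conforming parser recovers the original text from the references.
assert ET.fromstring(data).find("text").text == "Grüße"
```

The price is a larger file for text with many non-ASCII characters, which may matter on a phone.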

Other problems with XML “support” that I frequently run into are Java and PHP programmers who don’t use a proper XML writer to generate the output (but println() statements, and of course without paying attention to proper character escaping…), and similar issues on the reader side: buffers too small to hold the long string that shows up in some XML attribute in a special case, and so on - the list goes on and on.
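To illustrate the difference, here is a small sketch in Python (the same idea applies to the Java and PHP writer APIs). The element name, the attribute and the value are invented for the example; string concatenation stands in for the println() approach:

```python
import xml.etree.ElementTree as ET

# Made-up value containing characters that must be escaped in XML.
title = 'Tom & Jerry <"uncut">'

# The println() approach: paste the value into the markup as-is.
naive = '<film title="' + title + '"/>'
try:
    ET.fromstring(naive)
    naive_ok = True
except ET.ParseError:
    naive_ok = False  # the bare & and < make the output ill-formed

# The writer approach: build the element and let the library escape it.
film = ET.Element("film", {"title": title})
written = ET.tostring(film, encoding="unicode")

# The naive output is rejected; the written output parses, and the
# attribute value survives the round trip unchanged.
assert not naive_ok
assert ET.fromstring(written).get("title") == title
```

The writer version is no longer than the concatenation, so laziness is not much of an excuse.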

One of the problems with XML is that almost none of its users has ever read the full specs, because they are simply too much to digest. Nor do they bother to learn the XML writer APIs, since they’ve always written XML with their text editor, too…

XSLT (I single out the transformations, since very few people use XSL-FO and XPath is somewhat “omnipresent”) has very different problems: very few people understand it. And even fewer are comfortable writing it.

While in theory XSLT is a nice and easy language - and a very clean, declarative and functional one - why do the “users” have so many problems with it? My guess is that the language is very clumsy. Compare writing XSLT code with writing code in other languages, say Java, Python, or, for a fairer comparison, Haskell. In any of these languages, the code written by the average programmer is a lot easier to understand than in XSLT. I’m not bitching about not being able to modify variables (it’s functional, and I’ve learned how to write code in functional languages as well as the benefits of doing so), but I put much of the blame on the syntax of XSLT - and XML. I don’t want to give full examples here, but just look at the way you pass variables into templates. This really doesn’t increase readability.
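For readers who have never seen parameter passing in XSLT, here is a tiny illustration rather than a full example. The XSLT 1.0 fragment is shown in comments, followed by a rough Python counterpart; the template name "greet" and its parameter are made up for this sketch:

```python
# Defining a named template with one parameter in XSLT 1.0, and calling it:
#
#   <xsl:template name="greet">
#     <xsl:param name="name"/>
#     <xsl:text>Hello, </xsl:text>
#     <xsl:value-of select="$name"/>
#   </xsl:template>
#
#   <xsl:call-template name="greet">
#     <xsl:with-param name="name" select="'World'"/>
#   </xsl:call-template>


# Roughly the same thing as a Python function and a call to it.
def greet(name):
    return "Hello, " + name


result = greet("World")
```

Nine lines of markup against three lines of code, and the XSLT version also needs nested quotes just to pass a string literal.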

Some of the issues with XSLT could be remedied by a free, widely adopted editor (trialware or shareware won’t do; you need to achieve an Eclipse-like community status, so open source is the way to go!) that somehow hides the syntax from the average programmer and doesn’t force you to either learn new shortcuts or use the mouse (which would make me stick with vim, since I’m way faster with it). Also, no user of XML should have to write XML with a plain text editor if we want to get rid of broken XML files… But maybe it would be easier to just create a new transformation language designed for the person writing the code, not for the parser…

My complaints about CSS (apart from Microsoft having serious bugs in their CSS support) are mostly that it feels like an alien language here. While its syntax is nice, compact and easy to read (in contrast to XSLT), it doesn’t fit together with the rest of the puzzle. What especially bites me are constructs such as

  a:before { content:"<b>"; }
  a:after  { content:"</b>"; }

Ouch! How is anyone expected to verify the resulting document?

Of course there are cases where you can use this very nicely, but you can also do very ugly things… :-(

(If you want to see a nice thing you can do with that, add the following line to the chrome/userContent.css file in your Mozilla profile:

  a[href$=".pdf"]:after { font-size: 10px; color: red; content: " [PDF]"; }

This will add a red [PDF] after any link to a PDF file. Very handy.)

Well, enough rant for today. Especially since I don’t have a clear proposal on what to do. I’m not trying to say “you are doing it all wrong”, I’m only trying to point out what I see as the biggest issues.

Let the flamewars begin! (Sorry, my blog doesn’t allow comments or trackbacks by design - it’s intentionally read-only for the web server.)

[Update 08/17/05: Seo Sanghyeon pointed me to NiceXSL, a simplified syntax for XSLT, but he also says that it isn’t just the syntax that makes XSLT difficult for most users.]