Gunnar, first of all, let me point out that repeating the closing tag makes the format more robust. If the parser encounters a non-matching closing tag, it knows that something isn’t right. That’s a feature, not a bug.

IIRC, SGML had shortened closing tags (</>), and closing tags were partially optional: if the parser encountered a closing tag, it assumed that all intervening open elements were closed up to the matching opening tag.

I guess some people who had used SGML and were involved with XML had learned from experience that this was a bad idea. It makes parsing harder and the file format more likely to break.

If XML were only written by perfect tools, we wouldn’t need the verbose closing tags. We could probably use some binary syntax like “\1tagname\0\2contents\0”. Parsing would be fast and easy. Or something along the lines of

tag { "string contents" tag {} }

(which I like a lot). Or S-Expressions. But it is a strength of XML that you can detect certain common errors such as incorrectly nested tags: XML data is generated by broken tools all the time, so we need a syntax which allows us to detect such errors.
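
To illustrate, here is a minimal sketch using Python’s standard library; the tag names are made up:

import xml.etree.ElementTree as ET

good = "<a><b>text</b></a>"
bad = "<a><b>text</a></b>"   # closing tags swapped, i.e. incorrectly nested

ET.fromstring(good)          # parses fine

try:
    ET.fromstring(bad)
except ET.ParseError as e:
    # the parser can only catch this because the closing tag repeats the name
    print("nesting error detected:", e)

The mismatch is only detectable because the closing tag carries the element name again; with </> or implicit closing, the second document would be silently accepted with a different structure.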

As for my RSS feed: the links are correct. The tool you are using mishandles the guid tag, which is just a unique identifier; my blog’s sole responsibility is to make them unique. Many tools use a link for the guid, but that is not required by the spec. In fact my blog is quite explicit about not using the full URL there: isPermaLink="false". Instead it provides the “link” element. Look at the source of my RSS: there is no repeated “en/” in there.

Both Planet and your reader are broken here:

  • Your reader is broken because it assumes that the guid tag is a link. It is a unique identifier, a string, which often is a URL but can be arbitrary. It should use the link element instead (which IS the correct URL); see the sketch after this list.
  • Planet is broken because of the way it constructs the guid field from my blog (granted, it does need a way to do that), especially losing the isPermaLink="false" attribute.
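
Here is a rough sketch of how a reader could handle this correctly (Python standard library; the feed snippet is made up for illustration, not my actual markup):

import xml.etree.ElementTree as ET

item_xml = """
<item>
  <title>Example post</title>
  <link>https://example.org/en/example-post</link>
  <guid isPermaLink="false">post-id-12345</guid>
</item>
"""

item = ET.fromstring(item_xml)
link = item.find("link")
guid = item.find("guid")

# Per the RSS 2.0 spec, guid is only a URL when isPermaLink is not "false"
# (and the attribute defaults to "true" when absent).
if link is not None and link.text:
    url = link.text        # always prefer the link element
elif guid is not None and guid.get("isPermaLink", "true") != "false":
    url = guid.text        # guid explicitly claims to be a permalink
else:
    url = None             # no usable URL in this item

print(url)                 # https://example.org/en/example-post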