Today, we had a conversation about HTML 4 vs XHTML 1.0. For me, the matter was neatly settled the very first time I saw an XML system produce XHTML like this:
An article with an empty emphasis tag.
Perfectly legal XML, perfectly legal XHTML. But if you serve up this XHTML as text/html (which 99.99% of the world does), then you end up with this:
Why? Because it’s parsed as HTML. And the browser sees the start of an em tag, but no close.
And now I make sure that all our sites emit HTML 4. It’s a lot simpler.
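To make the difference concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree module rather than the XML system mentioned above. The sample paragraph is a made-up stand-in for the original markup. It serializes the same tree once with XML rules and once with HTML rules:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: a paragraph that contains an empty <em> element.
root = ET.fromstring("<p>An article with an empty emphasis tag.<em></em></p>")

# XML rules collapse the empty element into a self-closing tag.
as_xml = ET.tostring(root, encoding="unicode", method="xml")

# HTML rules always write an explicit end tag for non-void elements.
as_html = ET.tostring(root, encoding="unicode", method="html")

print(as_xml)   # <p>An article with an empty emphasis tag.<em /></p>
print(as_html)  # <p>An article with an empty emphasis tag.<em></em></p>
```

The first form is the one that goes wrong when a browser parses it as HTML: the trailing slash is ignored and the em element is never closed. The second form is safe under either parser.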
This isn’t to say I don’t use XHTML. It’s a fine medium for further processing (e.g. applying XSLT). But it’s not right for serving up to browsers verbatim.
6 replies on “To XHTML or not to XHTML?”
You can’t serve XHTML as text/html. Whatever you serve as text/html is HTML – malformed HTML that happens to look like XHTML, maybe, but it’s HTML nonetheless. The only way to serve XHTML is to serve it as
@Aristotle Pagaltzis It’s kind of vague for XHTML 1.0, IIRC. This is just an example of why it’s a really bad idea.
No, it’s not vague at all. If you serve it as text/html, then browsers parse it as text/html, no matter what doctype and syntax you may have used.
You might talk about the config and/or code changes you made to switch between XHTML and HTML 4. The header has already been covered (text/html vs. application/xhtml+xml), but presumably you changed other things, like perhaps a dialect choice in XSLT, etc.
Great blog BTW, I seem to keep running across entries when I’m doing Google searches, always a good read.
@Mark The switch is actually worth documenting. This was on a Cocoon site, and we found that simply switching the serializer from xhtml to html wasn’t enough. We also had to add an XSLT transform to the pipeline to strip out the XHTML namespace before serialization. I never did get to the bottom of why this was necessary.
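The transform itself isn't reproduced here. As a rough stand-in for the same idea (strip the XHTML namespace, then serialize as HTML), here is a sketch in Python's standard xml.etree.ElementTree. It is not the Cocoon pipeline or the XSLT we used; the helper name strip_namespace and the sample markup are made up for illustration, and namespaced attributes are not handled.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

def strip_namespace(root):
    """Hypothetical helper: rename {namespace}local tags to local, in place."""
    for elem in root.iter():
        # Namespaced tags are stored as "{uri}local" by ElementTree.
        if isinstance(elem.tag, str) and elem.tag.startswith("{"):
            elem.tag = elem.tag.split("}", 1)[1]
    return root

# Hypothetical sample: an XHTML fragment in the XHTML namespace.
source = '<p xmlns="%s">Some <em></em> text</p>' % XHTML_NS
root = ET.fromstring(source)

# Serialized as HTML with the namespace still attached, the output
# carries namespace declarations that plain HTML has no use for.
before = ET.tostring(root, encoding="unicode", method="html")

# After stripping, the same tree serializes as plain HTML.
after = ET.tostring(strip_namespace(root), encoding="unicode", method="html")

print(after)  # <p>Some <em></em> text</p>
```

Whether this is the same reason Cocoon's html serializer needed the namespace removed is a guess; the sketch only shows that an HTML serializer can leak namespace information unless it is stripped first.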
@Mark Actually, I wasn’t talking about the header. I was talking about XHTML served up as text/html. This caused so many problems that I switched to serving up HTML 4 instead.