To XHTML or not to XHTML?

Today, we had a conversation about HTML 4 vs XHTML 1.0. For me, the matter was neatly settled the very first time I saw an XML system produce XHTML like this:

  <p>An article with an <em/> empty emphasis tag.</p>

Perfectly legal XML, perfectly legal XHTML. But — if you serve up this XHTML as text/html (which 99.99% of the world does), then you end up with this:

[Image: Empty tags considered harmful]

Why? Because it’s parsed as HTML. And the browser sees the start of an em tag, but no close.
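
To make that concrete, here is a rough sketch in Python. It is not how any real browser is implemented; it is a toy parser built on the standard library's html.parser that mimics the one rule that matters here: in text/html, the trailing slash on a non-void tag such as em is ignored. The class name, the VOID set and the emphasized_text helper are made up for this illustration.

```python
from html.parser import HTMLParser

# A partial list of HTML void elements (elements that never take a closing tag).
# It is deliberately incomplete; it only needs to show that "em" is not one of them.
VOID = {"br", "hr", "img", "input", "link", "meta"}


class BrowserLikeParser(HTMLParser):
    """Toy parser that ignores the trailing slash on non-void tags."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of currently open elements
        self.segments = []   # (text, inside_em) pairs in document order

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.open_tags.append(tag)

    def handle_startendtag(self, tag, attrs):
        # In text/html, "<em/>" is treated like "<em>": the slash is dropped.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        # Simplification: close everything back to the nearest matching open tag.
        if tag in self.open_tags:
            while self.open_tags.pop() != tag:
                pass

    def handle_data(self, data):
        self.segments.append((data, "em" in self.open_tags))


def emphasized_text(markup):
    """Return the text runs that end up inside an open em element."""
    parser = BrowserLikeParser()
    parser.feed(markup)
    parser.close()
    return [text for text, in_em in parser.segments if in_em]


print(emphasized_text("<p>An article with an <em/> empty emphasis tag.</p>"))
print(emphasized_text("<p>An article with an <em></em> empty emphasis tag.</p>"))
```

With the XML-style empty tag, the rest of the sentence lands inside the em; with an explicit open and close pair, nothing does. Real browsers typically go further than this sketch and keep the emphasis open beyond the end of the paragraph.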

And now I make sure that all our sites emit HTML 4. It’s a lot simpler.
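
The difference an HTML serializer makes can be seen with a small Python example. This uses the standard library's xml.etree.ElementTree rather than the setup our sites run on, so treat it as an illustration of the idea only; the serialize helper is a name invented for this sketch.

```python
import xml.etree.ElementTree as ET

SOURCE = "<p>An article with an <em/> empty emphasis tag.</p>"


def serialize(markup, method):
    """Parse markup as XML, then write it back out with the given output method."""
    root = ET.fromstring(markup)
    return ET.tostring(root, encoding="unicode", method=method)


# XML output keeps the short empty-element form, which text/html parsing misreads.
print(serialize(SOURCE, "xml"))

# HTML output writes an explicit start and end tag for non-void elements.
print(serialize(SOURCE, "html"))
```

The same tree goes in both times; only the output method changes, and the HTML method writes the empty em with a separate closing tag.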

This isn’t to say I don’t use XHTML. It’s a fine medium for further processing (e.g. applying XSLT). But it’s not right for serving up to browsers verbatim.

6 Comments to To XHTML or not to XHTML?

  1. dom says:

    @Mark Actually, I wasn’t talking about the header. I was talking about XHTML served up as text/html. This is what caused so many problems that I switched to serving up HTML 4 instead.

  2. dom says:

    @Mark The switch itself is worth documenting. This was on a Cocoon site, and we found that simply switching the serializer from xhtml to html wasn’t enough. We also had to add the following XSLT transform to the pipeline to strip out the XHTML namespace before serialization. I never did get to the bottom of why this was necessary.

    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
      <xsl:template match="*">
        <xsl:element name="{local-name()}">
          <xsl:apply-templates select="@*|node()" />
        </xsl:element>
      </xsl:template>
      <xsl:template match="@*">
        <xsl:attribute name="{local-name()}">
          <xsl:value-of select="." />
        </xsl:attribute>
      </xsl:template>
      <xsl:template match="processing-instruction()|comment()">
        <xsl:copy>
          <xsl:apply-templates select="node()" />
        </xsl:copy>
      </xsl:template>
    </xsl:stylesheet>

  3. Mark says:

    You might talk about the config and/or code changes you made to switch between XHTML and HTML 4. The header has already been covered (text/html vs. application/xhtml+xml), but presumably you changed other things, like perhaps a dialect choice in XSLT, etc.

    Great blog BTW, I seem to keep running across entries when I’m doing Google searches, always a good read.

  4. No, it’s not vague at all. If you serve it as text/html, then browsers parse it as text/html, no matter what doctype and syntax you may have used.

  5. dom says:

    @Aristotle Pagaltzis It’s kind of vague for XHTML 1.0, IIRC. This is just an example of why it’s a really bad idea.

  6. You can’t serve XHTML as text/html. Whatever you serve as text/html is HTML – malformed HTML that happens to look like XHTML, maybe, but it’s HTML nonetheless. The only way to serve XHTML is to serve it as application/xhtml+xml.
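
For readers who want to try the namespace-stripping step from comment 2 outside Cocoon, here is a rough Python analogue using the standard library's xml.etree.ElementTree. It is a sketch under assumptions, not a port of the Cocoon pipeline, and it does not explain why Cocoon needed the extra transform. The strip_namespaces helper and the sample markup are invented for this example.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def strip_namespaces(root):
    """Rename every element and attribute to its local name, in place."""
    for elem in root.iter():
        # Namespaced names are stored by ElementTree as "{uri}local".
        if isinstance(elem.tag, str) and elem.tag.startswith("{"):
            elem.tag = elem.tag.split("}", 1)[1]
        for name in list(elem.attrib):
            if name.startswith("{"):
                elem.attrib[name.split("}", 1)[1]] = elem.attrib.pop(name)
    return root


doc = ET.fromstring(f'<p xmlns="{XHTML_NS}">An <em/> empty tag.</p>')

# Serialized as HTML with the namespace intact, the output still carries
# a namespace declaration.
before = ET.tostring(doc, encoding="unicode", method="html")

# After stripping, only plain local names remain.
after = ET.tostring(strip_namespaces(doc), encoding="unicode", method="html")

print(before)
print(after)
```

As with the XSLT in comment 2, the idea is to rebuild each element and attribute under its local name so that no namespace information reaches the HTML serializer.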