Don’t call me DOM

16 June 2009

Validating XHTML Basic 1.1

As I was trying to validate a large number of XHTML MP 1.2 files (the ones in one of the OMA’s XHTML MP test suites, whose welcome page is itself, ironically, not well-formed), I realized that the tool I was using, based on the WDG HTML validator as packaged by Debian, was making network requests when run on these XHTML MP 1.2 files. I then switched to xmllint with the --valid option, but it behaved the same way.

As a member of the W3C Systems team, I’m acutely aware of how bad a practice it is to fetch DTDs over the network over and over again.

Furthermore, the OMA XHTML MP 1.2 DTDs are broken – they include non-UTF-8 characters in the comments of one of the modules (a bug that I reported a while ago but that apparently still hasn’t been fixed).

But given that XHTML MP 1.2 is mostly equivalent to XHTML Basic 1.1, I thought I would validate these files against the XHTML Basic 1.1 DTD instead – while making sure I wouldn’t hit the network when doing so.

Unfortunately, the XHTML Basic 1.1 DTD was not installed in my local XML catalog by default alongside the other DTDs in the w3c-dtd-xhtml package – I’ve filed a bug report in the hope that it will be in the future, along with the XHTML+RDFa DTD.

So I looked into adding the XHTML Basic 1.1 DTD to my local XML catalog, and marking it as an equivalent of the XHTML MP 1.2 DTD at the same time. Given that this wasn’t exactly straightforward, I thought I would document here what I did to set that up on my Ubuntu Jaunty install, in case someone else needs to do something similar:

  • first, I added the following lines to my /etc/xml/catalog file:
    <delegatePublic publicIdStartString="-//W3C//DTD XHTML Basic 1.1" catalog="file:///etc/xml/w3c-dtd-xhtml.xml"/>
    <!-- Making XHTML MP an equivalent of XHTML Basic 1.1 -->
    <delegatePublic publicIdStartString="-//OMA//DTD XHTML Mobile 1.2" catalog="file:///etc/xml/w3c-dtd-xhtml.xml"/>
  • I edited /etc/xml/w3c-dtd-xhtml.xml to add the following lines:
    <delegatePublic publicIdStartString="-//W3C//DTD XHTML Basic 1.1//EN" catalog="file:///usr/share/xml/xhtml/schema/dtd/basic11/catalog.xml"/>
    <!-- Making XHTML MP an equivalent of XHTML Basic 1.1 -->
    <delegatePublic publicIdStartString="-//OMA//DTD XHTML Mobile 1.2//EN" catalog="file:///usr/share/xml/xhtml/schema/dtd/basic11/catalog.xml"/>
    
  • I created the directory /usr/share/xml/xhtml/schema/dtd/basic11/ and put the following files in it (available as a Zip file):
    • the SGML Catalog definition of XHTML Basic 1.1 – since I’m now using xmllint rather than the WDG validator, I’m not sure if it works or is useful as is
    • a modified XML Catalog file that points both the XHTML Basic 1.1 FPI (-//W3C//DTD XHTML Basic 1.1//EN) and the XHTML MP 1.2 FPI (-//OMA//DTD XHTML Mobile 1.2//EN) to the XHTML Basic 1.1 DTD
    • a corrected version of the flat DTD for XHTML Basic 1.1, which includes all the necessary modules in a single file – while it was based on an old version developed by the XHTML Working Group, I had to update it quite a bit to make it actually represent what the XHTML Basic 1.1 spec says
    • I have also included other files that I had found in the corresponding catalog directory for XHTML Basic 1.0, but I think they’re only useful for SGML-based validation, rather than XML-based – they are also in the Zip file, but may not be useful
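For readers who don’t want to download the Zip file, here is a minimal sketch of what a catalog.xml in the basic11 directory could look like. It is not a copy of the actual file: in particular, the DTD file name xhtml-basic11-f.dtd is an assumption made for illustration, and should be replaced with whatever name the flat DTD has on your system.

```xml
<?xml version="1.0"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <!-- The XHTML Basic 1.1 FPI resolves to the local flat DTD.
       The uri is relative to this catalog file; the file name
       xhtml-basic11-f.dtd is hypothetical. -->
  <public publicId="-//W3C//DTD XHTML Basic 1.1//EN"
          uri="xhtml-basic11-f.dtd"/>
  <!-- Making XHTML MP an equivalent of XHTML Basic 1.1:
       the XHTML MP 1.2 FPI resolves to the very same file. -->
  <public publicId="-//OMA//DTD XHTML Mobile 1.2//EN"
          uri="xhtml-basic11-f.dtd"/>
</catalog>
```

The two delegatePublic levels above only route the lookup to this file; it is the public entries here that finally map each FPI to something on disk.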

With these changes, I’m now able to validate my XHTML MP/Basic files without hitting the network. But the main lesson for me remains that it isn’t exactly trivial to add DTDs to a catalog when it isn’t done by people who actually know what they’re doing…
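To check a setup like this one, a small test document can help. The following is only a sketch: the system identifier in the DOCTYPE is the one I believe is commonly used for XHTML MP 1.2, and should be treated as an assumption; what matters for the catalog lookup is the public identifier.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- The public identifier below is what the catalog entries match on;
     the system identifier (the URL) is an assumption. -->
<!DOCTYPE html PUBLIC "-//OMA//DTD XHTML Mobile 1.2//EN"
  "http://www.openmobilealliance.org/tech/DTD/xhtml-mobile12.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Catalog test</title>
  </head>
  <body>
    <p>Hello, mobile Web.</p>
  </body>
</html>
```

Running xmllint with --valid on such a file, together with its --noout and --nonet options, should then tell whether the DTD is really being picked up locally: with --nonet, xmllint refuses to fetch anything over the network, so a missing catalog entry shows up as an error instead of a silent download.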

Dominique Hazaël-Massieux (dom@w3.org) is part of the World Wide Web Consortium (W3C) Staff; his interests cover a number of Web technologies, as well as the usage of open source software in a distributed work environment.