Don’t call me DOM

8 July 2004

GRDDL-izing

While I wish there were a sustained effort behind GRDDL, so that more Semantic Web harvesting tools would support it, I’m still trying to publish a little GRDDL data here and there.

Who’s who at W3C?

The W3C Team is presented in a single page with links to home pages, email addresses, bios, etc. An obvious target for GRDDL-ization!

The first step was to make it XHTML instead of HTML 4.01, since GRDDL is only designed to work with XHTML; that was a good occasion to clean up the markup and use the strict DTD rather than the transitional one.

Then, I looked at the documentation of the experimental FOAF GRDDL-izer (handily available embedded in the XSLT itself), added two classes (foaf-person and foaf-name) and one rel (foaf-home) to the PHP code that generates the page, plus the profile attribute and the relevant link (<link rel="transformation" href="http://www.w3.org/2003/12/rdf-in-xhtml-xslts/grokFOAF.xsl" />).
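To make the additions above concrete, here is a minimal sketch in Python (the actual page is generated by PHP code that is not shown here). It only illustrates where the hooks might go: the choice of elements carrying foaf-person, foaf-name and foaf-home is an assumption, as is the GRDDL profile URI, and the function names are hypothetical. The documentation embedded in grokFOAF.xsl remains the reference for what the transformation really expects.

```python
from xml.sax.saxutils import escape, quoteattr

# Assumed GRDDL profile URI; the post only mentions "the profile attribute".
GRDDL_PROFILE = "http://www.w3.org/2003/g/data-view"
# Transformation URL quoted in the post.
FOAF_XSLT = "http://www.w3.org/2003/12/rdf-in-xhtml-xslts/grokFOAF.xsl"


def person_item(name, homepage):
    """Return one list item carrying the FOAF hooks (element choice is assumed)."""
    return (
        '<li class="foaf-person">'
        f'<a rel="foaf-home" href={quoteattr(homepage)}>'
        f'<span class="foaf-name">{escape(name)}</span></a></li>'
    )


def people_page(people):
    """Build a minimal XHTML page from (name, homepage) pairs."""
    items = "\n".join(person_item(name, home) for name, home in people)
    return (
        '<html xmlns="http://www.w3.org/1999/xhtml">\n'
        f'<head profile={quoteattr(GRDDL_PROFILE)}>\n'
        '<title>People</title>\n'
        f'<link rel="transformation" href={quoteattr(FOAF_XSLT)} />\n'
        '</head>\n'
        f'<body>\n<ul>\n{items}\n</ul>\n</body>\n</html>'
    )


if __name__ == "__main__":
    # Hypothetical person and home page, for illustration only.
    print(people_page([("Jane Doe", "http://example.org/~jane/")]))
```

The output is well-formed XML, which matters here since the transformation is an XSLT style sheet applied to the page.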

Finally, I checked that everything worked as expected in the GRDDL demonstrator, and sure enough, I got back a FOAF description of the W3C Team!

The structure of the W3C People page doesn’t allow me to easily encode the fact that the different persons know each other; it might be that the proposed structure of the FOAF GRDDL-izer isn’t optimal.

The next step would be to create a GRDDL harvester that builds a blogroll by harvesting personal home pages and looking for RSS feeds linked with <link rel="alternate" type="application/rss+xml" /> (which should perhaps be automatically recognized as meaning foaf:weblog).
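The feed-discovery half of such a harvester could look roughly like the sketch below. It is written in Python with only the standard library, takes the HTML of a home page as a string (fetching the page and extracting home page URLs from the FOAF data are left out), and returns the absolute URLs of the RSS feeds it finds. The class and function names are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FeedLinkFinder(HTMLParser):
    """Collect href values of <link rel="alternate" type="application/rss+xml">."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing <link ... /> tags.
        if tag != "link":
            return
        attributes = {key: (value or "") for key, value in attrs}
        rels = attributes.get("rel", "").lower().split()
        mime = attributes.get("type", "").strip().lower()
        href = attributes.get("href", "")
        if "alternate" in rels and mime == "application/rss+xml" and href:
            # Resolve relative feed URLs against the home page URL.
            self.feeds.append(urljoin(self.base_url, href))


def find_feeds(html, base_url):
    """Return the RSS feed URLs advertised by a home page."""
    finder = FeedLinkFinder(base_url)
    finder.feed(html)
    return finder.feeds
```

A blogroll would then just be the collected feed URLs, one entry per home page that advertises a feed.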

Other targets

Other formalized pages that would benefit from being GRDDL-ized:

  • the W3C Mailing List Archives; I’m not sure whether this requires hacking on the underlying software (hypermail), but I suspect it doesn’t; there is a narrow window to target when we regenerate our whole archives, which will probably happen once we integrate the patches Daigo is working on to make them properly encoded
  • the Technical Reports (not the list itself, since it’s already available in RDF); this one is pretty low-hanging fruit, since there is already an XSLT style sheet transforming TR documents into RDF, but the main issue is the social one: how to convince TR editors to include the relevant profile attribute? Getting it into the pubrules is highly unlikely

In fact, both cases would benefit from being GRDDL-ized at a higher level, i.e. by having an HTTP header point to the relevant transformations. HTTP headers are quite painful to standardize, though; too bad RFC 2774 (An HTTP Extension Framework) is still experimental.

2 Responses to “GRDDL-izing”

  1. Dave Raggett Says:

    If we could use a blogging tool for Team 2 minute reports in place of email, we could then harvest these for all kinds of interrelationships. A W3C blogging tool would give us the chance to do smart linking, perhaps exploiting NLP techniques for named entity recognition.

  2. Dominique Hazaël-Massieux Says:

    See also Ivan’s list of Team Contacts and DanC’s rollodex (both Team-only resources)

Dominique Hazaël-Massieux (dom@w3.org) is part of the World Wide Web Consortium (W3C) Staff; his interests cover a number of Web technologies, as well as the usage of open source software in a distributed work environment.