2008-08-26

The Super Information Archipelago

Posted in Philosophy of Markup at 6:56 pm by Liam Quin

I spent some time recently thinking about RDF and the Semantic Web, and about ontologies, and about XML and topic maps and XML schema languages.

The Semantic Web people are doing something (or trying to do something) more global than I recall from the original RDF meetings; they are trying to make a global, shared metadata space. A space where, it sometimes seems, there is neither privacy nor attribution, where anyone can say anything and no-one knows who to believe. That’s more than a little inaccurate, but it seems that way.

A consequence of trying to make a global space is that there are certain kinds of thing you can’t say. Just as in the real, physical world I can’t say that there are no green-headed aliens, or that there are no people with feet instead of hands, since no exhaustive search is possible, so there are things one cannot say in this open-ended semantic world.

Unfortunately, they often seem to be things I want to say, albeit with smaller scope. I don’t want to say there are no colour photographs of Stokesay Castle anywhere in the world, but I know that I don’t have any on my own Web site. I can say, I have some black-and-white pictures of the castle. (Digital purists want me to say that the images are monochrome, but people were talking about black-and-white photography long before computers were invented, and the words continue to work very nicely, thank you.)

In the XML world, we generally have somewhat lower sights. Rather than trying to make single vocabularies that work for everyone, everywhere, always, we give people the tools to build their own vocabularies. There are advantages and disadvantages to this. It is expensive, but the result, like a good suit, is a good fit and looks it. It isn’t an off-the-peg one-size-fits-all suit. (Don’t forget to buy some classy socks to go with it.)

In other words, we in the XML world tend to live on islands, linked by bridges. Or possibly linked by ferries, or sometimes you just swim across.

The biggest peril of this civilisation of heterogeneous annotation is that it is hard for anyone to search across multiple vocabularies. One group of people distinguishes between socks and tights, and another just records items of footwear. Another group records the fabric and size of each item, but not the colour. People record the information they value, or, sometimes, the information that they can imagine other people valuing as well as themselves.

But in an island of metadata, an isolated community, I can certainly make the claim, there are no colour pictures of Stokesay Castle. That’s useful to me, and I hereby make it.

Someone who searches my metadata, one might imagine, has the sense to understand that if I say there are no colour pictures of Stokesay Castle, I do not mean that there are no such things in the entire universe (or metaverse), but that I don’t have any such pictures. You might think that I could say, N is a whole number, and N is the number of such pictures I have, and N has the value zero. On my own island I can indeed say that, because “I have” is implied by the shores of my island. The implicit qualification on the Web is “anyone might have, or have had, or will have, or can imagine”.
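
To make the N-is-zero claim concrete, here is a minimal sketch in Python of counting within the shores of one island. The Picture record, the catalogue entries and the count_pictures helper are all made up for illustration; they are not how any real site stores its metadata.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Picture:
    subject: str
    colour: bool  # True for colour, False for black-and-white


# Hypothetical catalogue for one island (one Web site); the entries are invented.
MY_ISLAND = [
    Picture("Stokesay Castle", colour=False),
    Picture("Stokesay Castle", colour=False),
    Picture("Ludlow Castle", colour=True),
]


def count_pictures(catalogue, subject, colour):
    """Count matching pictures in this catalogue only.

    The answer is scoped to the catalogue passed in, so a result of
    zero means "I have none", not "none exist anywhere".
    """
    return sum(1 for p in catalogue if p.subject == subject and p.colour == colour)


n = count_pictures(MY_ISLAND, "Stokesay Castle", colour=True)
print(n)  # 0
```

The zero is a fact about the list that was searched and nothing more. On the open Web there is no such list to hand to the function, which is why the same claim cannot be made there.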

So, in the XML world, we are an archipelago of weakly interconnected islands. This is not a bad thing, especially if you can swim. We need to make sure we have powerful tools for translating between our various vocabularies when that makes sense, and for searching across them. These are our bridges, our ferries, our ’planes and water-paddles.

We build our bridges with XML interchange, and we paint and repair them with documentation and training, with education and outreach. We cross the bridges (that is, we translate between vocabularies, and search) with XSLT and XQuery, with SQL/XML and other interchange specifications. Those bridges must, in most cases, be built by hand. If your ontology has fortified buildings but does not distinguish between forts and castles, my search for a castle (a fortified habitation) that is not a fort (a fortified garrison) will be meaningless. But guess what? No-one else has solved that problem either.
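
The forts-and-castles problem can be sketched in a few lines. This is Python purely for illustration (in practice the bridge would more likely be an XSLT stylesheet or an XQuery expression), and the vocabularies, the mapping table and the record names are all hypothetical.

```python
# Hand-built bridge from island A's vocabulary to island B's.
# Island A distinguishes castles from forts; island B (hypothetically)
# records only "fortified-building".
A_TO_B = {
    "castle": "fortified-building",
    "fort": "fortified-building",
}


def translate(records, mapping):
    """Return copies of the records with 'kind' mapped; unknown kinds pass through."""
    return [{**r, "kind": mapping.get(r["kind"], r["kind"])} for r in records]


def castles_not_forts(records):
    """Names of records explicitly recorded as castles."""
    return [r["name"] for r in records if r["kind"] == "castle"]


island_a = [
    {"name": "Stokesay Castle", "kind": "castle"},
    {"name": "Example Fort", "kind": "fort"},  # invented entry
]
island_b = translate(island_a, A_TO_B)

print(castles_not_forts(island_a))  # ['Stokesay Castle']
print(castles_not_forts(island_b))  # []
```

Once the records have crossed the bridge the distinction is gone, and no amount of clever querying on island B will bring it back. The mapping table had to be written by hand, too.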

So I live on my island in my archipelago, along with many other people, and sometimes we visit other islands for a time. And the world seems good. Yes, it’s hard to cross to those other islands, but no harder for us than anyone else. Easier, sometimes, because we’ve got pretty good bridges.