For the past two days, I’ve been participating in the SWAD Europe workshop on metadata in a multi-lingual world, mainly because of my work on the W3C Glossary project.
Regarding the W3C Glossary project itself, the main issues I raised concerned the difficulties we stumbled upon when designing our data model for it. Basically, using SKOS as a basis, you end up considering a term as a particular instantiation, in a given language, of a concept (the Ideas in Plato’s world). While this modelling has definite advantages, it doesn’t let us model some other aspects very well:
- you don’t really assert that “élément” is the French translation of “element” in English; rather, you assert that “element” and “élément” convey the same meaning in the two languages; while this is mostly a feature, we lose the direct relationship between the two terms
- the Platonic model hides the classical fact that concepts and languages can’t be that easily separated, since a language reflects a perception of the world (with the classical urban legend about the 27 words describing snow in some Eskimo language)
- in the particular context of W3C, the English glossaries have more weight than the other ones, since they are the only normative ones; this asymmetry should be conveyed in some way, but isn’t
- the fact that a proposed translation has been approved in some way by W3C would need to be modeled too (setting aside the really hard social issues involved in getting to that result); more generally, the provenance of translations, their status, etc. raise issues of reification and n-ary relations in RDF
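To make the points above a bit more concrete, here is a minimal sketch in plain Python of the concept-centric modelling. It is not the actual Glossary data model: the class names, the `example.org` URI, the translator and the status values are all made up for illustration, and the only thing borrowed from SKOS is the idea of attaching one preferred label per language to a concept.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Concept:
    """A language-neutral concept, loosely modelled on skos:Concept."""
    uri: str
    # language tag -> preferred term, similar in spirit to skos:prefLabel
    labels: Dict[str, str] = field(default_factory=dict)


@dataclass(frozen=True)
class TranslationRecord:
    """Hypothetical n-ary statement about one label: who proposed it, and its status."""
    concept_uri: str
    lang: str
    proposed_by: str
    status: str  # assumed vocabulary: "proposed" or "approved"


def translations_of(concepts: List[Concept], term: str, lang: str, target: str) -> List[str]:
    """Derive translations indirectly, by going through the shared concept."""
    return [
        c.labels[target]
        for c in concepts
        if c.labels.get(lang) == term and target in c.labels
    ]


# Hypothetical URI and translator, for illustration only.
element = Concept("http://example.org/glossary#element", {"en": "element", "fr": "élément"})
record = TranslationRecord(element.uri, "fr", "a volunteer translator", "proposed")

print(translations_of([element], "element", "en", "fr"))
print(record.status)
```

Note that nothing in this sketch states that “élément” translates “element”: the link only exists through the shared concept, and English and French are treated symmetrically. The approval status has to live in a separate record, because a plain label has nowhere to carry it; in RDF this is where reification or an n-ary relation would come in.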
Although the workshop didn’t come up with answers to these questions, it certainly helped formalize them and put them in a broader context. Plus, it raised many cool ideas for features to add to the Glossary, and was a good occasion to chat with various people (e.g. Morten) about some of the day-to-day issues of applying the Semantic Web to real-world cases…