Mikael Nilsson announced earlier this week the availability of a document produced by the EC-funded ProLearn project, with the title Harmonization of Metadata Standards, edited by Mikael with contributions from Ambjörn Naeve, Erik Duval, David Massart & myself (though I have to admit my own direct input to this paper was quite limited!).
The document analyses a number of metadata standards and seeks to elucidate the principles and frameworks which underpin those standards, and to highlight that it is the differences and incompatibilities in those principles and frameworks which ultimately create obstacles to the development of systems working across multiple standards. Until we meet the challenge of addressing these contradictions, by "harmonizing" our metadata standards, the effective exchange of metadata instances between systems based on different standards will always be fraught with difficulty.
The paper concludes with a "manifesto" of concrete points of action for the harmonization of metadata standards generally, with specific reference to the case of the IEEE Learning Object Metadata (LOM) standard and Dublin Core, in five areas:
- Identification: The use of URIs as globally scoped identifiers for metadata terms.
- Abstract Models: The synchronisation of standards at the level of their abstract models, rather than through (complex, lossy) mapping between instances of different, often incompatible, abstract models.
- Vocabulary Models: Alignment of the ways "element vocabularies" are described, with a recommendation to use RDF Schema. This is closely related to the previous point, since the type of metadata term to be described is determined by features of the abstract model. (I think I would have liked to see a bit more qualification/elaboration of this point, and more emphasis on the dependency on an RDF-compatible abstract model: the solution isn't, IMHO, as straightforward as producing an RDFS property description corresponding to each "element" of a vocabulary which was constructed for use in the context of a tree-based model. This is my old "hobby horse": a "LOM data element" is quite a different sort of thing from a "Dublin Core element".)
- Application Profile Models: A shared understanding of what constitutes a metadata application profile.
- Metadata formats: Syntaxes must be grounded in the abstract model(s): it is the model which drives the representation in a concrete syntax.
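To make the identification and vocabulary points a little more concrete, here is a minimal sketch of my own (it is not taken from the paper). It uses plain Python tuples as stand-ins for RDF triples, with no RDF library. The RDF, RDFS and Dublin Core term URIs are the real ones; the "LOM" term URI, the `describe_property` and `leaf_paths` helpers, and the heavily simplified LOM-like record are all assumptions invented for illustration.

```python
# Sketch only: tuples of (subject, predicate, object) stand in for RDF triples.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDF_PROPERTY = "http://www.w3.org/1999/02/22-rdf-syntax-ns#Property"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

DC_TITLE = "http://purl.org/dc/terms/title"
# Hypothetical URI, made up for this example; not an official LOM identifier.
LOM_TITLE = "http://example.org/lom-terms/title"


def describe_property(uri, label):
    """Return a minimal RDFS-style description of a property term."""
    return [(uri, RDF_TYPE, RDF_PROPERTY), (uri, RDFS_LABEL, label)]


vocabulary = describe_property(DC_TITLE, "Title") + describe_property(LOM_TITLE, "Title")

# Two terms share the local label "Title", but their URIs keep them distinct.
same_label = sorted(s for s, p, o in vocabulary if p == RDFS_LABEL and o == "Title")


def leaf_paths(tree, prefix=()):
    """Yield (path, value) pairs for every leaf of a nested dict."""
    for key, value in tree.items():
        path = prefix + (key,)
        if isinstance(value, dict):
            yield from leaf_paths(value, path)
        else:
            yield path, value


# A heavily simplified, LOM-like tree: here "title" is a position in a
# hierarchy, not a property relating a resource to a value.
lom_like = {"general": {"title": {"language": "en", "string": "Intro to RDF"}}}
paths = list(leaf_paths(lom_like))

print(same_label)
print(paths)
```

The first half shows what globally scoped identifiers buy us: the same human-readable label can belong to two distinct terms without ambiguity. The second half hints at why minting a property description for each element of a tree-based vocabulary is not enough on its own: the meaning of "title" in the tree depends on where it sits in the path, which is a matter for the abstract model rather than the vocabulary description.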
The paper reprises and refines some of the themes that have been addressed in earlier papers (e.g. a paper at DC-2006 on metadata frameworks and a book chapter written around the same time). I think it provides a nice distillation of those ideas, brings in some of the current context (including the sort of informal, "subjective" metadata surfaced in many "Web 2.0" contexts and Erik's recent work on "attention metadata"), and extends them into guidance for standards developers on some practical steps for action.
The paper ends on an optimistic note, and here I can hear the characteristically resilient and upbeat voice of Mikael, who is always keen to point out to me that the glass I see as half-empty is in fact half-full! :-) It concludes:
Together, these two initiatives [the IEEE LOM/Dublin Core harmonization effort and the Resource Description and Access (RDA) work in the library community], both of which include important contributions from ProLEARN members, demonstrate important progress towards harmonization of several important metadata domains – generic metadata using Dublin Core, educational metadata, and library metadata, as well as a widening from the all-digital domain to the domain of physical artefacts (books).
Harmonizing metadata specifications in the way outlined in this document seems an overwhelming task, but the steady flow of important developments still makes the future seem bright.