
August 20, 2009

A socio-political analysis of the history of Dublin Core terms

No... this isn't one. Sorry about that! But reading Eric Hellman's recent blog post, Can Librarians Be Put Directly Onto the Semantic Web?, made me wonder if such a thing would be interesting (in a "how much metadata can you fit on the head of a pin" kind of way!).

Eric's post discusses whether inverse properties are needed in the Semantic Web, and the change of mindset required in moving from thinking about 'metadata' to thinking about 'ontologies':

In many respects, the most important question for the library world in examining semantic web technologies is whether librarians can successfully transform their expertise in working with metadata into expertise in working with ontologies or models of knowledge. Whereas traditional library metadata has always been focused on helping humans find and make use of information, semantic web ontologies are focused on helping machines find and make use of information. Traditional library metadata is meant to be seen and acted on by humans, and as such has always been an uncomfortable match with relational database technology. Semantic web ontologies, in contrast, are meant to make metadata meaningful and actionable for machines. An ontology is thus a sort of computer program, and the effort of making an RDF schema is the first step of telling a computer how to process a type of information.

I think there's probably some interesting theorising to be done about the history of the Dublin Core metadata properties, in particular about the way they have been named over the years and the way some have explicit inverse properties but others don't.

So, for example, dcterms:creator and dcterms:hasVersion use different naming styles ('creator' rather than 'hasCreator') and dcterms:hasVersion has an explicit inverse (dcterms:isVersionOf) whereas dcterms:creator does not (there is no dcterms:isCreatorOf).
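To make the asymmetry concrete, here is a minimal sketch in Python. It assumes triples are plain (subject, property, object) tuples and that the inverse pairs are kept in a hand-written lookup table. The example.org URIs, the INVERSE_OF table and the add_inverses function are all made up for illustration and are not part of any DCMI specification or library.

```python
# Triples are modelled as plain (subject, property, object) tuples.
# All example.org URIs are hypothetical.
INVERSE_OF = {
    "dcterms:hasVersion": "dcterms:isVersionOf",
    "dcterms:isVersionOf": "dcterms:hasVersion",
    # dcterms:creator deliberately has no entry:
    # there is no dcterms:isCreatorOf to map it to.
}


def add_inverses(triples):
    """Return the triples plus the inverse of each triple whose
    property has an entry in INVERSE_OF."""
    result = set(triples)
    for subject, prop, obj in triples:
        inverse = INVERSE_OF.get(prop)
        if inverse is not None:
            result.add((obj, inverse, subject))
    return result


triples = {
    ("http://example.org/report", "dcterms:hasVersion",
     "http://example.org/report-v2"),
    ("http://example.org/report", "dcterms:creator",
     "http://example.org/people/jane"),
}

for triple in sorted(add_inverses(triples)):
    print(triple)
```

Under these assumptions the hasVersion statement can be read from either end, whereas the creator statement can only be read from the document's side, because nothing in the vocabulary names the relationship from the person's point of view.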

Unfortunately, I don't recall much of the detail of why these changes in attitude to naming occurred over the years. My suspicion is that it has something to do with the way our understanding of 'web' metadata has evolved over time. Two things in particular, I guess. Firstly, there has been a gradual change from understanding properties as 'attributes with string values' (very much the view when dc:creator was invented) to understanding properties as 'the meat between two resources in an RDF triple'. Secondly, there has been a change from thinking first and foremost about 'card catalogues' and/or relational databases to thinking about triple stores. This is perhaps better characterised, as Eric did, as a transition from thinking about metadata as something that is viewed by humans to thinking about it as something that is acted upon by software.
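The two views of a property described above can be sketched as follows. This is only an illustration under my own assumptions: the Literal and Resource classes, the example.org URIs and the 'Jane Doe' name are invented, and foaf:name is used simply as a familiar property for naming a person.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    """A plain string value: the 'attribute with a string value' view."""
    value: str


@dataclass(frozen=True)
class Resource:
    """A thing identified by a URI that can itself be described."""
    uri: str


DOC = Resource("http://example.org/report")        # hypothetical document
JANE = Resource("http://example.org/people/jane")  # hypothetical person

# 'String' style: the creator is just a run of characters.
string_style = {
    (DOC, "dc:creator", Literal("Doe, Jane")),
}

# 'Thing' style: the creator is a resource, so more can be said about it.
thing_style = {
    (DOC, "dcterms:creator", JANE),
    (JANE, "foaf:name", Literal("Jane Doe")),
}


def statements_about(triples, node):
    """Return the triples that have `node` as their subject."""
    return {t for t in triples if t[0] == node}


print(statements_about(string_style, Literal("Doe, Jane")))
print(statements_about(thing_style, JANE))
```

In the string style the graph stops at the creator's name. In the thing style the creator is a node in its own right, which is what makes it possible for software to follow the link and find out more.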

I strongly suspect that both these changes in attitude are very much ongoing (at least in the DC community - possibly elsewhere?).

Note also the difference in naming between dcterms:valid and dcterms:dateCopyrighted (both of which are refinements of dcterms:date). The former emerged at a time when the preferred encoding syntaxes tended to prefix 'valid' with 'DC.date.' to give 'DC.date.valid', whereas the latter emerged at a time when properties were recognised as being stand-alone entities (i.e. after the emergence of Semantic Web thinking).

If nothing else, working with the Dublin Core community over the years has served as a very useful reminder about the challenges 'ordinary' (I don't mean that in any way negatively) people face in understanding what some 'geeks' might perceive to be very simple Semantic Web concepts.  I've lost track of the number of 'strings vs. things' type discussions I've been involved in!  And to an extent, part of the reason for developing the DCMI Abstract Model was to try to bridge the gap between a somewhat 'old-skool' (dare I say, 'traditional librarian'?) view of the world and the Semantic Web view of the world.  Of course, one can argue about whether we succeeded in that aim.


Comments

"a very useful reminder about the challenges 'ordinary' (I don't mean that in any way negatively) people face in understanding what some 'geeks' might perceive to be very simple Semantic Web concepts. "

I'd say that it's also a useful reminder that there may be fewer 'simple Semantic Web concepts' than one thinks, that even reasonable semweb geeks can disagree (while each thinking their own understanding is the obvious 'simple concept'), that the 'best practices' and 'state of the art' are evolving even among the geeks. It's a work in progress.

Yes, good point. I totally agree.

It seems to me that the reason the "isCreatorOf" property is missing is that it would be a property of a person (or corporate entity), rather than of an information resource, and DC has tended to shy away from modeling people. Whereas the inverse properties isVersionOf and hasVersion are both properties of documents.

So you're right - it's evidence of a document-centricity in DC as opposed to a (more desirable) attempt to model the bibliographic domain more broadly.

Conal,
yes, you are probably right - evidence of the 'document-like object' heritage of DC. I was thinking that the "a creator is just represented by a string of characters" mindset also played a part in our thinking at that time but maybe we didn't even get that far!

eFoundations is powered by TypePad