« How do you manage yours? | Main | The rise of green? »

June 19, 2009

Repositories and linked data

Last week there was a message from Steve Hitchcock on the UK jisc-repositories@jiscmail.ac.uk mailing list noting Tim Berners-Lee's comments that "giving people access to the data 'will be paradise'". In response, I made the following suggestion:

If you are going to mention TBL on this list then I guess that you really have to think about how well repositories play in a Web of linked data?

My thoughts... not very well currently!

Linked data has 4 principles:

  • Use URIs as names for things
  • Use HTTP URIs so that people can look up those names.
  • When someone looks up a URI, provide useful information.
  • Include links to other URIs. so that they can discover more things.

Of these, repositories probably do OK at 1 and 2 (though, as I’ve argued before, one might question the coolness of some of the http URIs in use and, I think, the use of cool URIs is implicit in 2).

3, at least according to TBL, really means “provide RDF” (or RDFa embedded into HTML I guess), something that I presume very few repositories do?

Given lack of 3, I guess that 4 is hard to achieve. Even if one was to ignore the lack of RDF or RDFa, the fact that content is typically served as PDF or MS formats probably means that links to other things are reasonably well hidden?

It’d be interesting (academically at least), and probably non-trivial, to think about what a linked data repository would look like? OAI-ORE is a helpful step in the right direction in this regard.

In response, various people noted that there is work in this area: Mark Diggory on work at DSpace, Sally Rumsey (off-list) on the Oxford University Research Archive and parallel data repository (DataBank), and Les Carr on the new JISC dotAC Rapid Innovation project. And I'm sure there is other stuff as well.

In his response, Mark Diggory said:

So the question of "coolness" of URI tends to come in second to ease of implementation and separation of services (concerns) in a repository. Should "Coolness" really be that important? We are trying to work on this issue in DSpace 2.0 as well.

I don't get the comment about "separation of services". Coolness of URIs is about persistence. It's about our long term ability to retain the knowledge that a particular URI identifies a particular thing and to interact with the URI in order to obtain a representation of it. How coolness is implemented is not important, except insofar as it doesn't impact on our long term ability to meet those two aims.

Les Carr also noted the issues around a repository minting URIs "for things it has no authority over (e.g. people's identities) or no knowledge about (e.g. external authors' identities)" suggesting that the "approach of dotAC is to make the repository provide URIs for everything that we consider significant and to allow an external service to worry about mapping our URIs to "official" URIs from various "authorities"". An interesting area.

As I noted above, I think that the work on OAI-ORE is an important step in helping to bring repositories into the world of linked data. That said, there was some interesting discussion on Twitter during the recent OAI6 conference about the value of ORE's aggregation model, given that distinct domains will need to layer their own (different) domain models onto those aggregations in order to do anything useful. My personal take on this is that it probably is useful to have abstracted out the aggregation model but that the hypothesis still to be tested that primitive aggregation is useful despite every domain needing own richer data and, indeed, that we need to see whether the way the ORE model gets applied in the field turns out to be sensible and useful.

TrackBack

TrackBack URL for this entry:
http://www.typepad.com/services/trackback/6a00d8345203ba69e20115711f5d11970b

Listed below are links to weblogs that reference Repositories and linked data:

Comments

The comments to this entry are closed.

About

Search

Loading
eFoundations is powered by TypePad