
June 04, 2007

The Repository Roadmap - are we heading in the right direction?

I've been asked to provide the opening slot at the JISC "Digital repositories: Dealing with the digital deluge" conference in Manchester, starting tomorrow.  My slides are now up on Slideshare.

I'm going to start with a fairly boring overview of the Repositories Roadmap that Rachel Heery and I wrote for the JISC last year (you can have a lie-in if you like!), followed by some discussion of the way that our environment is changing, largely because of Web 2.0.  The intention is to ask whether we need to adjust the roadmap to take account of those changes.

The roadmap is now one year old, and was written to paint a picture up until 2010.  So we are roughly 25% of the way there - a useful opportunity to look back and see how we are doing.  Re-reading the roadmap now, I think we did a pretty good job and there's not much that I would strongly take issue with.

Oddly though, the document makes precious little mention of the Web.  One might argue that there was no need to state the obvious - or that the Web was just a given.  But I'm not so sure.  One of the things I want to argue in the presentation (though I know that this is something that Rachel, my roadmap co-author, strongly disagrees with) is that, from the perspective of consumers, repositories are just Web sites.  Somehow, it almost feels like heresy to say so - I don't know why!  But conceptualising them as such changes the emphasis, I think.  It pushes things like information architecture, the Web architecture, Google, accessibility, usability, URIs and so on to the fore - metadata, OAI and the like seem to become less important.  To me anyway. Perhaps I'm just strange! :-)

These are not straightforward issues, and I don't pretend to have any answers - but that doesn't make the question any less pertinent or interesting.  In fact, I'm very mindful of the tension between the relatively complex, essentially Semantic Web, metadata modeling issues being addressed by activities such as the OAI ORE project and the ePrints Application Profile work, and the relatively simple, tag-based approaches taken by Web 2.0 repository-like applications such as Slideshare and Scribd.

Unfortunately, I lean uncomfortably in both directions!




Hi Andy
I hope the participants will be able to differentiate between your boring overview and the exciting discussion :-)
Your slides are very interesting, and I've added some comments on Slideshare.
Isn't it strange, though, that it seems to be heresy to talk about user-focussed services, observing services that users like to use and learning from others, rather than the top-down approach based on gaining EU support, getting funding bodies to mandate our views and forcing particular technological policies on the user community?
Hope the meeting goes well.


hi Andy,
I don't take any issue with what you write above. All is good re your web-centric perspective; I agree it is the only way to go. But, without trying to be defensive (really), I do not get the point you are making re "OAI becoming less important". OAI (I think you mean OAI-PMH) fits under the Resource Discovery category of this blog, just as RSS and Sitemaps do. Nothing more, nothing less. And I think resource discovery is and remains important, and OAI-PMH offers one approach to allow for batch discovery of resources. Unfortunately, many OAI-PMH harvested records turn out to be of little use to the major search engines. This is not because of some flaw in OAI-PMH (I think), but rather because of ambiguities in unqualified Dublin Core (or in the implementation thereof) regarding referencing actual resources by means of their URIs. As far as I understood from discussions with Google people, this is the major reason why they do not promote (do not read "do not use") OAI-PMH as a way to discover resources. Replace unqualified Dublin Core with some more meaningful resource description approach (I am, for example, thinking of OAI-ORE serializations of named graphs; see Pete's entry), and I think that OAI-PMH still has quite a lot to offer in the realm of resource discovery.
Apart from this, I would also like to agree with Pete when he suggests that the eventual OAI-ORE approach will most likely not be complex. The theory may look complex at this moment, but I think the practice should be relatively simple. I think we agree that simplicity is a major factor when it comes to getting buy-in for interoperability specs. We have learned lessons from the past.
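
As a rough illustration of the unqualified Dublin Core ambiguity mentioned in the comment above, here is a minimal Python sketch. The record, its URLs and the helper names (`identifiers`, `http_candidates`) are all invented for this example and are not taken from any real repository or from the OAI-PMH specification. The record carries several `dc:identifier` values, and nothing in unqualified Dublin Core tells a harvester which one is the full-text resource rather than, say, a landing page.

```python
import xml.etree.ElementTree as ET

# Hypothetical oai_dc metadata record, invented for illustration only.
RECORD = """\
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>An example eprint</dc:title>
  <dc:identifier>http://repository.example.org/123/</dc:identifier>
  <dc:identifier>http://repository.example.org/123/paper.pdf</dc:identifier>
  <dc:identifier>doi:10.9999/example.123</dc:identifier>
</oai_dc:dc>
"""

# Namespace of the Dublin Core element set, in ElementTree's {uri}tag form.
DC = "{http://purl.org/dc/elements/1.1/}"


def identifiers(record_xml):
    """Return every dc:identifier value in the record, in document order."""
    root = ET.fromstring(record_xml)
    return [el.text.strip() for el in root.iter(DC + "identifier") if el.text]


def http_candidates(record_xml):
    """Return the identifiers that look like dereferenceable HTTP(S) URIs."""
    return [i for i in identifiers(record_xml)
            if i.startswith(("http://", "https://"))]


if __name__ == "__main__":
    for uri in http_candidates(RECORD):
        # All a harvester can say is "this might be the resource".
        print("candidate:", uri)
```

In this made-up record two of the three identifiers are HTTP URIs, and a crawler has no metadata-level way to tell the landing page from the PDF; it can only guess, for instance from the file extension.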

Andy, you mention in your post that I would strongly disagree with your statement "that, from the perspective of consumers, repositories are just Web sites." By now you know I am capable of putting up an argument against most things you say... but in this case I would want to clarify my disagreement a little.

In my view (and this is what I have argued in the past) it is significant that institutional repositories are 'well-managed' and for this reason have a level of sustainability and trustworthiness over and above an individual academic's or even a department's Web site. There are a number of actors involved with 'repositories' - depositors of content; searchers and users of content; repository administrators (ultimately the institution). That the repository is 'well-managed' (sustainable, backed up by institutional mandates, trusted) is an important characteristic which should encourage in particular the depositor to populate the repository and the administrator to keep content safe.

To that extent I think the repository is a particular sort of 'Web site': it has an institutional commitment to keep it up-to-date and of high quality.

Over and above that characteristic, I would suggest the manner in which the 'repository' interfaces with both the depositor and the searcher (both of whom might be considered as consumers I think?) can be as much Web 2.0 as you like....

What I think is an issue is the statement on slide 33 of your presentation that repositories 'don't do preservation'. I know this is a contentious topic, but I believe that institutional repositories might be considered at least to do 'medium-term' preservation over, say, a 20-year perspective, if not longer? I think that is why funders and institutions are investing in them, why depositors bother to deposit, etc.

Perhaps there is potential for institutions to push out their repository content to other services that have a more up-to-the-minute Web interface? This would not need to be a long-term commitment and would enable institutions to cater in a more targeted way to their particular 'consumers'.

Of course, as we have both argued in the past, the Eprints Application Profile need not result in metadata that is any more complex for the end user to create. In fact, a relatively complex underlying model might make it less so. We need to stop leaning in different directions and start pulling together the data that is already there, in the document, in institutional systems and IE services, alongside author-created metadata and those simple (and often inadequate!) tag-based approaches. Make the author create less metadata and in the process show them how clever repositories can really be.




eFoundations is powered by TypePad