How uncool? Repository URIs...
I've been using the OpenDOAR API to take a quick look at the coolness of the URIs that people in the UK are assigning to their repositories. Why does coolness matter? Because uncool URIs are more likely to break than cool URIs and broken URIs for the content of academic scholarly repositories will probably cause disruption to the smooth flow of scholarly communication at some point in the future.
So what is an uncool URI? An uncool URI is one that is unlikely to be persistent, typically because the person who first assigned it didn't think hard enough about likely changes in organisational structure, policy or technology and the impact that changes in those areas might have on the persistence of the URI into the future.
In short - URIs don't break... people break them and, usually, the seeds for that breakage are sown at the point that a URI is minted.
OK, so first... hats off to the OpenDOAR team for providing such an easy to use API, one which made it simple to get at the data I was interested in - the URIs of the home pages of all the institutional repositories in the UK - by using the following link:
This provides a list of 107 repositories (as of 23 Feb 2009) as an XML file. Here are just the URIs of the repository home pages, broken out into the first part of the domain name, the institutional part of the domain name, the port, and the rest of the path (as a CSV file for easy loading into a spreadsheet).
In the following analysis, I'm making the assumption that the URI of the repository home page is carried through into the URIs of all the items within that repository and that, if the home page URI is uncool, then the URIs for everything within that repository are likely to be uncool too. This feels like a reasonable assumption to me, though I haven't actually checked it out.
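For anyone wanting to repeat the breakdown, here is a minimal sketch of how a home page URI might be split into those four parts using Python's standard library. The function name and the example.ac.uk URIs are made up for illustration; they are not taken from the OpenDOAR data, and this is not necessarily how the spreadsheet file was actually produced.

```python
from urllib.parse import urlsplit

def split_repository_uri(uri):
    """Split a URI into (first host label, rest of host, port, path)."""
    parts = urlsplit(uri)
    host = parts.hostname or ""
    # Everything before the first dot vs. the institutional remainder
    first_label, _, institution = host.partition(".")
    # urlsplit reports None when no explicit port is given
    port = parts.port if parts.port is not None else ""
    return first_label, institution, port, parts.path

# Hypothetical URIs, for illustration only
for uri in [
    "http://eprints.example.ac.uk/",
    "http://www.example.ac.uk:8080/dspace/index.jsp",
]:
    print(",".join(str(field) for field in split_repository_uri(uri)))
```

Each printed line is one comma-separated row, which is enough for a quick look in a spreadsheet; for anything more serious the csv module would handle quoting properly.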
So... what do we find?
Looking at the first part of the domain name, we find 7 institutions using 'dspace' as part of the repository domain name and 35 using 'eprints'. Both are, presumably, derived from the technology being used as the repository platform. Building this information into the URL is potentially uncool (because that technology might well be changed in the future). Now, in both cases, I suspect that institutions might argue that they would stick with their use of 'eprints' and 'dspace' (particularly the former) even if the underlying technology were to change (on the basis that these terms have become somewhat generic). I'm not totally convinced by that argument, though I think it holds more water in the case of 'eprints' than it does in the case of 'dspace'. In any case, I would argue that this is something definitely worth thinking about.
Note that 10 institutions (with some cross-over between the two counts) have built 'dspace' into the path part of the repository URL, which is uncool for the same reasons.
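The host and path checks above can be sketched roughly as follows. This is only an illustration of the kind of test involved: the function name and the example.ac.uk URIs are hypothetical, and a simple substring match like this is an assumption on my part about what counts as 'built into' the URL.

```python
from urllib.parse import urlsplit

# Platform names discussed in the post
PLATFORM_NAMES = ("dspace", "eprints")

def platform_names_in_uri(uri):
    """Return (names found in the host, names found in the path)."""
    parts = urlsplit(uri)
    host = (parts.hostname or "").lower()
    path = parts.path.lower()
    in_host = [name for name in PLATFORM_NAMES if name in host]
    in_path = [name for name in PLATFORM_NAMES if name in path]
    return in_host, in_path

# Hypothetical URIs, for illustration only
uris = [
    "http://eprints.example.ac.uk/",
    "http://dspace.example.ac.uk/dspace/",
    "http://research.example.ac.uk/",
]
for uri in uris:
    print(uri, platform_names_in_uri(uri))
```

Keeping the host and path results separate makes the cross-over cases (the same name in both places) easy to spot.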
3 institutions have built a non-standard port (i.e. not port 80) into their repository URL. Whilst this isn't necessarily uncool, it does warrant a question as to why it has been done and whether it will cause maintenance headaches into the future.
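A quick way to flag these, sketched under the assumption that 'non-standard' means an explicit port other than the scheme's default (80 for http; I've added 443 for https, which the list above doesn't distinguish). The function name and URIs are hypothetical.

```python
from urllib.parse import urlsplit

# Default ports per scheme; the https entry is an added assumption
DEFAULT_PORTS = {"http": 80, "https": 443}

def has_nonstandard_port(uri):
    """True if the URI carries an explicit port that is not the default."""
    parts = urlsplit(uri)
    if parts.port is None:
        return False  # no explicit port, so the default applies
    return parts.port != DEFAULT_PORTS.get(parts.scheme)

# Hypothetical URIs, for illustration only
print(has_nonstandard_port("http://www.example.ac.uk:8080/dspace/"))
print(has_nonstandard_port("http://eprints.example.ac.uk/"))
```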
Looking at the path part of the URLs, 3 institutions have built the underlying technology (.htm, .php and .aspx) into their URLs - again, this is uncool because of the likelihood of future changes in technology.
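Spotting these is straightforward. Here's a rough sketch that checks the last path segment for one of the three extensions mentioned above; the function name and URIs are hypothetical, and the set of extensions could obviously be extended.

```python
import posixpath
from urllib.parse import urlsplit

# The three extensions mentioned in the post
TECH_EXTENSIONS = {".htm", ".php", ".aspx"}

def technology_extension(uri):
    """Return the technology-revealing extension in the path, or None."""
    path = urlsplit(uri).path
    # URL paths always use forward slashes, hence posixpath
    extension = posixpath.splitext(path)[1].lower()
    return extension if extension in TECH_EXTENSIONS else None

# Hypothetical URIs, for illustration only
print(technology_extension("http://www.example.ac.uk/repository/index.php"))
print(technology_extension("http://research.example.ac.uk/"))
```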
A small number of institutions have built the library into their repository URLs. This is probably OK but reflects a commitment to organisational structure that may not be warranted longer term?
Similarly, a larger number have built a, err..., 'jazzy' project name into their repository URL. I would have thought it might be safer to stick to 'functional' labels like 'research' rather than, say, 'opus', at least for the URLs, since functional labels seem less likely to change because of political or other organisational issues into the future.
Finally, 4 institutions have outsourced their repository to openrepository.com, resulting in URLs under that domain. Outsourcing is good (I say that not least because I work for a charity that is in that business!) but I would strongly suggest outsourcing in a form that keeps your institutional domain name as part of the URL so that your URLs don't break if your host goes bust or you decide to move things back into the institution or to another provider.
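Checking whether a repository URL lives under a third-party domain rather than the institution's own can be sketched like this. The function name and the example URIs are hypothetical; only the openrepository.com domain comes from the data discussed above.

```python
from urllib.parse import urlsplit

def is_under_domain(uri, domain):
    """True if the URI's host is the given domain or a subdomain of it."""
    host = (urlsplit(uri).hostname or "").lower()
    domain = domain.lower()
    # Require a dot boundary so 'notopenrepository.com' does not match
    return host == domain or host.endswith("." + domain)

# Hypothetical URIs, for illustration only
print(is_under_domain("http://example.openrepository.com/example/", "openrepository.com"))
print(is_under_domain("http://repository.example.ac.uk/", "openrepository.com"))
```

The same check, pointed at the institution's own domain instead, would confirm that an outsourced service is still being served under an institutional name.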
Overall then, it's another 'could try harder' mid-term report from me to the Repository 101 course members.