On being niche
I spoke briefly yesterday at a pre-IDCC workshop organised by REPRISE. I'd been asked to talk about Open, social and linked information environments, which resulted in a re-hash of the talk I gave in Trento a while back.
My talk didn't go too well to be honest, partly because I was on last and we were over-running so I felt a little rushed but more because I'd cut the previous set of slides down from 119 to 6 (4 really!) - don't bother looking at the slides, they are just images - which meant that I struggled to deliver a very coherent message. I looked at the most significant environmental changes that have occurred since we first started thinking about the JISC IE almost 10 years ago. The resulting points were largely the same as those I have made previously (listen to the Trento presentation) but with a slightly preservation-related angle:
- the rise of social networks and the read/write Web, and a growth in resident-like behaviour, means that 'digital identity' and the identification of people have become more obviously important and will remain an important component of provenance information for preservation purposes into the future;
- Linked Data (and the URI-based resource-oriented approach that goes with it) is conspicuous by its absence in much of our current digital library thinking;
- scholarly communication is increasingly diffusing across formal and informal services both inside and outside our institutional boundaries (think blogging, Twitter or Google Wave for example) and this has significant implications for preservation strategies.
That's what I thought I was arguing anyway!
I also touched on issues around the growth of the 'open access' agenda, though looking at it now I'm not sure why because that feels like a somewhat orthogonal issue.
Anyway... the middle bullet has to do with being mainstream vs. being niche. (The previous speaker, who gave an interesting talk about MyExperiment and its use of Linked Data, made a similar point). I'm not sure one can really describe Linked Data as being mainstream yet, but one of the things I like about the Web Architecture and REST in particular is that they describe architectural approaches that have proven to be hugely successful, i.e. they describe the Web. Linked Data, it seems to me, builds on these in very helpful ways. I said that digital library developments often prove to be too niche - that they don't have mainstream impact. Another way of putting that is that digital library activities don't spend enough time looking at what is going on in the wider environment. In other contexts, I've argued that "the only good long-term identifier is a good short-term identifier" and I wonder if that principle can and should be applied more widely. If you are doing things on a Web-scale, then the whole Web has an interest in solving any problems - be that around preservation or anything else. If you invent a technical solution that only touches on scholarly communication (for example), who is going to care about it in 50 or 100 years? Answer: not all that many people.
It worries me, for example, when I see an architectural diagram (as was shown yesterday) which has channels labelled 'OAI-PMH', 'XML' and 'the Web'!
After my talk, Chris Rusbridge asked me if we should just get rid of the JISC IE architecture diagram. I responded that I am happy to do so (though I quipped that I'd like there to be an archival copy somewhere). But on the train home I couldn't help but wonder if that misses the point. The diagram is neither here nor there; it's the "service-oriented, we can build it all" mentality that it encapsulates that is the real problem.
Let's throw that out along with the diagram.
I totally agree we (digital libraries) are often too insular, but I'm also starting to notice that part of the reason is that we just ARE niche. In particular, we are different in that we are always looking at the long game (and often the really looooong game, as in 'in perpetuity').
We're about consistency of access. We've watched fads come and go, so we tend to wait until a technology has proven itself before committing to it.
I think the bleeding-edge-rs think "if we run up against an issue, we'll just stick a band aid on it", whereas we're more at the "take a dose of preventative medicine and call me in the morning" end.
This hit me soundly on the head when reading Jeni Tennison's awesome worked example on creating Linked Data - http://www.jenitennison.com/blog/node/136
Personally, I wouldn't stop looking until I found a stable (preferably RESTful) URI to define the UK, but she was quite happy to use a canned search URI (http://statistics.data.gov.uk/id/country?name=England ) and hope for the best. If she found a stable URI later, she'd just bung an owl:sameAs on.
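To make the "bung an owl:sameAs on" step concrete, here is a rough sketch of what that later patch-up might look like as a single N-Triples statement. The canned search URI is the one from the worked example; the stable URI is purely hypothetical (an example.org placeholder), and the helper name is made up for illustration.

```python
# Minimal sketch, standard library only: emit one N-Triples statement that
# links a query-style URI to a stable URI using owl:sameAs.

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def same_as_triple(subject_uri: str, object_uri: str) -> str:
    """Return an N-Triples line asserting both URIs identify the same thing."""
    return f"<{subject_uri}> <{OWL_SAME_AS}> <{object_uri}> ."


# The canned search URI quoted above.
canned = "http://statistics.data.gov.uk/id/country?name=England"

# Hypothetical stable URI, assumed only for this illustration.
stable = "http://example.org/id/country/england"

print(same_as_triple(canned, stable))
```

Anything already pointing at the canned URI keeps working, and the stable URI can be introduced whenever it turns up, which is the band-aid approach in a nutshell.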
It's this seat-of-the-pants practicality that got HTML underway, but we could already see where it was going and that its lack of semantics would be a stumbling block eventually - that was the motivation behind DC and then the SW.
I think LD is great, we've now got some much easier protocols that will give fast takeup, but again, we can see a day when the band aids are going to require a triple bypass.
I'm not sure if it's arrogance or wisdom.
Posted by: Douglas Campbell | December 03, 2009 at 09:04 PM
Hey Douglas, good to hear from you.
Yes, I agree. Libraries/archives/scholars/whatever do have niche requirements... but (and it's a big but - I like big buts and I cannot lie) as scholarly communication slips into the use of more mainstream tools we have to find ways of either bringing those niche requirements and solutions into mainstream practice or acknowledge that our impact is reduced. We (whether that is 'we' as educational institutions or libraries or archivists or whatever) no longer own the space of services that our 'customers' choose to use.
How long have librarians been arguing that libraries have niche requirements for 'search' that aren't met by the likes of Google? How long have ordinary people (yes, that includes researchers) said that Google is actually good enough in many/most cases?
We can argue that niche requirements are important but we have to argue the case well enough to have an impact on mainstream tools - not use it as an excuse to build our own silo'd areas of the Web.
On 'arrogance vs wisdom', I agree with your sentiment. I'm very conscious that I tend to come across in a fairly forthright way, especially when I give this kind of presentation. I can't help myself. But my heart is in the right place and I usually speak from a point of view that says I know less than most of the people I'm talking to or writing for.
I'm always happy to be told I'm talking rubbish. It's a good way of learning!
Posted by: AndyP | December 04, 2009 at 12:46 PM
I agree with the criticism that "digital library activities don't spend enough time looking at what is going on in the wider environment". I think that in terms of information architecture if you look at much of the activity in libraries, it is as if the web never happened.
I would say that there are aspects of 'niche' to what we do - no doubt. But at the moment it is hard to disentangle the real 'niche' stuff from the stuff we just happen to do in a niche way (and this applies not just to the technology, but to broader things - e.g. the historical practices that surround scholarly communication).
Although I agree that 'open access' seems slightly tangential to this, I think there is a link here. It seems to me that one of the things that often leads us away from the mainstream is our need to deal with protected resources (and especially those available through multiple locations) - it seems this is something that is not 'typical' for the web.
Perhaps another difference (and this has only occurred to me as I'm typing so it could be rubbish) is that we attempt to be an intermediary between a user and a resource - this seems atypical to me in terms of most of the rest of the web and user interactions with it?
I'm less convinced that 'looking at the long game' means getting everything right before we go anywhere. I'd argue strongly that doing something now but always being prepared to adjust your approach is the only good longterm strategy. In terms of the example of a stable RESTful URI for England, I'd say that if you really aren't happy with the available URIs then you could always coin one yourself - and further I'd argue this is exactly the type of thing libraries ought to be in a position to do...
Enough for now :)
Posted by: Owen Stephens | December 04, 2009 at 01:39 PM
Douglas' point is well-taken but there are some other problems, too:
(1) Where do librarians get experience with the wider environment (i.e. as software engineers)?
(2) Where do software engineers get experience with cultural heritage needs?
and
(3) If qualified personnel exist how do cultural heritage organizations have
(a) the funds to hire these folks
and
(b) the quality of management to keep them?
Through code4lib I know a number of really talented people in the intersection of libraries (often digital libraries) and programming (one of several technical roles needed to keep digital libraries going).
I'd like to see broader exposure on both sides.
Libraries and information science have a lot to offer in terms of taxonomies and a philosophy of knowledge organization. I do hope the cross-exchange will happen more, but I think a cultural shift might be needed!
Posted by: Jodi Schneider | December 04, 2009 at 06:29 PM
my pet-peeve niche-requirement which needs reconsideration is "fine-grained access control". for some reason, locking things away seems to be the default. whenever i read requirements like "we use flickr as inspiration but will build our own service" my programmer heart sinks - imagining the long list of feature requests inspired by an existing service that'll have to be reimplemented.
and like @owen i think "looking at the long game" may sometimes just mean introducing imaginary requirements. i tend to extend the "the only good long term identifier is a good short term identifier" principle to "long term preservation can be done one day at a time" - cause as we all know: tomorrow never happens.
Posted by: robert forkel | December 07, 2009 at 02:30 PM
Owen, I like your comment that for GLAMs "it is as if the web never happened". But I think Jodi answered it well - there are not enough people able to put a foot in both camps.
To get into the library tech world you need to be pretty clued up on things like MARC, OAI-PMH, SRU, METS, EAD, etc. (which, let's face it, have a pedigree that pre-dates the Web). Usually it takes a personal interest to motivate you into looking at the web way of doing things, and that involves unlearning all that library indoctrination and re-learning to think in terms of REST, RDF, URIs, OAuth, etc. Then on top of that you need to work out how to map/translate from one world to the other.
I try to get as much headspace time as I can in the web world, but there's so much library-world work I have to do that it keeps getting pushed to the back burner. :(
I'm sure Andy's complaint could be aimed at many sectors, not just the GLAM sector. But it's good to have it pointed out.
Robert, I'm not sure if by "locking things away seems to be the default" you mean restrictive re-use statements. If so, this is something we're starting to discuss - http://librarytechnz.natlib.govt.nz/2009/12/look-at-rights-in-cultural-heritage.html (see the comments) and http://librarytechnz.natlib.govt.nz/2009/12/what-is-inappropriate-re-use.html
Posted by: Douglas Campbell | December 11, 2009 at 12:38 PM
Ha, and here's proof GLAMs aren't the only sector failing in the transition due to a shortage of people able to bring the two sides together - Mashable just looked at the need for journalist/programmers in the media industry: http://mashable.com/2009/12/11/programmer-journalists/
Posted by: Douglas Campbell | December 12, 2009 at 10:49 AM
Many institutions limit access to their online information. Making this information available will be an asset to all.
Posted by: Research Paper Help | December 15, 2009 at 10:50 AM