
January 31, 2007

Dublin Core and scholarly publications

Writing the previous entry about Dublin Core and content packaging reminded me that an article by Julie Allinson, Pete Johnston and me about developing a Dublin Core application profile for scholarly works has now been published in Ariadne.

There's nothing here that hasn't previously been available in the project Wiki.  On the other hand, it's very nice to have it all written up in a single document.  The content of the Wiki, being a working resource, can be a little hard to navigate.

I hope that the publication of this article will stimulate some useful discussion, particularly around the use of FRBR as the basis for the model.  I find the use of FRBR in this context compelling, and its combination with the Dublin Core Abstract Model provides a very powerful description framework for bibliographic resources.  I'd be interested in hearing others' views.

Dublin Core as a packaging standard

I've been asked to speak about the DCMI Abstract Model at a Content Packaging for Complex Objects: Technical Workshop being organised by UKOLN in Bristol tomorrow.

I've put my slides up on Slideshare.

Now, you may think that the Dublin Core has nothing to do with content packaging (it's just metadata after all) but I tend to disagree.  As I will try to argue in the presentation, the DCMI Abstract Model provides a highly flexible and extensible framework for bundling together groups of objects, albeit usually 'by-reference'.  I will make the point that packaging standards tend to be very good at capturing the structural relationships between the component parts of compound objects, but that the Abstract Model's firm basis in RDF means that it can capture the richer semantics between the digital, physical and conceptual components that make up the complex objects we tend to find in our digital libraries.

The key point, I guess, is that content packaging is just metadata!

Journal of Information Literacy launched

The Journal of Information Literacy has just been launched...

JIL is an international, peer-reviewed, academic journal that aims to investigate Information Literacy (IL) within a wide range of settings.  Papers on any topic related to the practical, technological or philosophical issues raised by the attempt to increase information literacy throughout society are encouraged. JIL is published in electronic format only and is an open-access title.

I note this partly because Eduserv developed and hosts the Information Literacy Web site, through which this journal is delivered, and partly because it covers what appears to me to be quite an important area.

It'll be interesting to see how it develops over time.

The bids are in...

Our call for proposals closed earlier this week and I've just finished a first (and very brief) pass through all the bids.

We received 88, far more than we were expecting.  It would appear that about 50 are in the area of the teaching and learning opportunities offered by 3-D virtual worlds (most of which relate to Second Life), and about 30 are in the area of new approaches to e-learning based on Web 2.0.  There is also a small smattering of bids around access and identity management.

OK, so now we know which are the sexiest areas! :-)  What else does this tell us?  Well it seems to me that, rightly or wrongly, there is a significant interest within the community in investigating the opportunities afforded by Second Life and its ilk.  Unfortunately, we are only able to fund a small proportion of the bids we have received.

Now we need to set aside about a week to read them all!

January 30, 2007

More ORE

This is just a very brief follow-up to my earlier post to note that the report of the OAI ORE TC meeting held in New York City a couple of weeks ago is now available.

Also recently posted to the OAI ORE Web site are the slides of the presentation that Herbert and Carl gave last week at the Open Repositories 2007 conference in San Antonio, Texas.

As the presentation emphasises, the model described there is a preliminary version, and I'm fairly sure there will be further modifications/enhancements/refinements required as it is tested against some examples and use cases, but I was pleased that the meeting made a commitment to aligning the ORE work firmly with the principles described in the W3C's Architecture of the World Wide Web document.

On a related (well, slightly related!) note, also published in the last few days was the programme for a W3C workshop titled "Web of Services for Enterprise Computing: Can the Web fulfill industry and business requirements?" Although I haven't had time to read them all, the set of position papers (linked from the programme) look like interesting reading. The paper by Noah Mendelsohn on behalf of the W3C Technical Architecture Group walks through a use case from the perspectives of the Web Architecture and of Web Services; and the paper by Nick Gall of Gartner has attracted some attention for its firm conclusion:

It is my position that the W3C should extricate itself from further direct work on SOAP, WSDL, or any other WS-* specifications and redirect its resources into evangelizing and standardizing identifiers, formats, and protocols that exemplify Web architectural principles. This includes educating enterprise application architects how to design "applications" that are "native" web applications.

January 25, 2007

Flickr Machine Tags and API changes

A while ago I wrote about a structured tag convention that I use to add Dublin Core metadata to items in my del.icio.us bookmark collection (and other similar collections). e.g. I use the pair of tags 

dctagged dc:creator=Berners-LeeTim

to indicate that a resource is created by Tim Berners-Lee.
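As a quick illustration, here's a minimal sketch (in Python) of how a tag list following this convention might be decoded.  The function name and the shape of the output are my own invention, of course; real tag handling would need rather more care.

```python
def parse_dc_tags(tags):
    """Extract Dublin Core statements from a tag list that uses the
    'dctagged' convention: a 'dctagged' marker tag plus tags of the
    form prefix:property=value."""
    if "dctagged" not in tags:
        return {}
    statements = {}
    for tag in tags:
        # A tag is a candidate statement if it has an '=' and the
        # part before the '=' looks like a qualified name.
        if "=" in tag and ":" in tag.split("=", 1)[0]:
            prop, value = tag.split("=", 1)
            statements.setdefault(prop, []).append(value)
    return statements

print(parse_dc_tags(["dctagged", "dc:creator=Berners-LeeTim"]))
# {'dc:creator': ['Berners-LeeTim']}
```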

I noted then that this approach had been based on other structured tagging (or perhaps more accurately "triple tagging") conventions, particularly on GeoTagging, which also uses a "qualified name"/"namespacing" convention in its tag structure.

There are limitations, of course, in how much of that "sub-tag" structure I can exploit through the del.icio.us browser interface and APIs. I can browse/search by full tag, but I can't easily do queries like:

  • "find all items for which the value of the dc:creator property is an entity whose name begins with 'Powell'" or
  • "find all items related (by any property) to the entity whose name is 'PowellAndy'"

So I'm very interested to see an announcement yesterday by Flickr (discovered via a post by Danny Ayers) describing an enhancement to the Flickr API which does allow an application to construct queries on such structured tags, which Flickr are referring to as "machine tags". Their list of example queries includes some using a "dc" namespace, i.e. queries that could be applied to a set of items tagged using the conventions I described in my earlier post (though I think you have to re-save existing items in order for Flickr to recognise existing tags that look like "machine tags" and process them as such, if you see what I mean!):

  • Find photos using the 'dc' namespace :
       {"machine_tags" => "dc:"}
  • Find photos with a title in the 'dc' namespace :
       {"machine_tags" => "dc:title="}
  • Find photos titled "mr. camera" in the 'dc' namespace :
        {"machine_tags" => "dc:title=\"mr. camera\""}
  • Find photos whose value is "mr. camera" :
       {"machine_tags" => "*:*=\"mr. camera\""}
  • Find photos that have a title, in any namespace :
       {"machine_tags" => "*:title="}
  • Find photos that have a title, in any namespace, whose value is "mr. camera" :
       {"machine_tags" => "*:title=\"mr. camera\""}
  • Find photos in the 'dc' namespace whose value is "mr. camera" :
       {"machine_tags" => "dc:*=\"mr. camera\""}
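
For the record, here's roughly what such a query might look like as an actual API call.  This is a sketch only: the method and parameter names follow Flickr's flickr.photos.search documentation, but the API key is a placeholder and I haven't tested it against the live service.

```python
from urllib.parse import urlencode

def machine_tag_search_url(query, api_key="YOUR_API_KEY"):
    """Build a Flickr REST request that searches by machine tag
    (flickr.photos.search with the machine_tags parameter)."""
    params = {
        "method": "flickr.photos.search",
        "api_key": api_key,
        "machine_tags": query,
        "format": "json",
    }
    return "https://api.flickr.com/services/rest/?" + urlencode(params)

print(machine_tag_search_url('dc:title="mr. camera"'))
```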

Whereas I had proposed the use of a dctagged tag as an implicit "namespace declaration" (for both the dc and dcterms "namespaces") - much like the geotagged tag in GeoTagging - the Flickr document incorporates an explicit "namespace declaration" convention, borrowing directly the convention used in XML Namespaces, i.e. using tags like xmlns:dc=http://purl.org/dc/elements/1.1/, which would enable different users to map the dc prefix to different "namespace URIs" if they wished to do so.
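
To make the namespace declaration idea concrete, here's a small sketch of how an application might expand machine tags against such xmlns: declaration tags into full property URIs.  The function is hypothetical; Flickr's own normalisation rules are doubtless more involved.

```python
def expand_machine_tags(tags):
    """Resolve machine-tag prefixes against xmlns:-style declaration
    tags, yielding (property URI, value) pairs."""
    # First pass: collect the namespace declarations.
    namespaces = {}
    for tag in tags:
        if tag.startswith("xmlns:") and "=" in tag:
            decl, uri = tag.split("=", 1)
            namespaces[decl[len("xmlns:"):]] = uri
    # Second pass: expand any prefix:property=value tags whose
    # prefix has been declared.
    pairs = []
    for tag in tags:
        if "=" in tag and not tag.startswith("xmlns:"):
            qname, value = tag.split("=", 1)
            if ":" in qname:
                prefix, local = qname.split(":", 1)
                if prefix in namespaces:
                    pairs.append((namespaces[prefix] + local, value))
    return pairs

tags = ["xmlns:dc=http://purl.org/dc/elements/1.1/",
        "dc:creator=Berners-LeeTim"]
print(expand_machine_tags(tags))
# [('http://purl.org/dc/elements/1.1/creator', 'Berners-LeeTim')]
```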

Leaving aside the DC-specific aspect, I think the more general point is captured by Dan Catt of Flickr in a post on the geobloggers weblog:

However, the take away point for me here … [disclaimer: I work for Flickr] … is that a massive site (Flickr) with over 3 billion photos and goodness knows how many tags to go along with that. Have modified their databases, tweaked code and written APIs to allow and encourage developers to use Machine Tags. Which to me, marks a recognition and evolution in Tags and how they can be used.

Flickr already has good support for the creation and use of geotags. It'll be interesting to see what services emerge based on other machine tag sets. Exciting stuff.

January 17, 2007

Prospecting for ORE

At the end of last week, I attended the first meeting of the - deep breath - Open Archives Initiative Object Reuse and Exchange Technical Committee (that's OAI ORE TC from now on!). As you can see from that page, I'm not actually a member of the committee, but Andy wasn't able to attend this time, and asked me to deputise.

ORE is a recently launched project within the OAI, and is funded by the Mellon Foundation, with some additional support from the National Science Foundation. The TC meeting was held at the Butler Library of Columbia University on the Upper West Side of Manhattan, and was chaired/facilitated by Carl Lagoze and Herbert Van de Sompel as co-ordinators of the project.

As I had anticipated, it turned out to be a fairly dense couple of days with a good deal of (occasionally heated, but always well-argued and good-humoured!) debate, deftly chaired by Herbert and Carl. On the first day, they gave a quite in-depth overview of the project, based on a paper that had been circulated to us in advance of the meeting, and then invited us to present some comments in response. I presented a few slides in which I argued for a "resource-oriented" approach to providing the key repository functions that had been outlined. Other members of the committee offered their own perspectives on the problem space.

For the second day of the meeting we focused in on some more specific elements of the activity, considering aspects of scope, the "audiences" for the ORE deliverables, and identifying some use cases that should be developed. In the afternoon we turned our attention to the data model that will form the basis of whatever specifications ORE develops or recommends. It seems to me this is probably one of the key components of the ORE effort, and it became clear during our discussions that one of the challenges is developing a model which is sufficiently specific to support the functions required while remaining sufficiently generic to be useful to a wide range of implementers working with different resource types and their own application-level data models.

Herbert and Carl are working on distilling the results of the meeting and will be presenting them in a plenary session at the Open Repositories 2007 conference in San Antonio, Texas next week.

January 16, 2007

Repositories and OpenID

This is a short post... it should be longer, but I don't have time to flesh out my thoughts properly and I wanted to at least get something down on paper (so to speak).

Pete and I recently met up with Traugott from UKOLN for a quick drink.  Just catching up really, but in our conversation we got round to talking about name authority in the context of scholarly publishing and institutional repositories.  In a rash moment I asked, "Why don't we just use OpenIDs as author identifiers in institutional repositories?".

I'm not sure that Traugott was very impressed... possibly for good reason!  He was particularly concerned about legacy issues for example.  But it seems to me that the idea shouldn't be dismissed totally out of hand.  I've felt for some time now that any centralised approach to name authority is pretty much doomed to failure for all sorts of reasons that I won't go into here.  I've had at the back of my mind that one might be able to build a distributed solution using LDAP, i.e. based on the LDAP servers maintained by institutions.  But it seems to me that using OpenIDs has some significant advantages:

  • Firstly, academics don't consider institutional repositories, or even academic services, to be their sole focus of attention.  They are equally interested in all sorts of Web-based services alongside the stuff delivered by their institution and the more formal academic services that they get access to by virtue of being members of an institution.  Therefore, as I've argued before, any access and identity management solution needs to work across the whole range of services that academics are interested in if it is to be compelling.
  • Secondly, academics have an online life before and after their academic career and they move between institutions during it, so anything that is tied too closely to a particular institution is problematic.

OpenID is nice because it is so distributed, open and flexible.  Academics wouldn't be forced to use the identity given to them by their current institution, though they might choose to for various reasons.  But an external identity offered by a third party would be able to migrate across institutions seamlessly, and so has some significant advantages.

Note that academics would not be forced to use a single OpenID for everything they do if they didn't want to.  For example, they might choose, for privacy reasons, to use different OpenIDs for academic and non-academic services.  But clearly they would have the option of using a single OpenID if they wanted to, and doing so would carry with it significant advantages in terms of single sign-on.  Overall, the flexibility of OpenID, and its decentralised approach, would theoretically leave the end-user far more in control of how their online identity(ies) were used.

OK, so it's still relatively early days in the OpenID story, but let's say that OpenID becomes the normal way in which your average academic identifies him/herself to the range of external Web 2.0 tools they are interested in using (their blog and so on).  Wouldn't it make sense for them to also use an OpenID as an author identifier?  The same OpenID that they use to log into their institutional repository.

Wouldn't that be cool?

January 11, 2007

ALT-C 2007 deadline reminder

I'm on the programme committee for this year's ALT-C, so just doing my duty you understand...

ALT-C 2007: Beyond control, Learning technology for the social network generation, 4-6 September 2007, Nottingham, England.

The online paper submission system for ALT-C 2007 is now open.  Please read the submission guidelines for Research Papers and for Abstracts and download the Research Paper Template if you wish to submit a research paper.

Key deadlines for Research Papers and Abstracts:

  • Last date for Research Papers: 14 February 2007.
  • Last date for Abstracts for short papers, symposia, workshops, demonstrations and posters: 28 February 2007.

UK Schools Minister sets out his vision for educational ICT

Jim Knight, the UK Minister of State for Schools, spoke yesterday at BETT about the way that ICT is set to transform education.  A transcript of his speech is available from the PublicTechnology.net Web site.

It's not a bad speech, in the sense that it is hard to find fault with much of what is said.  If I have one criticism it is that there is not enough focus on how schools are going to prepare themselves strategically for responding to changes in technology over the next 5 to 10 years.  In part that means thinking about the changing global context (not just James Bond!) and what kind of skills are going to be important for people (staff and pupils) over that kind of time frame.

I'd be interested to know how many schools have really sat down and thought strategically about how they will be supporting and delivering learning for children in 5 years' time, what ICT will allow them to do that they don't do now, what kinds of things children will expect and demand in that time frame, what kinds of skills will be valuable, what kind of sustainable ICT infrastructure (not just hardware) they need to be moving towards, what staff skills will be needed and so on.  The talk touches a little bit on how collaboration and assessment are changing the way children are learning - but it seems to me to be more about highlighting individual, and probably somewhat isolated, flashes of excellence rather than a coherent vision of where all schools are going.

In my limited experience, this kind of thinking is very difficult for individual schools to undertake - not least because the teachers at the coalface need time and space to think about the issues and gain experience of what is being made possible by technology.  And time and space in schools are scarce resources.  Schools will need all the help they can get in preparing for these kinds of changes IMHO.

January 09, 2007

When two identifiers for the price of one is not a good deal

I came across a December posting about CrossRef and DOIs on the DigitalKoans blog the other day.  The posting was primarily about the then newly announced CrossRef DOI Finding Tool, but the 'http' URI pedant in me couldn't help but notice something else.

Look at the example search results scattered throughout the posting.  In every case, each DOI is displayed twice, firstly using the 'doi:' form of URI, then using the 'http:' form.

What is that all about?  Why oh why have we lumbered ourselves with a system that forces us to repeat the preferred form of an identifier in a second form that is actually useful?  Why don't we just use the useful form, full stop!?

There is nothing, not one single thing, that is technically weaker or less persistent about using http://dx.doi.org/10.1108/00907320510611357 as an identifier rather than doi:10.1108/00907320510611357.  Quite the opposite in fact.  The 'http:' form is much more likely to persist, since it is so firmly embedded into the fabric of everything we do these days. 
Yet for some reason we seem intent on promoting the 'doi:' form, even though it is next to useless in the Web environment.  As a result, all implementors of DOI-aware software have to hard-code knowledge into their applications to treat the two forms as equivalent.
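
To make the point concrete, the equivalence logic that every DOI-aware application ends up hard-coding is as trivial as this (a sketch, of course):

```python
def doi_to_http(uri):
    """Map the 'doi:' form of a DOI to the equivalent 'http:' form;
    the 'http:' form passes through unchanged."""
    if uri.startswith("doi:"):
        return "http://dx.doi.org/" + uri[len("doi:"):]
    return uri

print(doi_to_http("doi:10.1108/00907320510611357"))
# http://dx.doi.org/10.1108/00907320510611357
```

Trivial, yes - but it's knowledge that has to live in every application, rather than in the fabric of the Web itself.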

Note, this is not a criticism of the DOI per se... just a continued expression of frustration at people's insistence on ignoring the 'http' URI in favour of creating their own completely unnecessary alternatives.

January 07, 2007

Thinking about REST, Part 1: Developing a Resource-Orientation

Prompted partly by some of Andy's earlier posts about ensuring that the applications we (and I guess I'm thinking here of the digital library and e-learning communities, but it applies more broadly too) develop are designed so as to be a good "fit" for the Web, I've been trying to follow up some of the contributions to the wider debates about the nature and design of Web applications.

Really, this post isn't much more than a brief collection of jottings - bookmarks and the occasional quotation - but I hope that the resources I point to may be of use to others, and that the points I highlight here might help to clarify some of the arguments I'll make in other posts.

One of the most helpful resources I've come across recently is a presentation by Stefan Tilkov titled "REST - The Better Web Services Model" (both slides and audio are available). It provides a very clear and comprehensive introduction to the "Representational State Transfer" "architectural style", which is an abstraction defined by Roy Fielding to articulate the key architectural principles underpinning the Web and the HTTP protocol, and to the way features of HTTP implement the REST style. He moves from the general to the specific and discusses how to design a RESTful application, and contrasts this approach with that taken by the "Web Services" family of specifications, concluding on the "provocative" note: "If you only remember one thing, it should be that HTTP is good enough". I found it an excellent presentation and I'd strongly recommend giving it a listen.

The presentation also mentions the notion of REST-based architectures supporting a "resource-oriented approach", and the contrast between a "resource-oriented approach" and a "service-oriented approach" is explored further in a piece by Alex Bunardzic titled "Replacing Service Oriented Architecture with Resource Oriented Architecture". And covering similar ground, Leonard Richardson and Sam Ruby have a note summarising the content of their forthcoming book, RESTful Web Services:

We want to restore the World Wide Web to its rightful place as a respected architecture for distributed programming. We want to shift the focus of "web service" programming from an RPC-style architecture that just happens to use HTTP as a transfer protocol, to a URI-based architecture that uses the technologies of the web to their fullest.

Richardson and Ruby also highlight that "using REST" is not the same thing as "not using SOAP"; some systems that are loosely labelled as "REST-based" do not in fact respect the constraints of the REST architectural style.

Although I enjoy working with some level of abstraction, more often than not, particularly when I'm coming to something new, the light bulb in my head tends to flicker somewhat intermittently until someone shows me an example. Tilkov's presentation also points to a piece by Joe Gregorio, "How to Create a REST Protocol" which works through the design of a real application and highlights the key steps:

  1. decide on the resources required for your application, and select URIs to identify those resources
  2. decide the formats to be used for representations of resources
  3. decide which of the REST methods/verbs are to be supported by each resource i.e. how do the functional requirements of the application map to the uniform interface specified by REST?
  4. decide which response codes are to be returned by each resource
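
To see how those four steps might come together, here's a hypothetical design table for a simple bookmarking application (the resources, URIs and choices are all invented for illustration):

```python
# Step 1: resources and their URIs; step 2: representation formats;
# step 3: supported methods; step 4: response codes on success.
design = {
    "/bookmarks": {
        "formats": ["application/atom+xml"],
        "methods": {"GET": 200, "POST": 201},
    },
    "/bookmarks/{id}": {
        "formats": ["application/atom+xml"],
        "methods": {"GET": 200, "PUT": 200, "DELETE": 204},
    },
}

for uri, spec in design.items():
    print(uri, sorted(spec["methods"]))
```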

Gregorio's article includes a pointer to a brief list of potential pitfalls by Paul Prescod, of which I'll only highlight a few here:

4. Do not put actions in URIs.  This follows naturally from the previous point [about URI opacity]. But a particularly pernicious abuse of URIs is to have query strings like "someuri?action=delete". First, you are using GET to do something unsafe. Second, there is no formal relationship between this "action URI" and the "object" URI. After all your "action=" convention is something specific to your application. REST is about driving as many "application conventions" out of the protocol as possible.

5. Services are seldom resources. In a REST design, a "stock quote service" is not very interesting. In a REST design you would instead have "stock" resources and a service would just be an index of stock resources.

8. Do not worry about protocol independence. There exists only one protocol which supports the proper resource manipulation semantics. If another one arises in the future, it will be easy to keep your same design and merely support the alternate protocol's interface. On the other hand, what people usually mean by "protocol independence" is to abandon resource modelling and therefore abandon both REST and the Web.

OK, that's it for now. More to come as I think a few things through, I promise!

January 05, 2007

Plagiarism awareness and information skills for teachers

Two interesting short reports have been made available by Netskills at the University of Newcastle, following a couple of projects funded by the Eduserv Foundation under our Information Literacy Initiatives.

The projects developed pilot workshop programmes related to information literacy in the schools sector, specifically covering plagiarism and information skills and targeted at teachers.  The reports describe the workshops that the Netskills team developed and some of the issues they encountered in trying to deliver them to school teachers.

It is perhaps unfortunate that the small scale of the programmes delivered through this work makes it difficult to draw very firm conclusions.  Nonetheless, both reports close with recommendations for further activities, and these seem to make a lot of sense to me.  There is clearly a lot of scope for further work in these areas.

As the reports say at the end:

...the programme has shown that there is a clear interest and pressing need for training, awareness-raising and debate about plagiarism in the schools sector.


...the project has identified a need for a more strategic approach to the staff development of teachers in terms of their information skills, but due to the existing pressures on teachers' time, the value of this development must be clearly demonstrated.

Second Life - numbers vs. utility

The debate about Second Life numbers and the press' desire to believe the Linden Lab hype continues and has been widely discussed and reported (e.g. here, here, here and here).

The debate is quite interesting, not least in documenting the breadth of attitudes around SL, since it encompasses those who think that SL is the best thing since sliced bread and those who think it is the worst thing since sliced bread :-).  However, it seems to me to be largely the wrong debate to be having, at least from the point of view of the education community.

The debate we should be having is about whether SL is able to serve any useful educational function.

And I have to say, having spent a little time in SL over the last few months, I'm somewhat skeptical.  Not that I dislike SL, far from it.  I'm just struggling to see how it will be used in the near future to support real learning activities in any meaningful, large-scale way.  The technology required is too advanced (in hardware terms) for many users.  The ability to dynamically embed external content is too clunky.  The technology is too closed.  Even the one-to-one and group communication/collaboration aspects of SL, the primary area where there does seem to be real potential, are hampered by being chat-only.

Don't get me wrong, there is real potential in SL and similar environments.  And my skepticism about its educational usefulness is tempered by an acknowledgement that I don't have much imagination!  The fact that I can't see how SL will be used in the context of learning probably just means that I'm not clever enough to think of how to do it.  That's why I'm happy to see a small amount of Eduserv Foundation money going into research in this area.

On a related note, a recent post on the DigitalLibrarian blog questions whether librarians are right to invest time in SL right now, given other demands on their time.  I don't know the answer to that question - though I must admit that I have a lot of sympathy with the sentiments expressed in the article.  It seems to me that any investment in SL at the moment is a risk, though one with potentially valuable outcomes, and needs to be seen as such from the outset.  Whatever the current technology is capable of, things will undoubtedly change pretty rapidly, whether from within Linden Lab or from outside.  The things we learn from SL right now, notably in the area of online collaboration, will hopefully be useful more broadly as we move forward (i.e. the value should be independent of the success of SL itself).

But what SL (or its successors/competitors) desperately needs, as soon as possible IMHO, in order that we can better learn from it, is:

  • mashable (i.e., simple, flexible and dynamic) content integration (a la Web 2.0),
  • proper VoIP integration,
  • open standards-based interfaces and
  • support for large group events (without having to replicate events across multiple islands).

One assumes that these kinds of features are coming...  but I'm not sure when.

As an aside, I did my first presentation in front of an audience in SL yesterday.  I was asked to talk a little bit about the current grants call to the UK Educators group - a small-scale virtual equivalent of a JISC townhall meeting, if you like. Quite a few people turned up and I enjoyed it - though I was at least as nervous at the start as when I give a f2f presentation :-).  Hey, it's not often that you go to a work-related meeting at which a dalek turns up!

I had read Jeff Barr's guide to giving effective presentations in SL before I went, though I have to confess that I failed to use gestures as much as I should.  The question and answer session was pretty hectic and it was useful to have Pete Johnston also in-world to help keep track of which questions I'd answered and which ones I'd missed.  Overall, I think it went OK.

January 02, 2007


Getting on for a year ago the Eduserv Foundation ran a symposium entitled I before E: identity management, e-portfolios and personalised learning.  The day was based on the premise that some form of identity management is required for e-portfolio services to be rolled out.  The day was quite successful, at least insofar as it allowed a lot of discussion to take place, though one might argue that it wasn't particularly conclusive.

Over the Christmas period an interesting debate has broken out on the CETIS ePortfolio mailing lists around what constitutes an e-portfolio system, and indeed what constitutes an e-portfolio, though I have to confess that I'm not sure I'm any the wiser as a result of the discussions.  The more I see debates about whether an e-portfolio is a representation of (part of) our identities, the more confused I get.  To my somewhat simplistic way of thinking, an e-portfolio is just a portfolio that happens to be electronic.  (Note: if I said this during an episode of QI I'm sure the alarm bells would be ringing and I'd be on about minus 200 points! :-) ).

As an aside, Scott Wilson has produced a useful looking set of categories of personal information and identity but he leaves open the question of which of them might reasonably form part of an e-portfolio, so that doesn't really help me.

It seems to me that Adam Cooper asks the most pertinent question on the list:

... what is it that learners and teachers might want to do using something that might be described as a portfolio?

As someone who occasionally sits on interview panels for school teachers, I have been on the receiving end of a physical portfolio of a prospective teacher's work - essentially given by the candidate as evidence that some learning or other activity has taken place.  But it doesn't strike me that a portfolio is really about identity as such - or at least not as I understand it.

Identity is important of course, as is tying any given portfolio to the appropriate identity - which is just as much a problem in the real world as in the digital world.  When someone hands over their physical or e-portfolio, how do you know that it is their work - a problem that is apparently besetting the GCSE examination system in the UK, at least as it is currently instantiated.

It seems to me then that a portfolio is really just evidence.  And I have naively been assuming that an e-portfolio was the same, but electronic.


