
March 31, 2007

What students do...

Martin Poulter summarises a presentation by the guys who run the student residential network at the University of Bristol in a post on the (fairly new?) Ancient Geeks blog.  Worth a quick read for an overview of what students at Bristol do with their Internet connections.  No major surprises here... though I confess that I hadn't heard of either of the first two video streaming services (which simply confirms that I'm pretty out of touch with this kind of thing I guess).

Surveyed about what they want in future from the university network, students say they want video podcasts, or failing that audio, of their lecturers. They don’t want less personal contact with teaching staff, but they want to be able to catch up with lectures on a video iPod on the train. They also want ubiquity: they expect high quality access, wirelessly, everywhere. Having been brought up with Google and Amazon, they have very high standards of ease of use.

Mashup* on Identity 2.0

I note that the next meeting of mashup* will focus on Identity 2.0 (OpenID, Cardspace and others) and features an interesting set of presenters (though there doesn't seem to be a great deal of time on the programme for them?).

Anyway, looks good if you are in London on the 24th April.

JISC, Scribd and scholarly repositories

Tony Hirst asks "why doesn't JISC fund the equivalent of Scribd for the academic community?" in a post on the OUseful blog, to which one is tempted to reply, "why would they, when such things already exist out on the Web?".

Of course, in reality there are good reasons why they might, partly because of the specific requirements of scholarly documents (as opposed to just any old documents) and partly because of assurances about persistence of services, quality assurance, and so on.

I'm minded to ask a different question.  One that I've asked before on a number of occasions, not least in the context of the current ORE project, which is "why don't scholarly repositories look more like Scribd?".  Why do we continue to develop and use digital-library-specific solutions, rather than simply making sure that our repositories integrate tightly with the main fabric of the Web (read Web 2.0)?

What does that mean?  Essentially it means assigning 'http' URIs to everything of interest, using the HTTP protocol and content negotiation to serve appropriate representations of that stuff, using sitemaps to steer crawlers to the important information, and using JSON to surface stuff flexibly in other places.
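To make the content negotiation point a little more concrete, here's a minimal sketch in Python of how a repository might pick a representation from an HTTP Accept header.  The media types and the matching logic are purely illustrative assumptions, not a description of any particular repository software.

```python
# Minimal content-negotiation sketch: given an HTTP Accept header,
# pick the best available representation of a repository item.
# (Illustrative only: the media types and matching behaviour here
# are assumptions, not any specific repository's implementation.)

def parse_accept(header):
    """Parse an Accept header into a list of (media_type, q) pairs."""
    prefs = []
    for part in header.split(","):
        bits = part.strip().split(";")
        media_type = bits[0].strip()
        q = 1.0
        for param in bits[1:]:
            name, _, value = param.strip().partition("=")
            if name == "q":
                q = float(value)
        prefs.append((media_type, q))
    return prefs

def pick_representation(accept_header, available):
    """Return the available media type the client most prefers."""
    best, best_q = None, -1.0
    for media_type, q in parse_accept(accept_header):
        for candidate in available:
            matches = media_type in ("*/*", candidate) or (
                media_type.endswith("/*")
                and candidate.startswith(media_type[:-1])
            )
            if matches and q > best_q:
                best, best_q = candidate, q
    return best

available = ["text/html", "application/rdf+xml", "application/json"]
print(pick_representation("application/json;q=0.9, text/html;q=0.5", available))
# -> application/json
```

The same URI then serves an HTML splash page to a browser, RDF to a harvester and JSON to a mashup, which is exactly the "one URI per thing, many representations" pattern the Web 2.0 services use.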

By the way, Tony also asks whether there is any sort of cross-search of UK repositories available, to which the answer is that JISC are funding Intute to develop such a thing (a development of the previous ePrints UK project I think).  And there are the global equivalents such as OAIster.

March 23, 2007

DCMI meetings in Barcelona

Pete and I spent the tail end of last week in Barcelona for a couple of Dublin Core meetings.

On the Thursday, we got together with Tom Baker and Mikael Nilsson to brainstorm application profiles and specifically to think about a UML model for the machine-readable part of an application profile - what we've come to term a Description Set Profile.  More of that later, since although we made a lot of progress during the meeting there is still quite a way to go before we have something ready for general consumption.

On the Friday and Saturday, there was a meeting of the DC Usage Board.  Overall, we had a pretty good meeting and we got through a lot of decision making, helped in part, for me at least, by knowing that the hotel we were staying at had a nice hot tub, pool and sauna to go back to after the meetings!

The first item on the UB agenda was the literal value vs. non-literal value issue that has been raised on the DC-Architecture mailing list during the recent comment period on the domains and ranges proposal.  The issue, briefly, is that it is somewhat problematic in OWL-DL applications for us to define the range of a DC property as being something that can be both a literal and a non-literal - which was exactly what we proposed doing for the new dcterms:title, dcterms:description and dcterms:date properties.

Clearly, one question we need to think about is whether OWL-DL compatibility is important to the DCMI community.  But let's assume for a moment that it is.  If so, then we need to decide one way or another whether the values of the above properties are literals (strings) or non-literals.

Now... readers in the western world are probably thinking to themselves, how can a title be anything other than a literal string? They might even be thinking that treating titles as anything other than literals is so non-intuitive that to do so would be madness.  But interestingly it became very clear during the meeting that for languages that have more than one written form (such as Japanese) treating the title as a non-literal resource, off of which you can hang the multiple written forms, is the only approach that really makes sense.
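A minimal sketch may help here.  Below, the two modellings are laid out as plain Python tuples standing in for triples (the URIs and the "_:t" blank-node convention are my own illustrative assumptions, not DCMI syntax): in the literal approach the title is just a string, while in the non-literal approach the title is a node off which the multiple written forms are hung.

```python
# Two ways of modelling a title, sketched as (subject, predicate,
# object) tuples. Purely illustrative: the example.org URIs and the
# "_:t" blank-node label are assumptions, not DCMI-specified syntax.

DOC = "http://example.org/doc/1"
TITLE = "http://purl.org/dc/terms/title"

# 1. Title as a literal: the value is simply a string.
literal_triples = [
    (DOC, TITLE, "吾輩は猫である"),
]

# 2. Title as a non-literal resource: the value is a node ("_:t")
#    off which the multiple written forms of the title are hung.
non_literal_triples = [
    (DOC, TITLE, "_:t"),
    ("_:t", "http://example.org/writtenForm", "吾輩は猫である"),        # kanji/kana
    ("_:t", "http://example.org/writtenForm", "わがはいはねこである"),  # hiragana
    ("_:t", "http://example.org/writtenForm", "Wagahai wa neko de aru"),  # romanised
]

# The literal approach allows exactly one string per statement; the
# non-literal approach lets one value node carry all three forms.
forms = [o for s, p, o in non_literal_triples if s == "_:t"]
print(len(forms))  # -> 3
```

With the literal approach, the only way to record the three written forms is three separate title statements, with nothing to say they are forms of the *same* title, which is precisely the problem the non-literal modelling solves.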

This is a tricky situation for DCMI, since we are torn between doing things in an intuitive way, at least from the perspective of a large part of the planet, and doing things in a generic enough way to handle all written languages.

On balance, the UB decided to go with the literal approach, but to go back out for another comment period specifically asking implementers in regions with languages that have multiple written forms to consider how they would deal with dcterms:title values being simple literal strings.

One of the other issues that has been taxing the UB of late is trying to categorise the existing list of DCMI-endorsed encoding schemes into syntax encoding schemes and vocabulary encoding schemes.  One might assume that this would be a trivial exercise.  Unfortunately not!  Part of the problem is that the labels we use for our two kinds of encoding schemes - labels that were essentially given to us by the history of DCMI - are no longer very appropriate in the context of the DCMI Abstract Model.

In short, what the Abstract Model tells us is that syntax encoding schemes pertain to value strings, essentially they provide the RDF datatype of the string.  Vocabulary encoding schemes, on the other hand, tell us the set of things of which the value is a member.  At the risk of being somewhat simplistic, syntax encoding schemes define a set of strings, while vocabulary encoding schemes define a set of things.

So even where something smells and tastes just like a vocabulary (as that word is used in common parlance), a good example being the list of language tags defined by RFC 1766, if it defines a set of strings then it is categorised as a syntax encoding scheme rather than a vocabulary encoding scheme.
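The distinction can be sketched in plain Python (the data structures and field names below are hypothetical, not a DCMI API): a syntax encoding scheme qualifies a value *string*, much like an RDF datatype, while a vocabulary encoding scheme names the set of *things* of which the value is a member.

```python
# Syntax vs vocabulary encoding schemes, sketched with hypothetical
# data structures (not a DCMI API). A syntax encoding scheme (SES)
# types a value string; a vocabulary encoding scheme (VES) names the
# set of things from which the value is drawn.

# SES: the value surrogate is a string, typed by the scheme.
statement_with_ses = {
    "property": "http://purl.org/dc/terms/language",
    "value_string": "en-GB",
    "syntax_encoding_scheme": "RFC 1766",  # defines a set of strings
}

# VES: the value surrogate is a thing, drawn from the scheme's set.
statement_with_ves = {
    "property": "http://purl.org/dc/terms/subject",
    "value_uri": "http://example.org/lcsh/Metadata",  # hypothetical URI
    "vocabulary_encoding_scheme": "LCSH",  # defines a set of concepts
}

def scheme_kind(statement):
    """Categorise: does the scheme define a set of strings or of things?"""
    if "syntax_encoding_scheme" in statement:
        return "set of strings"
    if "vocabulary_encoding_scheme" in statement:
        return "set of things"
    return "none"

print(scheme_kind(statement_with_ses))  # -> set of strings
print(scheme_kind(statement_with_ves))  # -> set of things
```

On this view the RFC 1766 language tags fall out as an SES because "en-GB" is a string conforming to the scheme, whereas an LCSH heading is a concept that merely happens to have a preferred written label.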

It takes a while to get your head round this, but once done, everything becomes clearer.

Coincidentally, later in the meeting we came to look at the definitions for all our encoding schemes.  These are somewhat problematic at the moment, because the current definitions simply provide the expanded form of the encoding scheme name - hardly something that can be counted as a good definition!

As we went through the list it suddenly became clear... if we define encoding schemes to be along the lines of either "The set of concepts defined by ..." or "The set of strings defined by ..." then DCMI's interpretation of other people's systems as being a syntax encoding scheme or a vocabulary encoding scheme would become much easier to grasp.  At least that's the theory!  So, for example, the issue of whether LCSH defines a set of strings or a set of concepts becomes a moot point.  DCMI interprets the dcterms:LCSH encoding scheme to be the set of concepts defined by the Library of Congress Subject Headings.  I.e. we define it to be a vocabulary encoding scheme.

Image: Restaurant Litoral on the Barcelona sea-front.  No, we didn't eat there and yes, I know it's a crap pun!  Sorry.

March 20, 2007

Web 2.0 usage survey

There's a blog entry entitled Some real data on Web 2.0 use by Dave White on the TALL blog.  Looks like an interesting survey of student and staff attitudes and habits around Web 2.0 services.

Like any survey, the devil is in the detail of what was asked, though the report does touch on this to some extent.

Difficult to know what conclusions to draw.  I haven't spent very long looking at it.  Usage of Second Life is pretty low, but then one would expect that I guess.  Usage of del.icio.us is lower than I would have thought, but perhaps that's just from my perspective as someone that just about manages to use it regularly.  The proportion of people reading and writing blogs for work or study purposes is pretty low.  None of these things is particularly surprising - which is not to say that it isn't useful to see them written down in a study like this.

March 18, 2007

JISC Conference 2007

I spent Tuesday at the JISC Conference in Birmingham, on balance quite a pleasant day and certainly an excellent opportunity for networking and meeting up with old friends ('old' as in 'long term' you understand!).

I went to an hour-long session about the JISC e-Framework, SOA and Enterprise Architecture in the morning.  I have to say that I was somewhat disappointed by the lack of any mention of Web 2.0.  Err... hello!  I was also disappointed by the 30-minute presentation about the Open Group which, I'm afraid to say, struck me as rather inappropriate.

I presented in the afternoon alongside Ed Zedlewski, as part of the Eduserv special session.  Between us we tried to cover the way in which the access and identity management landscape is changing and how Eduserv is responding to that with the new OpenAthens product.  I've put our slides up on SlideShare for those that are interested.

The general thrust of my bit was that end-user needs will push us down a user-centric identity management road, the way academic collaborations are going (for both learning and research) means that institutions will need to operate across multiple access management federations and the technology around this area is in a constant state of change.  All of this, I argued, means that institutions would do well to consider outsourced access and identity management solutions rather than developing solutions in-house.  There are, of course, also good reasons for doing stuff in-house, so there's a certain amount of horses for courses here - but outsourcing should definitely be on the agenda for discussion.

The day ended with an excellent closing keynote by Tom Loosemore from the BBC.  Tom presented 15 key principles of Web design, a relatively simple concept but one that was ultimately quite powerful. Tom's principles were very much in the spirit of Web 2.0 and just the kinds of things that Brian Kelly and others have been banging on about for ages, but it was nice to hear the same messages coming from outside the community.

The 15 principles were as follows:

  1. focus on the needs of the end-user
  2. less is more
  3. do not attempt to do everything yourself
  4. fall forwards fast: try lots of things, kill failures quickly
  5. treat the entire Web as a creative canvas
  6. the Web is a conversation: join as a peer and admit mistakes when necessary
  7. any Web site is only as good as its worst page
  8. make sure that all your content can be linked to forever: the link is the heart of the Web
  9. remember that your granny won't ever use Second Life
  10. maximise routes to content: optimise to rank high in Google
  11. consistent design and navigation doesn't mean that one size fits all
  12. accessibility is not an optional extra
  13. let people paste your content on the walls of their virtual homes
  14. link to discussions on the Web, don't host them
  15. personalisation should be unobtrusive, elegant and transparent

Apologies to Tom if I've mis-quoted any of these.  Each was illustrated with some nice case studies taken from the BBC and elsewhere.

If I disagree with anything it's with the ordering.  As you might expect from previous postings to this blog, I'd put number 8 much higher up the list.  But overall, I think these are a good set of principles that people would do well to take note of.

15 principles too many for you?  Try 5...  Web sites should be:

  • straightforward
  • functional
  • gregarious
  • open
  • evolving.

March 14, 2007

Virtual worlds, real learning? - update

Some of you may have noticed that we have now formally announced this year's Eduserv Foundation symposium.  A programme for the day is now available (though it should be noted that the titles of talks are indicative rather than actual).  As I said in my last post, I think we have a very exciting line-up of speakers including Jim Purbrick from Linden Lab, Roo Reynolds from IBM, Hamish MacLeod from Edinburgh University, Joanna Scott from Nature Publishing Group, Gilly Salmon from the University of Leicester and Stephen Downes from the National Research Council Canada.

The day will end with a panel session that includes all the speakers and Sara de Freitas from the London Knowledge Lab (who recently authored the Learning in Immersive Worlds study for the JISC).

There is already a lot of interest in attending this event.  If you want to come but haven't yet signed up, then I suggest that you do so soon.

March 09, 2007

Eduserv 'bubble' in IWR


I'm trying to limit my postings about Second Life to my other blog but just wanted to note that the Eduserv MeetingPod (a 'floating bubble'!) got a mention in a feature about SL in the current issue of Information World Review (March 2007), also available on the IT Week site.

March 08, 2007

Virtual worlds, real learning?

We're very close to being able to announce the programme for this year's Eduserv Foundation Symposium, "Virtual worlds, real learning?", to be held at the Congress Centre in London on the 10th May.  I'm very excited about the speakers we have lined up... but more of that in due course.

We'll announce the programme early next week, at which point a booking form and so on will be available.

March 02, 2007

What is a Dublin Core Application Profile, really?

The notion that metadata standards are tailored by their implementers to meet the requirements of some particular context and that this contextualisation might be captured in the form of a "profile" was one of the ideas explored within the European Commission-funded DESIRE project back in 1998-2000. It was brought to a broader audience through the widely-cited Ariadne article "Application profiles: mixing and matching metadata schemas" by Rachel Heery and Manjula Patel of UKOLN. For several years now, the Dublin Core Metadata Initiative and DC implementers have used the notion of the DC Application Profile (DCAP) - though so far DCMI hasn't really developed a precise statement of what a DC Application Profile really is! One of the items on the "roadmap" of development work for the DCMI Architecture Forum and the DCMI Usage Board presented at the DC-2006 conference seeks to address this question with the development of a model for a DCAP.

Despite this absence of a formal definition, many DC implementers (and indeed groups within DCMI) have developed specifications recognised as "application profiles". They typically take the form of human-readable documents providing annotated lists of named metadata terms (or permutations of terms) used in DC metadata so as to meet some shared set of goals or requirements within a defined context (e.g. a particular application, some specific set of systems that are exchanging metadata, or some broader domain or community). Further, several initiatives have developed software tools that work with the concept of the DCAP. These include metadata registries that include application profiles as one of the types of resource about which they capture and disclose information; "profile authoring" tools which allow a metadata designer to create a description of their application profile; and "instance authoring" tools which use that description of a profile, e.g. to configure a form for the editing of metadata instances.

The designers/developers of these various tools have had to provide some answer to that question of "what a DC Application Profile is": they have had to develop a model for a DCAP - and then choose a means of representing instances of that model in a machine-processable form. The problem has been, of course, that they have each chosen (at least slightly) different models - and in some cases the same designers/developers used different models over time. I can say this because I was one of them! In my previous life at UKOLN, I contributed to a number of projects which sought to develop prototype metadata registries, building in part on the work of the DESIRE project. One of the central concepts we used was the notion of a DCAP as a set of what in the model for the JISC IE Metadata Schema Registry we called "property usages" (a rather less than elegant compound noun, but the best we could come up with at the time!) - a description of how a named property was deployed or "used" in DC metadata in some application.

Most of this work pre-dated the development, or at least the finalisation, of the DCMI Abstract Model (DCAM). The DCAM tells us that "DC metadata" takes the form of information structures which it calls description sets, which contain descriptions, which in turn contain statements. This, I think, helps elucidate the real purpose/nature of a DCAP, and particularly what is meant by the notion of "using" a property. If the "units" of DC metadata are description sets, then a DCAP is a specification for the construction of some class of description sets - and arguably what we have called a DCAP might be better labelled a "description set profile" (or maybe a "description set template" or "description set pattern"?). And since a description set is made up of multiple descriptions, possibly of different types of resource with different characteristics, a DCAP is also structured to provide information about the descriptions of those different resource types. For example, the Eprints DCAP that we've referred to in some previous posts here specifies how to construct descriptions of what it calls Scholarly Works, Expressions, Manifestations, Copies and Agents. So the DCAP (the "description set profile/template") might be conceptualised as consisting of a set of "description profiles" (templates, patterns). A description is composed of a set of statements, each of which must contain a reference to a property via its URI, and may contain references to metadata terms of other types (vocabulary encoding schemes, syntax encoding schemes), and may contain references to or representations of a value. So the role of what the IEMSR model calls a "property usage" is to provide a specification of how to construct a statement; it is a "statement profile" or "statement template".
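The nesting described above - a description set profile containing description templates, each containing statement templates - can be sketched as a plain data structure.  Everything below is illustrative: the field names, cardinality conventions and the cut-down resource types borrowed from the Eprints DCAP are my own assumptions, not a DCMI specification.

```python
# A sketch of the DCAP-as-template idea as nested Python data
# structures (names and fields are illustrative assumptions, not a
# DCMI specification). A description set profile contains description
# templates, one per resource type; each contains statement templates
# saying which property a statement uses and what kind of value
# (literal or non-literal) it may carry.

eprints_dcap_sketch = {
    "description_templates": [
        {
            "resource_type": "ScholarlyWork",
            "statement_templates": [
                {"property": "http://purl.org/dc/terms/title",
                 "value_kind": "literal", "min": 1, "max": 1},
                {"property": "http://purl.org/dc/terms/creator",
                 "value_kind": "non-literal", "min": 0, "max": None},
            ],
        },
        {
            "resource_type": "Expression",
            "statement_templates": [
                {"property": "http://purl.org/dc/terms/language",
                 "value_kind": "literal", "min": 0, "max": 1},
            ],
        },
    ],
}

def properties_for(dcap, resource_type):
    """List the property URIs a given description template allows."""
    for dt in dcap["description_templates"]:
        if dt["resource_type"] == resource_type:
            return [st["property"] for st in dt["statement_templates"]]
    return []

print(properties_for(eprints_dcap_sketch, "ScholarlyWork"))
# -> ['http://purl.org/dc/terms/title', 'http://purl.org/dc/terms/creator']
```

A "profile authoring" tool would create structures like this, and an "instance authoring" tool would read one to, say, generate an editing form with one field per statement template - which is exactly the split between the two kinds of software mentioned above.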

While the DCAM specifies the types of component that make up a description set and the relationships between those components, it does not specify that the statements within a description set should refer to any particular set of terms (any specific set of properties, classes, vocabulary encoding schemes, syntax encoding schemes), and it does not specify when statements should provide explicit references to values or when they should provide representations of values. That information is specific to some system or domain or community, and it is that level of shared interest which is addressed by a DCAP - though of course it may be the case that that domain or community is extremely general and broad, as might be argued is the case for the use of the "Simple DC" DCAP.

I think this approach of the DCAP-as-template/pattern dovetails with the approach described by Dan Brickley and extended by Alistair Miles in his Schemarama 2 for the checking of RDF graphs. Dan's post examines the relationship between querying an RDF graph and checking for patterns in the graph, in a fashion more or less analogous to the way Schematron works with patterns in XML trees. Alistair's approach uses the CONSTRUCT feature of the SPARQL RDF query language to generate a "report" based on querying for patterns in a graph, in much the way Schematron uses XPath. And indeed I should acknowledge that much of what I've written here has been influenced by Dan and Alistair's ideas -  with the distinction that I've portrayed the DCAP as being defined with reference to the structure of the DC description set rather than the RDF graph.

Taking a step back from this level of detail, it's also worth noting that - as I think the work on the Eprints DCAP illustrates quite clearly - a DCAP-as-template/pattern exists as, and indeed is created as, one component of a set of closely related resources, including a "domain model" or "application model" which specifies the types of "things in the world" (and their relationships) for which the DCAP specifies how to create DC descriptions - in the case of the Eprints DCAP, a specialisation of the FRBR model - and a specification of how description sets are to be encoded using one or more concrete syntaxes (which in many cases may be simply a reference to an existing DCMI specification).

One last point: one of the other ideas I heard mentioned at DC-2006 was that of a "DCAP module", a specification for DC metadata which, rather than describing how to construct a complete description set, describes how to construct some subset of a description set in order to support functions that are generic to many different applications and domains - the work of the DCMI Accessibility Task Group being a good example - so that it can be referenced/re-used by many different DCAPs. Again, I think the template/pattern-oriented approach would be compatible with this notion: a "DCAP module" could be a set of partial description profiles which could be imported into other DCAPs. Hmmm, I imagine there may be some non-trivial issues there to do with precedence to work through... but enough already, as they say.

P.S. I guess I should just add that the above represents only my own "thinking out loud", not any DCMI-endorsed view!



eFoundations is powered by TypePad