
December 22, 2009

Online learning in virtual environments with SLOODLE - final report

The final report from the Online Learning In Virtual Environments with SLOODLE project, led by Dan Livingston of the University of the West of Scotland, is now available.  SLOODLE was one of the Second Life projects that we funded back in 2007, following a call for proposals in November 2006.  Seems like a long time ago now!

Reading the report, it is clear that the project became as much about building a community of SLOODLE users as it was about developing some open source software - which, of course, is how all good open source projects should be, but it doesn't always work out like that.  In this case however, I think the project has been very successful and the numbers on page 4 of the report give some evidence of that.

I must admit that I have always had a nagging doubt about the sense of bringing together the kind of semi-formalised learning environment that is typical of VLEs such as Moodle with the, shall we say anarchic(?), less structured learning opportunities presented by virtual worlds in general and Second Life in particular.  To a certain extent I think the project mitigated this by developing a wide-ranging set of tools, some of which are tightly integrated with Moodle and some of which are stand-alone.  Whatever... one of the things that I really like about the report is the use of User Stories towards the end.  It's clear that this stuff works for people.

And so to the future.  As Dan says in the Foreword to the report:

Although the Eduserv project has now come to an end, SLOODLE continues to keep me busy – with regular conference and workshop presentations in both physical and virtual form. Community support and development remains as important today as it was, and can now be even more challenging – with SLOODLE tools now available on multiple virtual world platforms, and with the approach of large scale installations on university faculty and central Virtual Learning Environments.

Dan, along with representatives of all the other Second Life/Virtual World projects we funded 2 years ago, will be speaking at our Where next for virtual worlds in UK higher and further education? event at the London Knowledge Lab next year (now sold out I'm afraid).

December 21, 2009

Scanning horizons for the Semantic Web in higher education

The week before last I attended a couple of meetings looking at different aspects of the use of Semantic Web technologies in the education sector.

On the Wednesday, I was invited to a workshop of the JISC-funded ResearchRevealed project at ILRT in Bristol. From the project weblog:

ResearchRevealed [...] has the core aim of demonstrating a fine-grained, access controlled, view layer application for research, built over a content integration repository layer. This will be tested at the University of Bristol and we aim to disseminate open source software and findings of generic applicability to other institutions.

ResearchRevealed will enhance ways in which a range of user stakeholder groups can gain up-to-date, accurate integrated views of research information and thus use existing institutional, UK and potentially global research information to better effect.

I'm not formally part of the project, but Nikki Rogers of ILRT mentioned it to me at the recent VoCamp Bristol meeting, and I expressed a general interest in what they were doing; they were also looking for some concrete input on the use of Dublin Core vocabularies in some of their candidate approaches.

This was the third in a series of small workshops, attended by representatives of the project from Bristol, Oxford and Southampton, and the aim was to make progress on defining a "core Research ontology". The morning session circled mainly around usage scenarios (support for the REF (and other "impact" assessment exercises), building and sustaining cross-institutional collaboration etc), and the (somewhat blurred) boundaries between cross-institutional requirements and institution-specific ones; what data might be aggregated, what might be best "linked to"; and the costs/benefits of rich query interfaces (e.g. SPARQL endpoints) v simpler literal- or URI-based lookups. In the afternoon, Nick Gibbins from the University of Southampton walked through a draft mapping of the CERIF standard to RDF developed by the dotAC project. This focused attention somewhat and led to some - to me - interesting technical discussions about variant ways of expressing information with differing degrees of precision/flexibility. I had to leave before the end of the meeting, but I hope to be able to continue to follow the project's progress, and contribute where I can.

A long train journey later, the following day I was at a meeting in Glasgow organised by the CETIS Semantic Technologies Working Group to discuss the report produced by the recent JISC-funded Semtech project, and to try to identify potential areas for further work in that area by CETIS and/or JISC. Sheila MacNeill from CETIS liveblogged proceedings here. Thanassis Tiropanis from the University of Southampton presented the project report, with a focus on its "roadmap for semantic technology adoption". The report argues that, in the past, the adoption of semantic technologies may have been hindered by a tendency towards a "top-down" approach requiring the widespread agreement on ontologies; in contrast the "linked data" approach encourages more of a "bottom-up" style in which data is first made available as RDF, and then later application-specific or community-wide ontologies are developed to enable more complex reasoning across the base data (which may involve mapping that initial data to those ontologies as they emerge). While I think there's a slight risk of overstating the distinction - in my experience many "linked data" initiatives do seem to demonstrate a good deal of thinking about the choice of RDF vocabularies and compatibility with other datasets - and I guess I see rather more of a continuum, it's probably a useful basis for planning. The report recommends a graduated approach which focusses initially on the development of this "linked data field" - in particular where there are some "low-hanging fruit" cases of data already made available in human-readable form which could relatively easily be made available in RDF, especially using RDFa.
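The "bottom-up" progression the Semtech report describes can be made concrete with a small sketch: data is first exposed as simple RDF triples using ad hoc local properties, and only later mapped onto shared vocabularies as agreement emerges. All the URIs and property names here are made up, except the Dublin Core term, which is real.

```python
# Step 1: publish raw data as triples with local, ad hoc properties.
course = {"title": "Semantic Web 101", "code": "SW101"}
subject = "http://example.ac.uk/id/course/SW101"

triples = [
    (subject, "http://example.ac.uk/vocab/title", course["title"]),
    (subject, "http://example.ac.uk/vocab/code", course["code"]),
]

# Step 2 (later): map local properties onto community vocabularies
# once they emerge -- here, onto Dublin Core's dcterms:title.
mapping = {
    "http://example.ac.uk/vocab/title": "http://purl.org/dc/terms/title",
}

mapped = [(s, mapping.get(p, p), o) for s, p, o in triples]

# Emit the result as N-Triples-style statements.
for s, p, o in mapped:
    print(f'<{s}> <{p}> "{o}" .')
```

The base data never had to wait for the ontology: the mapping is applied after the fact, which is exactly the contrast with the "top-down, agree the ontology first" approach.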

One of the issues I was slightly uneasy with in the Glasgow meeting was that occasionally there were mentions of delivering "interoperability" (or "data interoperability") without really saying what was meant by that - and I say this as someone who used to have the I-word in my job title ;-) I feel we probably need to be clearer, and more precise, about what different "semantic technologies" (for want of a better expression) enable. What does the use of RDF provide that, say, XML typically doesn't? What does, e.g., RDF Schema add to that picture? What about convergence on shared vocabularies? And so on. Of course, the learners, teachers, researchers and administrators using the systems don't need to grapple with this, but it seems to me such aspects do need to be conveyed to the designers and developers, and perhaps more importantly - as Andy highlighted in his report of related discussions at the CETIS conference - to those who plan and prioritise and fund such development activity. (As an aside, I think this is also something of an omission in the current version of the DCMI document on "Interoperability Levels": it tells me what characterises each level, and how I can test for whether an application meets the requirements of the level, but it doesn't really tell me what functionality each level provides/enables, or why I should consider level n+1 rather than level n.)

Rather by chance, I came across a recent presentation by Richard Cyganiak to the Vienna Linked Data Camp, which I think addresses some similar questions, albeit from a slightly different starting point: Richard asks the questions, "So, if we have linked data sources, what's stopping the development of great apps? What else do we need?", and highlights various dimensions of "heterogeneity" which may exist across linked data sources (use of identifiers, differences in modelling, differences in RDF vocabularies used, differences in data quality, differences in licensing, and so on).

Finally, I noticed that last Friday, Paul Miller (who was also at the CETIS meeting) announced the availability of a draft of a "Horizon Scan" report on "Linked Data" which he has been working on for JISC, as part of the background for a JISC call for projects in this area some time early in 2010. It's a relatively short document (hurrah for short reports!) but I've only had time for a quick skim through. It aims for some practical recommendations, ranging from general guidance on URI creation and the use of RDFa to more specific actions on particular resources/datasets. And here I must reiterate what Paul says in his post - it's a draft on which he is seeking comments, not the final report, and none of those recommendations have yet been endorsed by JISC. (If you have comments on the document, I suggest that you submit them to Paul (contact details here or comment on his post) rather than commenting on this post.)

In short, it's encouraging to see the active interest in this area growing within the HE sector. On reading Paul's draft document, I was struck by the difference between the atmosphere now (both at the Semtech meeting, and more widely) and what Paul describes as the "muted" conclusions of Brian Matthews' 2005 survey report on Semantic Web Technologies for JISC Techwatch. Of course, many of the challenges that Andy mentioned in his report of the CETIS conference session remain to be addressed, but I do sense that there is a momentum here - an excitement, even - which I'm not sure existed even eighteen months ago. It remains to be seen whether and how that enthusiasm translates into applications of benefit to the educational community, but I look forward to seeing how the upcoming JISC call, and the projects it funds, contribute to these developments.

December 14, 2009

Internet Identity Workshop news

A series of notes taken at the recent Internet Identity Workshop (IIW) #9 held in Mountain View are now available. I was hoping to attend but in the end wasn't able to.

The extent and detail of the notes varies across different sessions but they are probably worth a look for those with an interest in this area.

Note also that the dates for IIW #10 have been announced, again in Mountain View, as Tuesday 18 May through to Thursday 20 May 2010.

December 10, 2009

Please update your privacy settings

Still interested in Facebook?

If so, remember that the privacy settings are in the process of being changed, with the default settings being much more public than before.

It would have been nice if they'd opted to leave things as they were by default and let people open things up gradually but it isn't too hard to change from their suggested 'public' settings to what you had before (10 clicks to be precise).  How many people actually do it remains to be seen of course.

I know it's fashionable to paint Facebook as the bad guys but I actually think they try quite hard to make it clear who can see what.

December 08, 2009

UK government’s public data principles

The UK government has put down some pretty firm markers for open data in its recent document, Putting the Frontline First: smarter government. The section entitled Radically opening up data and promoting transparency sets out the agenda as follows:

  1. Public data will be published in reusable, machine-readable form
  2. Public data will be available and easy to find through a single easy to use online access point (http://www.data.gov.uk/)
  3. Public data will be published using open standards and following the recommendations of the World Wide Web Consortium
  4. Any 'raw' dataset will be represented in linked data form
  5. More public data will be released under an open licence which enables free reuse, including commercial reuse
  6. Data underlying the Government's own websites will be published in reusable form for others to use
  7. Personal, classified, commercially sensitive and third-party data will continue to be protected.

(Bullet point numbers added by me.)

I'm assuming that "linked data" in point 4 actually means "Linked Data", given reference to W3C recommendations in point 3.

There's also a slight tension between points 4 and 5, if only because the use of the phrase, "more public data will be released under an open licence", in point 5 implies that some of the linked data made available as a result of point 4 will be released under a closed licence.  One can argue about whether that breaks the 'rules' of Linked Data but it seems to me that it certainly runs counter to the spirit of both Linked Data and what the government says it is trying to do here.

That's a pretty minor point though and, overall, this is a welcome set of principles.

Linked Data, of course, implies URIs and good practice suggests Cool URIs as the basic underlying principle of everything that will be built here.  This applies to all government content on the Web, not just to the data being exposed thru this particular initiative.  One of the most common forms of uncool URI to be found on the Web in government circles is the technology-specific .aspx suffix... hey, I work for an organisation that has historically provided the technology to mint a great deal of these (though I think we do a better job now).  It's worth noting, for example, that the two URIs that I use above to cite the Putting the Frontline First document both end in .aspx - ironic huh?

I'm not suggesting that cool URIs are easy, but there are some easy wins and getting the message across about not embedding technology into URIs is one of the easier ones... or so it seems to me anyway.
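As a rough illustration of that easy win, here is a sketch of a check for one class of uncool URI: those leaking implementation technology through a file extension. The list of suffixes and the example URLs are illustrative, not exhaustive or real.

```python
import re

# Technology-specific suffixes that tie a URI to its current
# implementation (and so tend to break when the platform changes).
TECH_SUFFIXES = re.compile(r"\.(aspx?|php|jsp|cgi|pl|cfm)$", re.IGNORECASE)

def is_technology_specific(uri):
    """Return True if the URI's path ends in a suffix that exposes
    the server-side technology used to mint it."""
    path = uri.split("?")[0].split("#")[0]
    return bool(TECH_SUFFIXES.search(path))

# Hypothetical examples:
print(is_technology_specific("http://www.hmg.example.gov.uk/frontlinefirst.aspx"))  # True
print(is_technology_specific("http://www.data.gov.uk/dataset/spending"))            # False
```

The fix costs almost nothing at publication time (a rewrite rule, or just minting extension-free URIs in the first place), which is why it is one of the easier parts of the cool URIs message to get across.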

December 04, 2009

Moving beyond the typical 15% deposit level

In an email to the AMERICAN-SCIENTIST-OPEN-ACCESS-FORUM@LISTSERVER.SIGMAXI.ORG mailing list, Steve Hitchcock writes:

... authors of research papers everywhere want "to reach the eyes and minds of peers, fellow esoteric scientists and scholars the world over, so that they can build on one another's contributions in that cumulative, collaborative enterprise called learned inquiry."

[This] belief was founded on principle, but also on observed practice, that in 1994 we saw authors spontaneously making their papers available on the Web. From those small early beginnings we just assumed the practice would grow. Why wouldn't it? The Web was new, and open, and people were learning quickly how they could make use of it. Our instincts about the Web were not wrong. Since then, writing to the Web has become even easier.

So this is the powerful idea ..., and what we haven't yet understood is why, beyond the typical 15% deposit level, self-archiving does not happen without mandates. The passage of 15 years should tell us something about the other 85% of authors. Do they not share this belief? Does self-archiving not serve the purpose? ...

This is the part that needs to be re-examined, the idea, and why it has yet to awaken and enthuse our colleagues, as it has us, to the extent we envisaged. Might we have misunderstood and idealised the process of 'learned inquiry'?

I completely agree.

In passing, I'd be interested to know what uptake of Mendeley is like, and whether it looks likely to make any in-roads into the 85%, either as an adjunct to institutional repositories or as an alternative?

December 03, 2009

On being niche

I spoke briefly yesterday at a pre-IDCC workshop organised by REPRISE.  I'd been asked to talk about Open, social and linked information environments, which resulted in a re-hash of the talk I gave in Trento a while back.

My talk didn't go too well to be honest, partly because I was on last and we were over-running so I felt a little rushed but more because I'd cut the previous set of slides down from 119 to 6 (4 really!) - don't bother looking at the slides, they are just images - which meant that I struggled to deliver a very coherent message.  I looked at the most significant environmental changes that have occurred since we first started thinking about the JISC IE almost 10 years ago.  The resulting points were largely the same as those I have made previously (listen to the Trento presentation) but with a slightly preservation-related angle:

  • the rise of social networks and the read/write Web, and a growth in resident-like behaviour, means that 'digital identity' and the identification of people have become more obviously important and will remain an important component of provenance information for preservation purposes into the future;
  • Linked Data (and the URI-based resource-oriented approach that goes with it) is conspicuous by its absence in much of our current digital library thinking;
  • scholarly communication is increasingly diffusing across formal and informal services both inside and outside our institutional boundaries (think blogging, Twitter or Google Wave for example) and this has significant implications for preservation strategies.

That's what I thought I was arguing anyway!

I also touched on issues around the growth of the 'open access' agenda, though looking at it now I'm not sure why because that feels like a somewhat orthogonal issue.

Anyway... the middle bullet has to do with being mainstream vs. being niche.  (The previous speaker, who gave an interesting talk about MyExperiment and its use of Linked Data, made a similar point).  I'm not sure one can really describe Linked Data as being mainstream yet, but one of the things I like about the Web Architecture and REST in particular is that they describe architectural approaches that have proven to be hugely successful, i.e. they describe the Web.  Linked data, it seems to me, builds on these in very helpful ways.  I said that digital library developments often prove to be too niche - that they don't have mainstream impact.  Another way of putting that is that digital library activities don't spend enough time looking at what is going on in the wider environment.  In other contexts, I've argued that "the only good long-term identifier, is a good short-term identifier" and I wonder if that principle can and should be applied more widely.  If you are doing things on a Web-scale, then the whole Web has an interest in solving any problems - be that around preservation or anything else.  If you invent a technical solution that only touches on scholarly communication (for example) who is going to care about it in 50 or 100 years - answer, not all that many people.

It worries me, for example, when I see an architectural diagram (as was shown yesterday) which has channels labelled 'OAI-PMH', 'XML' and 'the Web'!

After my talk, Chris Rusbridge asked me if we should just get rid of the JISC IE architecture diagram.  I responded that I am happy to do so (though I quipped that I'd like there to be an archival copy somewhere).  But on the train home I couldn't help but wonder if that misses the point.  The diagram is neither here nor there, it's the "service-oriented, we can build it all", mentality that it encapsulates that is the real problem.

Let's throw that out along with the diagram.

December 01, 2009

On "Creating Linked Data"

In the age of Twitter, short, "hey, this is cool" blog posts providing quick pointers have rather fallen out of fashion, but I thought this material was worth drawing attention to here. Jeni Tennison, who is contributing to the current work with Linked Data in UK government, has embarked on a short series of tutorial-style posts called "Creating Linked Data", in which she explains the steps typically involved in reformulating existing data as linked data, and discusses some of the issues arising.

Her "use case" is the scenario in which some data is currently available in CSV format, but I think much of the discussion could equally be applied to the case where the provider is making data available for the first time. The opening post on the sequence ("Analysing and Modelling") provides a nice example of working through the sort of "things or strings?" questions which we've tried to highlight in the context of designing DC Application Profiles. And as Jeni emphasises, this always involves design choices:

It’s worth noting that this is a design process rather than a discovery process. There is no inherent model in any set of data; I can guarantee you that someone else will break down a given set of data in a different way from you. That means you have to make decisions along the way.

And further on in the piece, she rationalises her choices for this example in terms of what those choices enable (e.g. "whenever there’s a set of enumerated values it’s a good idea to consider turning them into things, because to do so enables you to associate extra information about them").
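That "strings into things" step can be sketched briefly. The column names, values and base URI below are invented for the example; the point is that once each distinct enumerated value is a URI, you can attach further statements to it, which a bare literal doesn't allow.

```python
# Rows as they might arrive from a CSV file (invented example data).
rows = [
    {"school": "Chipping Norton Primary", "phase": "primary"},
    {"school": "Witney High", "phase": "secondary"},
]

# Hypothetical URI space for the enumerated 'phase' values.
BASE = "http://education.example.org/id/phase/"

def phase_uri(value):
    # Each distinct enumerated value becomes one URI (a "thing")
    # rather than being repeated everywhere as a literal (a "string").
    return BASE + value.lower()

triples = [(row["school"], "hasPhase", phase_uri(row["phase"]))
           for row in rows]

# Because the phase is now a thing, we can say more about it:
extra = (phase_uri("primary"), "label", "Primary education for ages 5-11")
```

Two schools with the same phase now share a node in the graph, and anything later asserted about that node is automatically connected to both.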

The post on URI design offers some tips, not only on designing new URIs but also on using existing URIs where appropriate: I admit I tend to forget about useful resources like placetime.com, "a URI space containing URIs that represent places and times" (which provides redirects to descriptions in various formats).

On a related note, the post on choosing/coining properties, classes and datatypes includes a pointer to the OWL Time ontology. This is something I was aware of, but only looked at in any detail relatively recently. At first glance it can seem rather complex; Ian Davis has a summary graphic which I found helpful in trying to get my head round the core concepts of the ontology.

It seems to me that it is around these sorts of very common areas, like time data, that some shared practice will emerge, and articles like these, by "hands-on" practitioners, are important contributions to that process.

An increasingly common Twitter/OAuth scenario...

Twitter/OAuth challenge

An application would like to connect to your (Twitter) account?

Yeah, I know, I just clicked on the link, right?

The application _blah_ by _blah_ would like the ability to access and update your data on Twitter.

Err... OK. But why does it need access to update my data?

This application plans to use Twitter for logging you in in the future.

That's what I figured! But I still don't understand why it needs access to update my data? I think I'll pass... I'm not sure I want random applications being able to tweet on my behalf.

End of story :-(

The point is that there is a trust issue here and I don't think that current implementations are helping people to make sensible decisions. Why does the application need to update my data on Twitter? In this case, there appears to be a perfectly valid reason as far as I can tell, but even so...

  • What kinds of updates is it going to make?
  • How often is it going to make them?
  • Are any updates going to be under my control?

I just want to have some indication of these kinds of things before I click on the 'Allow' button. Thanks.
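To make the complaint concrete, here is a purely hypothetical sketch of the kind of information a consent screen could surface before the 'Allow' button; nothing like this existed in Twitter's OAuth implementation at the time, and every field name here is invented.

```python
# Hypothetical consent manifest: what an application could declare
# up front, answering the three questions in the list above.
consent_request = {
    "application": "ExampleApp",
    "scopes": {
        "read": "Read your tweets and profile",
        "write": "Post tweets on your behalf",
    },
    "write_details": {
        "kinds_of_updates": ["a single tweet announcing you signed up"],
        "frequency": "once, at sign-up",
        "user_controlled": True,   # each update shown for review first
    },
}

def summarise(req):
    """Render the manifest as the text a user would see."""
    lines = [f"{req['application']} requests:"]
    for scope, why in req["scopes"].items():
        lines.append(f"  {scope}: {why}")
    d = req["write_details"]
    lines.append(f"  writes: {d['frequency']}; "
                 f"under your control: {d['user_controlled']}")
    return "\n".join(lines)

print(summarise(consent_request))
```

Given a declaration like this, an application that only needs sign-in could request the read scope alone, and the user could see exactly what granting write access would mean.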


