
April 24, 2009

More RDFa in UK government

It's quite exciting to see various initiatives within UK government starting to make use of Semantic Web technologies, and particularly of RDFa. At the recent OKCon conference, I heard Jeni Tennison talk about her work on using RDFa in the London Gazette. Yesterday, Mark Birbeck published a post outlining some of his work with the Central Office of Information.

The example Mark focuses on is that of a job vacancy, where RDFa is used to provide descriptions of various related resources: the vacancy, the job for which the vacancy is available, a person to contact, and so on. Mark provides an example of a little display app built on the Yahoo SearchMonkey platform which processes this data.

As a footnote (a somewhat lengthy one, now that I've written it!), I'd just draw attention to Mark's description of developing what he calls an RDF "argot" for constructing such descriptions:

The first vocabularies -- or argots -- that I defined were for job vacancies, but in order to make the terminology usable in other situations, I broke out argots for replying to the vacancy, the specification of contact details, location information, and so on.

An argot doesn't necessarily involve the creation of new terms, and in fact most of the argots use terms from Dublin Core, FOAF and vCard. So although new terms have been created if they are needed, the main idea behind an argot is to collect together terms from various vocabularies that suit a particular purpose.

I was struck by some of the parallels between this and DCMI's descriptions of developing what it calls a "DC application profile" - with the caveat that DCMI typically talks in terms of the DCMI Abstract Model rather than directly of the RDF model. For example, the Singapore Framework notes:

In a Dublin Core Application Profile, the terms referenced are, as one would expect, terms of the type described by the DCMI Abstract Model, i.e. a DCAP describes, for some class of metadata descriptions, which properties are referenced in statements and how the use of those properties may be constrained by, for example, specifying the use of vocabulary encoding schemes and syntax encoding schemes. The DC notion of the application profile imposes no limitations on whether those properties or encoding schemes are defined and managed by DCMI or by some other agency

And in the draft Guidelines for Dublin Core Application Profiles:

the entities in the domain model -- whether Book and Author, Manifestation and Copy, or just a generic Resource -- are types of things to be described in our metadata. The next step is to choose properties for describing these things. For example, a book has a title and author, and a person has a name; title, author, and name are properties.

The next step, then, is to scan available RDF vocabularies to see whether the properties needed already exist. DCMI Metadata Terms is a good source of properties for describing intellectual resources like documents and web pages; the "Friend of a Friend" vocabulary has useful properties for describing people. If the properties one needs are not already available, it is possible to declare one's own

And indeed the Job Vacancy argot which Mark points to would, I think, probably be fairly recognisable to those familiar with the DCAP notion: compare, for example, with the case of the Scholarly Works Application Profile. The differences are that (I think) an "argot" focuses on the description of a single resource type, and I don't think it goes as far as a formal description of structural constraints in quite the same way DCMI's Description Set Profile model does.
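To make the comparison a little more concrete, here is a rough sketch of the shared idea: a profile that pulls properties from several existing vocabularies and, going one step further in the DCAP direction, attaches simple occurrence constraints to them. The DC Terms, FOAF and vCard namespace URIs are the published ones; the `example.org` namespace, the `Vacancy` type, the `contact` property and the `check` function are all made up for illustration, and this is not how Mark's argots or DCMI's Description Set Profiles are actually expressed.

```python
# Published namespaces for existing vocabularies.
DCTERMS = "http://purl.org/dc/terms/"
FOAF = "http://xmlns.com/foaf/0.1/"
VCARD = "http://www.w3.org/2006/vcard/ns#"
# Hypothetical namespace for any newly coined terms (an assumption).
EX = "http://example.org/vacancy#"

# A toy profile: for each resource type, the properties that may appear
# and a (min, max) occurrence range; a max of None means "no upper limit".
PROFILE = {
    EX + "Vacancy": {
        DCTERMS + "title": (1, 1),
        DCTERMS + "description": (0, 1),
        EX + "contact": (1, None),
    },
    FOAF + "Person": {
        FOAF + "name": (1, 1),
        VCARD + "tel": (0, None),
    },
}


def check(description_type, statements):
    """Return a list of constraint violations for one description.

    `statements` is a list of (property_uri, value) pairs.
    """
    allowed = PROFILE[description_type]
    problems = []
    counts = {}
    for prop, _value in statements:
        counts[prop] = counts.get(prop, 0) + 1
        if prop not in allowed:
            problems.append(f"unexpected property: {prop}")
    for prop, (lowest, highest) in allowed.items():
        n = counts.get(prop, 0)
        if n < lowest:
            problems.append(f"too few {prop}: {n} < {lowest}")
        if highest is not None and n > highest:
            problems.append(f"too many {prop}: {n} > {highest}")
    return problems


# Example: a vacancy description that has a title but no contact.
vacancy = [(DCTERMS + "title", "Web Editor")]
for problem in check(EX + "Vacancy", vacancy):
    print(problem)
```

Dropping the `(min, max)` ranges from the sketch leaves just the "collect together terms from various vocabularies" part, which is roughly where I understand an argot to stop.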

Investigation into the management of website content in higher education institutions

I'm very pleased to announce that work has now started on a short study looking at the issues around the management of website content in higher education institutions. Full details are available on the website so I won't repeat them here. The work is being undertaken by the Social Issues Research Centre (SIRC) on our behalf and will culminate in an openly available report (released under Creative Commons). We also plan to run an interactive session at the next Institutional Web Management Workshop in Essex in July, tentatively entitled Care in the community... how do you manage your Web content?

April 07, 2009

Rough consensus and running code

A link to How the Internet Got Its Rules is doing the rounds on Twitter at the moment and I should perhaps just retweet it and move on but it seems more significant than that.  Yesterday, the RFC (Request for Comments) documents that have underpinned standards-making on the Internet for as long as I can remember were 40 years old.

I'm sorry to say that I didn't hold a personal celebration, despite the fact that RFCs have had a significant impact in one way or another on most of my professional life.

This was the ultimate in openness in technical design and that culture of open processes was essential in enabling the Internet to grow and evolve as spectacularly as it has. In fact, we probably wouldn’t have the Web without it.


Of course, the process for both publishing ideas and for choosing standards eventually became more formal. Our loose, unnamed meetings grew larger and semi-organized into what we called the Network Working Group. In the four decades since, that group evolved and transformed a couple of times and is now the Internet Engineering Task Force. It has some hierarchy and formality but not much, and it remains free and accessible to anyone.

The R.F.C.’s [sic] have grown up, too. They really aren’t requests for comments anymore because they are published only after a lot of vetting. But the culture that was built up in the beginning has continued to play a strong role in keeping things more open than they might have been. Ideas are accepted and sorted on their merits, with as many ideas rejected by peers as are accepted.

There is no doubt that RFCs, and the open approach to "consensus and running code" that thrived on them, have left a significant legacy but somehow the world feels different now.  As the number of RFCs has grown it has become less clear what status any individual RFC has, even within the community that might have notionally led to it being written, and the processes and workflows associated with their development and maintenance seem unclear (from the point of view of the reader at least).  Ultimately, RFCs have to exist in the world of ISO and NISO and IEEE and W3C and OASIS and probably a lot more besides, each of which has some role to play in the wider landscape of standards-making activity and each of which has a different profile and makeup.

A case in point is the topic of my last post, the W3C Social Web Incubator Group, which has so far spent significantly more time talking about openness (who is allowed to take part in the group) and process (how should the group's activities be structured) than it has about the actual topic at hand.  This is not surprising.  The W3C needs, quite rightly, to balance openness against membership - ultimately, it has to be viable as an organisation as well as remaining both credible and relevant.

And to make matters worse, there are those things that seem to spring up out of nowhere, RSS and OpenID for example, not to mention the de facto world of Microsoft, Google, Amazon, Facebook, Linden Lab and so on.

All in all, we live in confusing times in terms of how we best encourage consensus to emerge and the role of the RFC in that space no longer seems as clear as it might once have done.  Nonetheless, the world would be a significantly worse place without them and on that basis, here's wishing the RFC a belated happy birthday.

OKCon 2009

While I probably do spend longer than is healthy in front of a PC on a typical weekend, I have to admit to a fairly high level of resistance to attending "work-related" events at weekends, especially if travel is involved. My Saturdays are for friends, footy, films, & music, possibly accompanied by beer, ideally in some combination.

But (in the absence of any proper football) I temporarily suspended the SafFFFM rule the weekend before last and attended the Open Knowledge Conference, held at UCL. The programme was a mix of themed presentation sessions and an "Open Spaces" session based on contributions from attendees.

The morning session featured three presentations from people working in the development/aid sector. Mark Charmer talked about AKVO, and its mission to facilitate connections between funders and projects in the area of water and sanitation, and to streamline reporting by projects (through support for submissions of updates by SMS). Vinay Gupta described the use of wiki technology to build Appropedia, a collection of articles on "appropriate technology" and related aid/development issues, including project histories and detailed "how-to"-style information. The third session was a collaboration between Karin Christiansen, on the Publish What You Fund campaign to promote greater access to information about aid, and Simon Parrish on the work of Aidinfo to develop standards for the sharing of such information.

One recurring theme in these presentations was that of valuable information - from records of practical project experience "on the ground" to records of funding by global agencies - being "locked away" from, or at least only partially accessible to, the parties who would most benefit from it. The other fascinating (to me, at least) element was the emphasis on the growing ubiquity of mobile technology: while I'm accustomed to this in the UK, I was still quite taken aback by the claim (I think, by Mark) that in the near future there will be large sections of the world's population who have access to a mobile phone, but not to a toilet.

The main part of the day was dedicated to the "Open Spaces" session of short presentations. Initially, IIRC, these had been programmed as two parallel sessions in which the speakers were allocated 10 minutes each. On the day, the decision was taken to merge them into a single session with (nearly 20, I think?) speakers delivering very short "lightning" talks. We were offered the opportunity to vote on this, I hasten to add, and at the time avoiding missing out on contributions had seemed like a Good Idea, if time permitted. But with hindsight, I'm not sure it was the right choice: it led to a situation in which speakers had to deliver their content in less time than they had anticipated (and some adjusted better than others), there was little time for discussion, and the pace and diversity of the contributions, some slightly technical, but mostly focusing more on social/cultural aspects, did make it rather difficult for me to identify common threads.

The next slot was dedicated to the relationship between Open Data and Linked Data and the Semantic Web, with short, largely non-technical, presentations by Tom Scott of the BBC, Jeni Tennison, and Leigh Dodds of Talis. Maybe it was just because I was familiar with the topic, but it felt to me that this part of the day worked well, and the cohesive theme enabled speakers to build on each other's contributions.

I thought Tom's presentation of the BBC's work on linked data was one of the best I've seen on that topic: he managed to cover a range of technical topics in very accessible terms, all in fifteen minutes. (I see Tom has posted his slides and notes on his weblog.) Jeni described her work with RDFa on the London Gazette. Leigh pursued an aquatic metaphor for RDF - triple as recombinant molecule - and semantic web applications, and also announced the launch of a Talis data hosting scheme which they are calling the Talis Connected Commons, under which public domain datasets of up to 50 million triples can be hosted for free on the Talis Platform. (I noticed this also got an enthusiastic write-up on Read Write Web).

Although I quite enjoyed the linked data talks, it's probably true to say that - Leigh's announcement aside - they didn't really introduce me to anything I didn't know already - but there again, I probably wasn't the primary target audience.

The day ended with a presentation by David Bollier, author of Viral Spiral, on the "sharing economy". Unfortunately, things were over-running slightly at that point, and I only caught the first few minutes before I had to leave for my train home - which was a pity as I think that session probably did consolidate some of the issues related to business models which had been touched on in some of the short talks.

Overall, I suppose I came away feeling the event might have benefited from a slightly tighter focus, maybe building around the content of the two themed sessions. Having said that, I recognise that the call for contributions had been explicitly very "open", and the event did attract a very mixed audience, many probably with quite different expectations from my own! :-)

W3C launches Social Web Incubator Group

The W3C has launched a Social Web Incubator Group, chaired jointly by Dan Appelquist (Vodafone), Dan Brickley (Vrije Universiteit) and Harry Halpin (W3C Fellow from the University of Edinburgh). I'm very pleased to note that Harry Halpin's contribution to this activity is supported by Eduserv through the Assisting the W3C in opening social networking data project funding that we made available late last year.

The group's mission is to "understand the systems and technologies that permit the description and identification of people, groups, organizations, and user-generated content in extensible and privacy-respecting ways".


