« Where does your digital identity want to go today? | Main | Memento and negotiating on time »

November 20, 2009

COI guidance on use of RDFa

Via a post from Mark Birbeck, I notice that the UK Central Office for Information has published some guidelines called Structuring information on the Web for re-usability which include some guidance on the use of RDFa to provide specific types of information, about government consultations and about job vacancies.

This is exciting news as, as far as I know, this is the first formal document from UK central government to provide this sort of quite detailed, resource-type-specific guidance with recommendations on the use of particular RDF vocabularies - guidance of the sort I think will be an essential component in the effective deployment of RDFa and the Linked Data approach. It's also the sort of thing that is of considerable interest to Eduserv, as a developer of Web sites for several government agencies. The document builds directly on the work Mark has been doing in this area, which I mentioned a while ago.

As Mark notes in his post, the document is unequivocal in its expression of the government's commitment to the Linked Data approach:

Government is committed to making its public information and data as widely available as possible. The best way to make structured information available online is to publish it as Linked Data. Linked Data makes the information easier to cut and combine in ways that are relevant to citizens.

Before the announcement of these guidelines, I recently had a look at the "argot" for consultations - "argot" is Mark's term for a specification of how a set of terms from multiple RDF vocabularies is used to meet some application requirement; as I noted in that earlier post, I think it is similar to what DCMI calls an "application profile" - , and I had intended to submit some comments. I fear it is now somewhat late in the day for me to be doing this, but the release of this document has prompted me to write them up here. My comments are concerned primarily with the section titled "Putting consultations into Linked Data"

The guidelines (correctly, I think) establish a clear distinction between the consultation on the one hand and the Web page describing the consultation on the other by (in paragraphs 30/31) introducing a fragment identifier for the URI of the consultation (via the about="#this" attribute). The consultation itself is also modelled as a document, an instance of the class foaf:Document, which in turn "has as parts" the actual document(s) on which comment is being sought, and for which a reply can be sent to some agent.

I confess that my initial "instinctive" reaction to this was that this seemed a slightly odd choice, as a "consultation" seemed to me to be more akin to an event or a process, taking place during an interval of time, which had a as "inputs" to that process a set of documents on which comments were sought, and (typically at least) resulted in the generation of some other document as a "response". And indeed the page describing the Consultation argot introduces the concept as follows (emphasis added):

A consultation is a process whereby Government departments request comments from interested parties, so as to help the department make better decisions. A consultation will usually be focused on a particular topic, and have an accompanying publication that sets the context, and outlines particular questions on which feedback is requested. Other information will include a definite start and end date during which feedback can be submitted and contact details for the person to submit feedback to.

I admit I find it difficult to square this with the notion of a "document". And I think a "consultation-as-event" (described by a Web page) could probably be modelled quite neatly using the Event Ontology or the similar LODE ontology (with some specialisation of classes and properties if required).

Anyway, I appreciate that aspect may be something of a "design choice". So for the remainder of the comments here, I'll stick to the actual approach described by the guidelines (consultation as document).

The RDF properties recommended for the description of the consultation are drawn mainly from Dublin Core vocabularies, and more specifically from the "DC Terms" vocabulary.

The first point to note is that, as Andy noted recently, DCMI recently made some fairly substantive changes to the DC Terms vocabulary, as a result of which the majority of the properties are now the subject of rdfs:range assertions, which indicate whether the value of the property is a literal or a non-literal resource. The guidelines recommend the use of the publisher (paragraphs 32-37), language(paragraphs 38-39), and audience (paragraph 46) properties, all with literal values, e.g.

<span property="dc:publisher" content="Ministry of Justice"></span>

But according to the term descriptions provided by DCMI, the ranges of these properties are the classes dcterms:Agent, dcterms:LinguisticSystem and dcterms:AgentClass respectively. So I think that would require the use of an XHTML-RDFa construct something like the following, introducing a blank node (or a URI for the resource if one is available):

<div rel="dc:publisher"><span property="foaf:name" content="Ministry of Justice"></span></div>

Second, I wasn't sure about the recommendation for the use of the dcterms:source property (paragraph 37). This is used to "indicate the source of the consultation". For the case where this is a distinct resource (i.e. distinct from the consultation and this Web page describing it), this seems OK, but the guidelines also offer the option of referring to the current document (i.e. the Web page) as the source of the consultation:

<span rel="dc:source" resource=""></span>

DCMI's definition of the property is "A related resource from which the described resource is derived", but it seems to me the Web page is acting as a description of the consultation-as-document, rather than a source of it.

Third, the guidelines recommend the use of some of the DCMI date properties (paragraph 42):

  • dcterms:issued for the publication date of the consultation
  • dcterms:available for the start date for receiving comments
  • dcterms:valid for the closing date ("a date through which the consultation is 'valid'")

I think the use of dcterms:valid here is potentially confusing. DCMI's definition is "Date (often a range) of validity of a resource", so on this basis I think the implication of the recommended usage is that the consultation is "valid" only on that date, which is not what is intended. The recommendations for dcterms:issued and dcterms:available are probably OK - though I do think the event-based approach might have helped make the distinction between dates related to documents and dates related to consultation-as-process rather clearer!

Oh dear, this must read like an awful lot of pedantic nitpicking on my part, but my intent is to try to ensure that widely used vocabularies like those provided by DCMI are used as consistently as possible. As I said at the start I'm very pleased to see this sort of very practical guidance appearing (and I apologise to Mark for not submitting my comments earlier!)

TrackBack

TrackBack URL for this entry:
http://www.typepad.com/services/trackback/6a00d8345203ba69e2012875bd4f4f970c

Listed below are links to weblogs that reference COI guidance on use of RDFa:

Comments

The comments to this entry are closed.

About

Search

Loading
eFoundations is powered by TypePad