RDFa 1.1 drafts available from W3C
Last week, the W3C RDFa Working Group announced the availability of two new "First Public Working Drafts" which it is circulating for comment:
- RDFa Core 1.1: Syntax and processing rules for embedding RDF through attributes
- XHTML+RDFa 1.1: Support for RDFa via XHTML Modularization
Ivan Herman, the W3C Semantic Web Activity Lead, and a co-editor of these documents, has provided a very helpful summary of their main features, and particularly of some of the differences they introduce when compared with the current W3C Recommendation for RDFa, RDFa in XHTML: Syntax and Processing (a collection of attributes and processing rules for extending XHTML to support RDF). I think the intent is that the new drafts maintain compatibility with the current recommendation, in the sense that all the features used in XHTML+RDFa 1.0 are also present in RDFa 1.1. I should reiterate what Ivan says at the start of his piece: these are drafts and features may change based on feedback received.
Some of the most interesting features in these drafts, at least for data creators, are those which enable a more concise/compact style of RDFa markup. One of the criticisms of the initial version of RDFa, particularly from communities unfamiliar with RDF syntaxes, was the dependency on the use of prefixed names, in the form of CURIEs, two-part names made up of a "prefix" and a "reference", mapped to URIs by associating the prefix with a "base" URI, and concatenating the reference part of the CURIE with that URI. In XHTML the prefix-URI association was made through an XML Namespace Declaration. In particular, arguments against this approach focused on problems of "copy-and-paste", where a document fragment including RDFa markup was extracted from a source document without also copying the in-scope XML Namespace declarations, and as a result the RDF interpretation of the fragment in the context of a different (paste target) document was changed. More generally, there were some concerns that the use of prefixes was difficult to explain and understand, at least when compared with the "unprefixed name" styles typically adopted in approaches like microformats.
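To make the prefix mechanism and the "copy-and-paste" problem concrete, here is a minimal Python sketch of CURIE expansion as described above. It is an illustration only, not an RDFa processor: the function name expand_curie is hypothetical, and the paste target's binding of "dc" to the DC Elements 1.1 namespace is an assumption chosen to show how the same markup can yield a different URI in a different document.

```python
from typing import Dict


def expand_curie(curie: str, prefix_map: Dict[str, str]) -> str:
    """Expand a prefixed name such as 'dc:title' into a full URI."""
    prefix, sep, reference = curie.partition(":")
    if not sep:
        raise ValueError(f"not a prefixed name: {curie!r}")
    if prefix not in prefix_map:
        raise KeyError(f"no mapping in scope for prefix {prefix!r}")
    # The URI is the prefix's URI with the reference appended.
    return prefix_map[prefix] + reference


# Mappings that xmlns:dc and xmlns:xsd declarations would put in scope.
source_document = {
    "dc": "http://purl.org/dc/terms/",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}

# Assumption: a paste target that binds "dc" to a different namespace.
paste_target = {"dc": "http://purl.org/dc/elements/1.1/"}

print(expand_curie("dc:title", source_document))
print(expand_curie("dc:title", paste_target))
```

The same string "dc:title" expands to two different URIs, which is exactly why a fragment copied without its namespace declarations can change meaning.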
The new drafts introduce several mechanisms which can simplify markup for authors.
I should emphasise that my examples below are based on my fairly rapid reading of the drafts, and any errors and misrepresentations are mine!
The @vocab attribute
@vocab is a new RDFa attribute which provides a means for defining a "default" "vocabulary URI" to which the "terms" in attribute values are appended to construct a URI.
Aside: I should note here that I'm using the word "term" in the sense it is used in the RDFa 1.1 draft, where it refers to a datatype for a string used in an attribute value; this differs from usage by e.g. DCMI where "term" typically refers to a property, class, vocabulary encoding scheme or syntax encoding scheme i.e. to the "conceptual resource" identified by a DCMI URI, rather than to a syntactic component. In RDFa 1.1, "terms" have the syntactic constraints of the NCName production in the XML Namespaces specification.
This mechanism provides an alternative to the use of a CURIE (with prefix, reference and namespace declaration) to represent a URI.
Consider an example based on those from my recent post about RDFa (1.0) and document metadata (This is a "hybrid" of the examples 1.1.5, 1.3.5, and 2.1.5 in that post):
XHTML+RDFa 1.0:
<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN"
  "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:dc="http://purl.org/dc/terms/"
      xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
      version="XHTML+RDFa 1.0">
  <head>
    <title>My World Cup 2010 Review</title>
  </head>
  <body>
    <h1 property="dc:title">My World Cup 2010 Review</h1>
    <p>About:
      <a rel="dc:subject" href="http://example.org/resource/2010_FIFA_World_Cup">
        The 2010 World Cup
      </a>
    </p>
    <p>Date last modified:
      <span property="dc:modified" datatype="xsd:date">2010-07-04</span>
    </p>
  </body>
</html>
This represents the following three triples (in Turtle):
@prefix dc: <http://purl.org/dc/terms/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex: <http://example.org/resource/> .

<> dc:title "My World Cup 2010 Review" .
<> dc:subject ex:2010_FIFA_World_Cup .
<> dc:modified "2010-07-04"^^xsd:date .
XHTML+RDFa 1.1 using @vocab:
Using the @vocab attribute on the body element to set http://purl.org/dc/terms/ as the default vocabulary URI, I could write this as:
<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.1//EN"
  "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-2.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
      version="XHTML+RDFa 1.1">
  <head>
    <title>My World Cup 2010 Review</title>
  </head>
  <body vocab="http://purl.org/dc/terms/">
    <h1 property="title">My World Cup 2010 Review</h1>
    <p>About:
      <a rel="subject" href="http://example.org/resource/2010_FIFA_World_Cup">
        The 2010 World Cup
      </a>
    </p>
    <p>Date last modified:
      <span property="modified" datatype="xsd:date">2010-07-04</span>
    </p>
  </body>
</html>
In that case, where just three properties are referenced, the reduction in the number of characters is minimal, but if several properties from the same vocabulary were referenced, then the saving could be more substantial.
The @vocab approach provides limited help where, as is often the case, terms from multiple RDF vocabularies are used in combination (e.g. the example above continues to use a CURIE for the URI of the XML Schema date datatype), but other features of RDFa 1.1 are useful in those cases.
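As a rough sketch of how I read the draft, the following Python shows why @vocab only helps with one vocabulary at a time: bare terms are appended to the default vocabulary URI, while anything from another vocabulary still needs a prefix mapping. The function name resolve and the simplified rules (no handling of reserved terms, safe CURIEs or absolute URIs) are assumptions made for illustration.

```python
from typing import Dict, Optional


def resolve(value: str, vocab: Optional[str], prefixes: Dict[str, str]) -> Optional[str]:
    """Resolve an attribute value to a URI, or None if nothing applies."""
    if ":" in value:
        # Prefixed name: look the prefix up and append the reference.
        prefix, reference = value.split(":", 1)
        base = prefixes.get(prefix)
        return base + reference if base is not None else None
    # Bare term: append it to the default vocabulary URI, if one is set.
    return vocab + value if vocab is not None else None


vocab = "http://purl.org/dc/terms/"
prefixes = {"xsd": "http://www.w3.org/2001/XMLSchema#"}

for value in ("title", "subject", "modified", "xsd:date"):
    print(value, "->", resolve(value, vocab, prefixes))
```

Only one default vocabulary URI is in force at a given element, so the XML Schema datatype in the example still has to be written as the CURIE "xsd:date".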
RDFa Profiles and the @profile attribute
Perhaps more powerful than the @vocab attribute is the new RDFa 1.1 feature known as the RDFa profile, and the @profile attribute:
RDFa Profiles are optional external documents that define collections of terms and/or prefix mappings. These documents must be defined in an approved RDFa Host Language (currently XHTML+RDFa [XHTML-RDFA]). They may also be defined in other RDF serializations as well (e.g., RDF/XML [RDF-SYNTAX-GRAMMAR] or Turtle [TURTLE]). RDFa Profiles are referenced via @profile, and can be used by document authors to simplify the task of adding semantic markup.
Let's take each of these two functions - defining terms and defining prefix mappings - in turn.
Defining term mappings in an RDFa profile
An RDFa profile can provide mappings between "terms" and URIs. The following example provides four such "term mappings", for the URIs of three properties from the DC Terms RDF vocabulary and for the URI of one XML Schema datatype:
<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.1//EN"
  "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-2.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:rdfa="http://www.w3.org/ns/rdfa#"
      version="XHTML+RDFa 1.1">
  <head>
    <title>My RDFa Profile for a few DC and XSD terms</title>
  </head>
  <body>
    <h1>My RDFa Profile for a few DC and XSD terms</h1>
    <ul>
      <li typeof="rdfa:TermMapping">
        <span property="rdfa:term">title</span> :
        <span property="rdfa:uri">http://purl.org/dc/terms/title</span>
      </li>
      <li typeof="rdfa:TermMapping">
        <span property="rdfa:term">about</span> :
        <span property="rdfa:uri">http://purl.org/dc/terms/subject</span>
      </li>
      <li typeof="rdfa:TermMapping">
        <span property="rdfa:term">modified</span> :
        <span property="rdfa:uri">http://purl.org/dc/terms/modified</span>
      </li>
      <li typeof="rdfa:TermMapping">
        <span property="rdfa:term">xsddate</span> :
        <span property="rdfa:uri">http://www.w3.org/2001/XMLSchema#date</span>
      </li>
    </ul>
  </body>
</html>
Note that - in contrast to the case of CURIE references - the content of the "term" doesn't have to match the trailing characters of the URI; so for example, here I've mapped the term "about" to the URI http://purl.org/dc/terms/subject. So sets of "terms" corresponding to various community-specific or domain-specific lexicons could be mapped to a single set of URIs.
Also a single RDFa profile might provide mappings for URIs from different URI owners - the example above references three DCMI-owned URIs for properties and a W3C-owned URI for a datatype. Conversely, different subsets of URIs owned by a single agency may be referenced in different RDFa profiles.
If the URI of this RDFa profile is http://example.org/profile/terms/, then I can reference it in an XHTML+RDFa 1.1 document, and make use of the term mappings it defines. So taking the example above again, and now using @profile to reference the profile and its term mappings:
<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.1//EN"
  "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-2.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
      version="XHTML+RDFa 1.1">
  <head>
    <title>My World Cup 2010 Review</title>
  </head>
  <body profile="http://example.org/profile/terms/">
    <h1 property="title">My World Cup 2010 Review</h1>
    <p>About:
      <a rel="about" href="http://example.org/resource/2010_FIFA_World_Cup">
        The 2010 World Cup
      </a>
    </p>
    <p>Date last modified:
      <span property="modified" datatype="xsddate">2010-07-04</span>
    </p>
  </body>
</html>
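To illustrate what a consumer has to do with such a profile, here is a small Python sketch that reads term mappings out of a cut-down copy of the profile document above. It takes a deliberate shortcut: it matches the literal attribute values "rdfa:TermMapping", "rdfa:term" and "rdfa:uri" with the standard library's XML parser, rather than processing the profile as RDFa and querying the resulting triples, which is what a real processor would presumably do. The function name read_term_mappings is hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# A shortened copy of the term-mapping profile (two of the four mappings).
PROFILE = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:rdfa="http://www.w3.org/ns/rdfa#">
  <body>
    <ul>
      <li typeof="rdfa:TermMapping">
        <span property="rdfa:term">title</span> :
        <span property="rdfa:uri">http://purl.org/dc/terms/title</span>
      </li>
      <li typeof="rdfa:TermMapping">
        <span property="rdfa:term">about</span> :
        <span property="rdfa:uri">http://purl.org/dc/terms/subject</span>
      </li>
    </ul>
  </body>
</html>"""


def read_term_mappings(profile_xml: str) -> Dict[str, str]:
    """Collect term -> URI pairs by matching literal attribute values.

    This is a shortcut for illustration, not real RDFa processing.
    """
    mappings: Dict[str, str] = {}
    for item in ET.fromstring(profile_xml).iter():
        if item.get("typeof") != "rdfa:TermMapping":
            continue
        # Gather the text of each descendant that carries a property attribute.
        values = {
            child.get("property"): (child.text or "").strip()
            for child in item.iter()
            if child.get("property")
        }
        if "rdfa:term" in values and "rdfa:uri" in values:
            mappings[values["rdfa:term"]] = values["rdfa:uri"]
    return mappings


term_mappings = read_term_mappings(PROFILE)
print(term_mappings["about"])
```

With the mappings in hand, resolving rel="about" in the referencing document is a plain lookup, and the term need not resemble the tail of the URI it maps to.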
The @profile attribute may appear on any XML element, so it is possible that an element with a @profile attribute referencing profile A may contain a child element with a @profile attribute referencing profile B.
<body profile="http://example.org/profile/a/">
  <h1 property="title">My World Cup 2010 Review</h1>
  <div profile="http://example.org/profile/b/">
    <p>About:
      <a rel="about" href="http://example.org/resource/2010_FIFA_World_Cup">
        The 2010 World Cup
      </a>
    </p>
  </div>
</body>
And the value of a single @profile attribute may be a whitespace-separated list of URIs.
<body profile="http://example.org/profile/a/ http://example.org/profile/b/"> </body>
One of the questions I'm not quite sure about is what happens if the same "term" is mapped to different URIs in different profiles. I think, but I'm not 100% sure, only a single mapping is used and a single triple is generated, but I'm not sure about the precedence rules for determining which mapping is to be used.
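One possible precedence rule can be sketched in a few lines of Python. This assumes the behaviour Ivan Herman describes in his comment at the end of this post (profiles are processed left to right, and on a clash the rightmost mapping wins); the draft itself may change. The profile contents, the URI http://example.org/vocab/about and the function name merge_term_mappings are all hypothetical.

```python
from typing import Dict

# Hypothetical contents of two profiles that both define the term "about".
PROFILES: Dict[str, Dict[str, str]] = {
    "http://example.org/profile/a/": {
        "title": "http://purl.org/dc/terms/title",
        "about": "http://purl.org/dc/terms/subject",
    },
    "http://example.org/profile/b/": {
        "about": "http://example.org/vocab/about",
    },
}


def merge_term_mappings(profile_attr: str,
                        profiles: Dict[str, Dict[str, str]]) -> Dict[str, str]:
    """Merge mappings from a whitespace-separated @profile value.

    Profiles are applied left to right, so a later profile overwrites
    an earlier one when both define the same term.
    """
    merged: Dict[str, str] = {}
    for uri in profile_attr.split():
        merged.update(profiles.get(uri, {}))
    return merged


a_then_b = merge_term_mappings(
    "http://example.org/profile/a/ http://example.org/profile/b/", PROFILES)
b_then_a = merge_term_mappings(
    "http://example.org/profile/b/ http://example.org/profile/a/", PROFILES)

print(a_then_b["about"])
print(b_then_a["about"])
```

Under this rule only one mapping per term is ever in force, so a single triple is generated, and reordering the URIs in @profile changes which one it is.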
As Ivan notes, probably the most common pattern for deploying RDFa profiles will be for the owners/publishers of RDF vocabularies (such as DCMI) to publish profiles for their vocabularies, and for data providers to simply reference those profiles, rather than creating their own.
Defining prefix mappings in an RDFa profile
RDFa 1.1 continues to support the use of XML Namespace Declarations to associate CURIE prefixes with URIs (see my first example above and the use of the XML Schema datatype) but it also introduces other mechanisms for achieving this. One of these is the ability to supply CURIE prefix to URI mappings in RDFa profiles.
The following example provides four such "prefix mappings", for the URIs of three DCMI vocabularies and for the URI of the XML Schema datatype vocabulary:
<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.1//EN"
  "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-2.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:rdfa="http://www.w3.org/ns/rdfa#"
      version="XHTML+RDFa 1.1">
  <head>
    <title>My RDFa Profile for DC and XSD prefixes</title>
  </head>
  <body>
    <h1>My RDFa Profile for DC and XSD prefixes</h1>
    <ul>
      <li typeof="rdfa:PrefixMapping">
        <span property="rdfa:prefix">dc</span> :
        <span property="rdfa:uri">http://purl.org/dc/terms/</span>
      </li>
      <li typeof="rdfa:PrefixMapping">
        <span property="rdfa:prefix">dcam</span> :
        <span property="rdfa:uri">http://purl.org/dc/dcam/</span>
      </li>
      <li typeof="rdfa:PrefixMapping">
        <span property="rdfa:prefix">dcmitype</span> :
        <span property="rdfa:uri">http://purl.org/dc/dcmitype/</span>
      </li>
      <li typeof="rdfa:PrefixMapping">
        <span property="rdfa:prefix">xsd</span> :
        <span property="rdfa:uri">http://www.w3.org/2001/XMLSchema#</span>
      </li>
    </ul>
  </body>
</html>
If the URI of this RDFa profile is http://example.org/profile/prefixes/, then I can reference it in an XHTML+RDFa 1.1 document, and make use of the prefix mappings it defines. Taking the example above again, and using @profile to reference this second profile and its prefix mappings:
<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.1//EN"
  "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-2.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
      version="XHTML+RDFa 1.1">
  <head>
    <title>My World Cup 2010 Review</title>
  </head>
  <body profile="http://example.org/profile/prefixes/">
    <h1 property="dc:title">My World Cup 2010 Review</h1>
    <p>About:
      <a rel="dc:subject" href="http://example.org/resource/2010_FIFA_World_Cup">
        The 2010 World Cup
      </a>
    </p>
    <p>Date last modified:
      <span property="dc:modified" datatype="xsd:date">2010-07-04</span>
    </p>
  </body>
</html>
As in the case of term mappings, the issue arises of what happens in the case that two profiles provide different prefix-URI mappings for the same prefix. I think the CURIE datatype is based on the notion that at a point in a document, for prefix p, a single prefix-URI mapping is in force for that prefix, so I assume there are precedence rules for establishing which of the profile prefix mappings is to be applied.
Access to profiles and changes to triples?
Although the RDFa 1.1 profile mechanism is a powerful mechanism, it also introduces a new element of complexity for consumers of RDFa. In RDFa 1.0, an XHTML+RDFa document is "self-contained", by which I mean an RDFa processor can construct an interpretation of the document as a set of RDF triples using only the content of the document itself. In RDFa 1.1, however, the interpretation of terms and prefixes may be determined by the term mappings and prefix mappings specified in profiles external to the document containing the RDFa markup.
Consider my last example above. When the processor encounters the @profile attribute it retrieves the profile and obtains a list of prefix-URI mappings to be applied in subsequent processing, and when it encounters the CURIE "dc:title" it generates the URI http://purl.org/dc/terms/title.
But if for some reason, the processor is unable to dereference the URI, and doesn't have a cached copy of the referenced profile, then it does not have those mappings available. In that case, for my example above, when the processor encounters the CURIE "dc:title" it would not have a mapping for the "dc" prefix, and (I think?) would instead (with the new "URI everywhere" rules in force) treat the string "dc:title" as a URI? (See e.g. the section on CURIE and URI Processing)
In the case where two profiles are referenced, and both provide a mapping for the same prefix, then it seems possible that the prefix mapping in force might change depending on the availability of access to the profiles.
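The following Python sketch shows how the resulting URI could depend on which profiles happen to be reachable. It rests on two assumptions that the draft may not bear out: that an unreachable profile is simply skipped, and that a value whose prefix has no mapping is treated as a URI in its own right. The second profile (http://example.org/profile/other/) and its mapping for "dc" are hypothetical, as are the function names.

```python
from typing import Dict


def mappings_in_force(profile_attr: str,
                      reachable: Dict[str, Dict[str, str]]) -> Dict[str, str]:
    """Merge prefix mappings from the profiles that could be retrieved.

    Assumption: unreachable profiles are skipped, and later profiles
    override earlier ones on a clash.
    """
    merged: Dict[str, str] = {}
    for uri in profile_attr.split():
        if uri in reachable:
            merged.update(reachable[uri])
    return merged


def resolve_curie_or_uri(value: str, prefix_map: Dict[str, str]) -> str:
    """Expand a CURIE if its prefix is mapped; otherwise keep the value."""
    prefix, sep, reference = value.partition(":")
    if sep and prefix in prefix_map:
        return prefix_map[prefix] + reference
    # Assumed fallback: with no mapping, the string is taken as a URI.
    return value


PREFIXES = "http://example.org/profile/prefixes/"
OTHER = "http://example.org/profile/other/"  # hypothetical second profile
profile_attr = PREFIXES + " " + OTHER

all_reachable = {
    PREFIXES: {"dc": "http://purl.org/dc/terms/"},
    OTHER: {"dc": "http://purl.org/dc/elements/1.1/"},
}
only_first = {PREFIXES: all_reachable[PREFIXES]}
none_reachable: Dict[str, Dict[str, str]] = {}

for label, reachable in (("both", all_reachable),
                         ("first only", only_first),
                         ("neither", none_reachable)):
    prefix_map = mappings_in_force(profile_attr, reachable)
    print(label, "->", resolve_curie_or_uri("dc:title", prefix_map))
```

The same markup yields three different property URIs across the three cases, which is the kind of instability that makes the handling of unreachable profiles worth settling.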
I lurk on the RDFa WG list, and I've seen various discussions of how these sorts of issues should be handled - see, for example, this thread on "What happens when you can't dereference a profile document?", though related issues surface in other discussions too. I suspect the current draft is far from the "last word" in this area, and these are the sorts of issues on which the authors are seeking feedback.
Summary
I've focused here only on a few "highlights" of the RDFa 1.1 drafts, and Ivan's post covers a couple more which I won't discuss here (the use of the @prefix attribute to provide CURIE prefix mappings and the ability to use URIs in contexts where previously CURIEs were required), but I hope they give a flavour of the sort of functionality which is being introduced. The examples here are based on my understanding of the current drafts, but I may have made mistakes, so please do check out the drafts rather than relying on my interpretations.
It seems to me the WG is trying hard to address some of the criticisms made of RDFa 1.0, and to provide mechanisms that make the provision of RDFa markup simpler while retaining the power and flexibility of the syntax and ensuring that RDFa 1.0 data remains compatible. In particular, it seems to me the "term mapping" feature of RDFa profiles may be very useful in "shielding" data providers from some of the complexity of name-URI mappings and prefixed names, especially once the owners of commonly used RDF vocabularies start to make such profiles available.
However, such flexibility doesn't come without its own challenges, and it also seems that the profile mechanism in particular introduces some complexity which I imagine will become a focus of some discussion during the comment period for these drafts. Comments on the drafts themselves should be sent to the RDFa Working Group list.
Pete,
thanks for the nice words...
On your question for profile precedence: the precedence is simply defined by the order in which profile files are listed in the attribute value; profile files are processed left to right and terms are overwritten by the rightmost value (if there is a clash).
As for the case of non-reachable profiles: yes, this is still under discussion at the group. For example, Jeni Tennison's latest proposal is to simply disregard all triples in an (XML) subtree if the profile file cannot be reached...
Cheers
Ivan
Posted by: Ivan Herman | May 05, 2010 at 08:25 AM