Linked data vs. Web of data vs. ...
On Friday I asked what I thought would be a pretty straight-forward question on Twitter:
is there an agreed name for an approach that adopts the 4 principles of #linkeddata minus the phrase, "using the standards (RDF, SPARQL)" ??
Turns out not to be so straight-forward, at least in the eyes of some of my Twitter followers. For example, Paul Miller responded with:
@andypowe11 well, personally, I'd argue that Linked Data does NOT require that phrase. But I know others disagree... ;-)
@andypowe11 I'd argue that the important bit is "provide useful information." ;-)
Paul has since written up his views more thoughtfully in his blog, Does Linked Data need RDF?, a post that has generated some interesting responses.
I have to say I disagree with Paul on this, not in the sense that I disagree with his focus on "provide useful information", but in the sense that I think it's too late to re-appropriate the "Linked Data" label to mean anything other than "use http URIs and the RDF model".
To back this up I'd go straight to the horses mouth, Tim Berners-Lee, who gave us his personal view way back in 2006 with his 'design issues' document on Linked Data. This gave us the 4 key principles of Linked Data that are still widely quoted today:
- Use URIs as names for things.
- Use HTTP URIs so that people can look up those names.
- When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL).
- Include links to other URIs. so that they can discover more things.
Whilst I admit that there is some wriggle room in the interpretation of the 3rd point - does his use of "RDF, SPARQL" suggested these as possible standards or is the implication intended to be much stronger? - more recent documents indicate that the RDF model is mandated. For example, in Putting Government Data online Tim Berners-Lee says (refering to Linked Data):
The essential message is that whatever data format people want the data in, and whatever format they give it to you in, you use the RDF model as the interconnection bus. That's because RDF connects better than any other model.
So, for me, Linked Data implies use of the RDF model - full stop. If you put data on the Web in other forms, using RSS 2.0 for example, then you are not doing Linked Data, you're doing something else! (Addendum: I note that Ian Davis makes this point rather better in The Linked Data Brand).
Which brings me back to my original question - "what do you call a Linked Data-like approach that doesn't use RDF?" - because, in some circumstances, adhering to a slightly modified form of the 4 principles, namely:
- use URIs as names for things
- use HTTP URIs so that people can look up those names
- when someone looks up a URI, provide useful information
- include links to other URIs. so that they can discover more things
might well be a perfectly reasonable and useful thing to do. As purists, we can argue about whether it is as good as 'real' Linked Data but sometimes you've just got to get on and do whatever you can.
A couple of people suggested that the phrase 'Web of data' might capture what I want. Possibly... though looking at Tom Coates' Native to a Web of Data presentation it's clear that his 10 principles go further than the 4 above. Maybe that doesn't matter? Others suggested "hypermedia" or "RESTful information systems" or "RESTful HTTP" none of which strikes me as quite right.
I therefore remain somewhat confused. I quite like Bill de hÓra's post on "links in content", Snowflake APIs, but, again, I'm not sure it gets us closer to an agreed label?
In a comment on a post by Michael Hausenblas, What else?, Dan Brickley says:
I have no problem whatsoever with non-RDF forms of data in “the data Web”. This is natural, normal and healthy. Stastical information, geographic information, data-annotated SVG images, audio samples, JSON feeds, Atom, whatever.
We don’t need all this to be in RDF. Often it’ll be nice to have extracts and summaries in RDF, and we can get that via GRDDL or other methods. And we’ll also have metadata about that data, again in RDF; using SKOS for indicating subject areas, FOAF++ for provenance, etc.
The non-RDF bits of the data Web are – roughly – going to be the leaves on the tree. The bit that links it all together will be, as you say, the typed links, loose structuring and so on that come with RDF. This is also roughly analagous to the HTML Web: you find JPEGs, WAVs, flash files and so on linked in from the HTML Web, but the thing that hangs it all together isn’t flash or audio files, it’s the linky extensible format: HTML. For data, we’ll see more RDF than HTML (or RDFa bridging the two). But we needn’t panic if people put non-RDF data up online…. it’s still better than nothing. And as the LOD scene has shown, it can often easily be processed and republished by others. People worry too much! :)
Count me in as a worrier then!
I ask because, as a not-for-profit provider of hosting and Web development solutions to the UK public sector, Eduserv needs to start thinking about the implications of Tim Berners-Lee's appointment as an advisor to the UK government on 'open data' issues on the kinds of solutions we provide. Clearly, Linked Data is going to feature heavily in this space but I fully expect that lots of stuff will also happen outside the RDF fold. It's important for us to understand this landscape and the impact it might have on future services.