
July 01, 2009

RESTful Design Patterns, httpRange-14 & Linked Data

Stefan Tilkov recently announced the availability of the video of a presentation he gave a few months ago on design patterns (& anti-patterns) for REST. I recommend having a look at it, as it covers a lot of ground and has lots of useful examples, and I find his presentational style strikes a nice balance of technical detail and reflection. If you haven't got time to listen, the slides are also available in PDF (though I do think hearing the audio clarifies quite a lot of the content).

One of the questions that this presentation (and other similar ones) planted at the back of my mind is how some of the patterns presented might be affected by the W3C TAG's httpRange-14 resolution and the Cool URIs conventions for distinguishing between what the Cool URIs document calls "real world objects" and "Web documents", some of which describe those "real world objects". The Cool URIs document focuses on the implications of this distinction for the use of HTTP GET to request representations of resources, but does not touch on the question of whether/how it affects the use of HTTP methods other than GET.

In the early part of his presentation, Stefan introduces the notion of "representation" and the idea that a single resource may have multiple representations. Some of the resources referred to in his examples, like "customers" (slide 16 in the PDF; slide 16 in the video presentation), when seen from the perspective of the Cool URIs document, fall, I think, into the category of "real world objects" - things which may be described (by distinct resources) but are not themselves represented on the Web. So, following the Cool URIs guidelines, the URI of a customer would be a "hash URI" (URI with fragment id) or a URI for which the response to an HTTP GET request is a 303 redirect to the (distinct) URI of a document describing the customer.
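To make the two options concrete, here is a minimal Python sketch of how a GET plays out in each case. It is only an illustration under stated assumptions: the example.org URIs, the `DESCRIBED_BY` table, the `#this` fragment and the `request_uri`/`get` helpers are all hypothetical names, not anything taken from Stefan's slides or the Cool URIs document.

```python
from urllib.parse import urldefrag

# Hypothetical mapping from "real world object" URIs to the
# URIs of the Web documents that describe them.
DESCRIBED_BY = {
    "http://example.org/customers/123": "http://example.org/docs/customers/123",
}


def request_uri(thing_uri):
    """Return the URI a client actually requests: the fragment is
    stripped client-side and never sent to the server."""
    return urldefrag(thing_uri).url


def get(uri):
    """Return (status, headers) for a GET on uri in this toy model."""
    if uri in DESCRIBED_BY:
        # A "real world object": redirect to the describing document.
        return 303, {"Location": DESCRIBED_BY[uri]}
    if uri in DESCRIBED_BY.values():
        # A "Web document": it has a representation of its own.
        return 200, {"Content-Type": "text/html"}
    return 404, {}
```

With a hash URI such as `http://example.org/docs/customers/123#this`, the client ends up requesting the document URI and gets a 200; with the slash-style customer URI, the server answers 303 and points at the document.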

But what about non-"read-only" interactions, and using methods other than GET? The third "design pattern" in the presentation is one for "resource creation" (slide 55 in the PDF; slide 98 in the video presentation). Here a client POSTs a representation of a resource to a "collection resource" (slide 50 in the PDF; slide 93 in the video presentation). The example of a "collection resource" used is a collection of customers, with the implication, I think, that the corresponding "resource creation" example would involve the posting of a representation of a customer, and the server responding 201 with a new URI for the customer.

I think (but I'm not sure, so please do correct me!) that the implication of the httpRange-14 resolution is that in this example, the "collection resource", the resource to which a POST is submitted, would be a collection of "customer descriptions", and the thing posted would be a representation of a customer description for the new customer, and the URI returned for the newly created resource would be the URI of a new customer description. And a GET for the URI of the description would return a representation which included the URI of the new customer.

[Diagram: Restcool]

(In the diagram above, http://example.org/customers/123 is the URI of a customer; http://example.org/docs/customers/123 is the URI of a document describing that customer.)

And, finally, a GET for the URI of the customer (assuming it isn't a "hash URI") would - following the Cool URIs conventions - return a 303 redirect to the URI of the description.
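The interaction I have in mind can be sketched in a few lines of Python. This is very much a sketch of my reading, not a definitive implementation: the `DescriptionStore` class, the collection URI http://example.org/docs/customers, the `describes` key and the choice to start numbering at 123 are all assumptions made for illustration, and the "server" is just an in-memory object returning (status, headers, body) tuples.

```python
import itertools

BASE = "http://example.org"
# Assumed URI for the collection of customer descriptions.
COLLECTION = f"{BASE}/docs/customers"


class DescriptionStore:
    """Toy in-memory model of a server holding customer descriptions."""

    def __init__(self):
        self.docs = {}    # document URI -> description (a dict)
        self.things = {}  # customer URI -> URI of the describing document
        self._ids = itertools.count(123)

    def post(self, uri, description):
        """POST a customer description to the collection resource."""
        if uri != COLLECTION:
            return 404, {}, None
        n = next(self._ids)
        thing_uri = f"{BASE}/customers/{n}"
        doc_uri = f"{BASE}/docs/customers/{n}"
        # The stored description names the customer it describes.
        self.docs[doc_uri] = {**description, "describes": thing_uri}
        self.things[thing_uri] = doc_uri
        # 201 Created, with the URI of the new description (a document).
        return 201, {"Location": doc_uri}, None

    def get(self, uri):
        if uri in self.things:
            # The customer is not a document: 303 to its description.
            return 303, {"Location": self.things[uri]}, None
        if uri in self.docs:
            return 200, {}, self.docs[uri]
        return 404, {}, None
```

Here the POST returns the URI of the description, a GET on that URI returns a representation that includes the customer's URI, and a GET on the customer's URI returns a 303 back to the description.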

There is some discussion of this in a short post by Richard Cyganiak, and I think the comments there bear out what I'm suggesting here, i.e. that POST/PUT/DELETE are applied to "Web documents" and not to "real-world objects".

The comment by Leo Sauermann on that post refers to the use of a SPARQL endpoint for updates - the SPARQL Update specification certainly addresses this area. It talks in terms of adding/deleting triples to/from a graph, and adding/deleting graphs to/from a "graph store". I think the "adding a graph to a graph store" case is pretty close to the requirement that is being addressed by the "post representation to Collection Resource" pattern. But I admit I struggle slightly to reconcile the SPARQL Update approach with Stefan's design pattern - and indeed, he highlights the "endpoint" notion, with different methods embedded in the content of the representation, as part of one of his "anti-patterns", their presence typically being an indicator that an architecture is not really RESTful.

I should emphasise that I'm trying to avoid seeming to adopt a "purist" position here: I recognise that "RESTfulness" is a choice rather than an absolute requirement. However, interest in the RESTful use of HTTP has grown considerably in recent years (to the extent that some developers seem keen to apply the label "RESTful", regardless of whether their application meets the design constraints specified by the architectural style or not). And now the "linked data" approach - which of course makes use of the httpRange-14 conventions - also seems to be gathering momentum, not least following the announcement by the UK government that Tim Berners-Lee would be advising them on opening up government data (and his issuing of a new note in his Design Issues series focussed explicitly on government data). It seems to me it would be helpful to be clear about how/where these two approaches intersect, and how/where they diverge (if indeed they do!). Purely from a personal perspective, I would like to be clearer in my own mind about whether/how the sort of patterns recommended by Stefan apply in the post-httpRange-14/linked data world.


Comments

I wonder if httpRange-14 recommendations should be considered a choice and not a requirement to avoid purism. If I had to choose, I'd rather have REST. Maybe. Being devil's advocate partially, but I find the httpRange-14 stuff somewhat unsatisfactory, partially because it IS so confusing, even for people who are more or less immersed in it. How likely is something that is so confusing to become a widely adopted pattern in the real world?

Pete,

What you are suggesting is very close to what Andy Houghton and I have been saying about combining Linked Data, AtomPub, and domain modeling.

http://q6.oclc.org/2009/04/the_union_of_do.html

The main difference appears to be that Andy and I apply AtomPub to create, update, and delete real world objects, whereas you seem to be leaning in the opposite direction. I think there are distinct advantages to our way, but a blog comment probably isn't the right forum to explain why.

Jeff

@Jonathan,

I don't know how likely the httpRange-14 principle is to be more widely adopted. I do think it is a common requirement to be able to distinguish a thing from a document describing the thing, and httpRange-14 seems to help.

I guess the question I was asking myself was, given that that approach *is* being adopted in the context of "linked data", what are the implications for the use, in that context, of patterns such as those described in the presentation?

@Jeff,

I think you are making two separate points:

(a) Atom/AtomPub can be used to take advantage of a generalised model of collections/items and a generalised representation format;

(b) From looking at http://q6.oclc.org/lda/LDA%20Fundamental%20Behaviors.pdf and specifically the table on page 4 (Real World Instance) & the table on page 5 (Instance Generic Document), I think you are saying that

(i) the former ("real world objects") support PUT and DELETE (and GET, typically responding with a 303 redirect to the URI of an Instance Generic Document)
(ii) the latter ("web documents") do not support PUT or DELETE (but do support GET, typically responding with a 200 and a representation, which may involve conneg but let's not worry about that here)

On (a), I agree that Atom/AtomPub could be used (though I don't think it has to be used). I think Tilkov discusses this briefly in the presentation, where he describes Atom as offering a "view" sitting between the completely generic ("it's a resource") and the quite specific ("it's a customer/collection of customers").

On (b), OK, this is similar to the question I was pondering, though I was considering a slightly different scenario. I must admit I struggle to grasp how an HTTP server can respond to PUT or DELETE for a "real world object".

For DELETE, the HTTP spec says

"The DELETE method requests that the origin server delete the resource identified by the Request-URI."

http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html#sec9.7

But if I have a URI for Nick Griffin, the leader of the British National Party in the UK (a person, a "real world object"), I don't see how I can use the HTTP protocol to ask my HTTP server to delete the person called Nick Griffin.

OTOH, I can use the HTTP protocol to ask my server to delete a document describing Nick Griffin.

Similarly for the PUT case

"The PUT method requests that the enclosed entity be stored under the supplied Request-URI. If the Request-URI refers to an already existing resource, the enclosed entity SHOULD be considered as a modified version of the one residing on the origin server. If the Request-URI does not point to an existing resource, and that URI is capable of being defined as a new resource by the requesting user agent, the origin server can create the resource with that URI."

http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html#sec9.6

I don't see how I can use the HTTP protocol to ask my HTTP server to create a modified version of the person called Nick Griffin or to create a new person called Nick Griffin.

OTOH, I can use the HTTP protocol to ask my server to create a modified version of a document describing the person called Nick Griffin or to create a new document describing the person called Nick Griffin (and in the course of that process, also create a new URI for the person called Nick Griffin to which it will respond with a 303 redirect to the new document).

Against what I've just argued(!), I've read Mark Baker argue the case e.g. in slide 57 of http://www.slideshare.net/StuC/oopsla-2007-the-web-distributed-objects-realized and in the comment http://dowhatimean.net/2006/11/restwebarch-question#comment-14585 for a "RESTful lightbulb" controlled from the Web, where an HTTP PUT is indeed used to modify the state of the light bulb as real-world object, by switching it on and off.

I wonder how important it is to be able to distinguish a thing from a document about a thing _at the HTTP level_, as httpRange-14 attempts to do, to let you figure out if it's a 'real world thing' or 'document' from the HTTP response codes and such.

Of course, you can make mistakes if you think you have an identifier for one thing, but really have an identifier for another thing, and then you make or consume assertions that aren't about what you think they are about. But 'real world' vs. 'document' is hardly the only 'confuse case' (I just made that up!) where this can happen. If you think you have an identifier about me, but it's really about a paper I published; you think you have an identifier about Oberlin College, but it's really about the city of Oberlin OH; you think you have an identifier about a journal (is a journal a web document or a real world object, by the way?), but it's really an identifier about a publisher; you think you have an identifier of one plant species, but it's really an identifier for just one variety of that species. Etc etc.

In all of those cases, we hope that metadata _about_ the identifier (presumably in RDF) will help you get the right one. Why do we need a special HTTP-layer way to distinguish the special case of 'real world' vs 'document', especially when those categories themselves are kind of confusing not only to the laymen but even to those immersed in it! Why not just go with REST, and count on RDF sameAs and versionOf etc etc to tell you what an identifier is and how it relates to others? The counter-argument is that would lead to confusion and people mis-using identifiers... but I don't see httpRange-14 leading to anything but confusion anyway.

@Jonathan,
you might be right about the 'confusion' thing but it seems to me that the time to argue against httpRange-14 has passed... it's what we have (at least if we want to operate within a W3C/semantic Web/linked data view of the world) and we might as well work with it?
