Hugh Winkler holding forth on computing and the Web

Friday, July 29, 2005

SPARQL Last Call

W3C have issued the last call for SPARQL comments. They feel compelled to tell us that you pronounce SPARQL "sparkle". It is not too late to request a change in pronunciation to "sparquel" before it goes final.
I just don't think I can convert to "sparkle" now. Nevertheless, I used to think I could never convert to "lih-nix" from "lye-nix", but I've been overwhelmed by society (and by Linus).

Saturday, July 09, 2005

Reliable POST

A couple of proposals for lightweight reliable POST are circulating: Mark Nottingham's Post Once Exactly (POE) and Paul Prescod's Reliable Delivery in HTTP. Similar in spirit, both techniques propose that servers generate one-off URLs for clients to POST to. Generally the pattern is

-> GET url
<- entity and/or header containing one-off URL
-> POST one-off-url

The one-off URL behaves specially: it only changes application state the first time you POST to it.


The two proposals differ in how the server responds to multiple POSTs. Paul proposes that the server simply return the same response it returned when it processed the first request:
The response of subsequent POSTs should be the same as if there had been only one POST so that the client can get the correct response even if there is a network outage in the middle of the first response.


Mark proposes that under POE the server return 405 Method Not Allowed on the second and subsequent POSTs to a POE resource:
If the server had received and accepted the first request, it will respond with
S: 405 Method Not Allowed HTTP/1.1
Allow: GET
...
If the response status is "405 Method Not Allowed" the client can infer that the earlier POST succeeded. A 2xx response indicates that earlier POST did not succeed, but that this one has. When the client receives either of these responses, it knows that the request has been accepted, and it can stop retrying.
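To make the retry rule concrete, here is a minimal in-memory sketch of it. Nothing here comes from the POE draft itself: the class PoeResource, the function post_until_accepted, the lost_responses parameter, and the choice of 200 as the success code are all assumptions for illustration, and no real HTTP is involved.

```python
class PoeResource:
    """Toy model of a POE resource (hypothetical; not code from the draft)."""

    def __init__(self):
        self.accepted_bodies = []

    def post(self, body):
        # After one POST has been accepted, every later POST gets 405.
        if self.accepted_bodies:
            return 405
        self.accepted_bodies.append(body)
        return 200  # assumed success code; the draft only says "2xx"


def post_until_accepted(resource, body, lost_responses=0, max_attempts=5):
    """Retry until a 2xx or a 405 tells the client the request was accepted.

    lost_responses simulates a network that drops the first N responses:
    the server still processes those requests, but the client never sees
    the status and so has to retry.
    """
    for attempt in range(1, max_attempts + 1):
        status = resource.post(body)
        if attempt <= lost_responses:
            continue  # response lost in transit; try again
        if 200 <= status < 300 or status == 405:
            return status
    raise RuntimeError("no acceptable response after %d attempts" % max_attempts)
```

With lost_responses=1 the client never sees the first 200, retries, gets 405, and stops; the resource has still recorded the body only once.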

Under Paul's proposal, if the first POST failed (e.g. 401 Unauthorized), then even if the user corrected the problem, POSTing the corrected form to the resource would still return the same error status. POE's approach really permits exactly one POST to succeed. POE does seem to impose a new semantic on HTTP: if a client receives a 405, it can stop retrying. In practice, naive clients that do not understand POE would never retry after receiving a 405 anyway. It is a new feature on the HTTP landscape that a resource could return success or various failure status codes until, at some point, it changes state and complains that POST is not allowed. Nothing wrong with that, and it won't break any clients. But it has not been common to see that behavior.

What about those special POE headers? The proposal acknowledges they're unnecessary -- why use them? If your web site has a form page that POSTs to a POE resource, you can now put text next to the submit button saying "Press repeatedly!" The problem there is that if your browser displays a 405 error, you would not understand that your POST had succeeded -- unless the server also returned a comforting HTML entity telling you that.

It's a minor problem that POE overloads the semantics of 405, because it's not really a failure. If you are a POE-aware agent, then the special POE headers tell you to interpret the 405 slightly differently: The operation really did succeed! But it succeeded before this latest POST. I'd prefer that the superfluous operation return a success code.

Maybe we need a synthesis of the two approaches. Paul's instincts were right: a second POST to a resource that had earlier successfully processed a POST should return the same 2XX or 3XX code, and the same entity, as the first one. After all, does a client really need to understand that this second POST was superfluous? If so, POE proposes that a GET to the POE resource return the created entity, if any, and we should retain that behavior. This way an HTML page could have a simple hyperlink to the POE URL, which would return the entity indicating "transaction succeeded". POE does not say what the server should return in response to a GET if the POST has not yet been processed. A 404 Not Found would not be helpful. I suppose it has to be a 200 OK with an explanatory entity: an HTML page saying "Still waiting..." or some such.

If the initial POST to the resource fails with 4XX or 5XX, the server ought to continue to accept POST attempts until finally one of them results in a 2XX or 3XX success. The semantic we want is that POST succeed exactly once.
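A rough sketch of that synthesis, as an in-memory model rather than a real HTTP server. The class name ExactlyOncePost, the handler callback, and the (status, entity) tuples are all made up for illustration; neither proposal specifies anything like them.

```python
class ExactlyOncePost:
    """Toy model of a one-off resource where POST succeeds exactly once.

    handler is a hypothetical callable taking the request body and
    returning (status, entity); it stands in for the real application.
    """

    def __init__(self, handler):
        self.handler = handler
        self.first_success = None  # (status, entity) of the first 2XX/3XX

    def post(self, body):
        # A superfluous POST replays the first successful response
        # without touching application state again.
        if self.first_success is not None:
            return self.first_success
        status, entity = self.handler(body)
        # Only a 2XX or 3XX "uses up" the resource; after a 4XX or 5XX
        # the client may correct the request and POST again.
        if 200 <= status < 400:
            self.first_success = (status, entity)
        return status, entity

    def get(self):
        # Before any successful POST: 200 with an explanatory entity.
        if self.first_success is None:
            return 200, "Still waiting..."
        # Afterwards: the entity produced by the successful POST.
        return 200, self.first_success[1]
```

The one design choice worth noting is that failures are not remembered at all, so a 401 or 400 on the first attempt does not stick to the URL.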


[Security concern about one-off URLs: servers must prevent malefactors from predicting one-off URLs and hijacking them. It's probably good enough to generate very long random numbers as part of the URL.]
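For what it's worth, a minimal sketch of generating such a URL with Python's standard secrets module. The function name new_one_off_url and the base URL are placeholders of mine, not anything from either proposal.

```python
import secrets


def new_one_off_url(base="https://example.org/poe/"):
    """Return a hard-to-guess one-off URL (the base URL is a placeholder)."""
    # 32 bytes from the operating system's CSPRNG, encoded as
    # 43 URL-safe characters.
    return base + secrets.token_urlsafe(32)
```

A server would still have to remember which tokens it has issued, so that a POST to a URL it never handed out can be rejected.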