Hugh Winkler holding forth on computing and the Web

Saturday, June 28, 2008

Asynchronous HTTP POST

I've got to process a huge POST asynchronously. If it were small enough I'd just return 201 Created, with the URL of the new resource in the Location header. But this is a massive file upload that requires a bunch of processing, and checking, before I can create the resource. Processing takes so long that your TCP connection would time out before I could respond.

202, right?

This is a job for 202 Accepted, right? I had always understood 202 was how to respond asynchronously to a POST. I return some hypertext with a link you can follow to see the status. You follow that link, load the status page, and hit refresh until it shows "100% done" and displays yet another link to the resource you created. That's what the RFC says to do.

In a machine-to-machine case, e.g. AtomPub, I have to define meaningful content types so that clients can follow the hyperlinks to learn how the POST came out. If you're thinking of using the Location header: it isn't blessed for 202, and even if it were, we couldn't guarantee that the resource at the returned URL will ever exist.
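To make that cost concrete, here is a rough sketch of what the 202 route asks of a machine client. Everything named in it is made up for illustration: the media type, the JSON field names, and the /status/ path are assumptions, not anything a spec defines.

```python
import json

# Hypothetical media type: the application would have to define and document its own.
STATUS_TYPE = "application/vnd.example.upload-status+json"


def accepted_response(job_id, percent):
    """Server side: a 202 Accepted reply whose body carries the status link."""
    body = json.dumps({"percent_done": percent, "status": f"/status/{job_id}"})
    return 202, {"Content-Type": STATUS_TYPE}, body


def status_link(status, headers, body):
    """Client side: dig the link out of the application-specific entity."""
    if status != 202 or headers.get("Content-Type") != STATUS_TYPE:
        raise ValueError("not a status document this client understands")
    return json.loads(body)["status"]
```

Every client has to ship its own version of status_link, and it only works against servers that speak this one made-up format.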

Why not 303?

I prefer to respond with 303 See Other. Clients follow that Location header to a status page, eliminating the extra manual step required by 202. Retrieving that page would itself return 202 Accepted, until I've created the resource, or failed to. Then that URL would return 201 Created, or whatever the result would have been in the synchronous case.

This way, user agents just follow the semantics of HTTP, and never need to understand any application entities.
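Here's a minimal sketch of the server side of that idea, written as plain functions that return (status, headers, body) so it runs without a web framework. The class name, the /status/ paths, and the finish hook are all assumptions for illustration; the background processing itself, and the failure case, are left out.

```python
import itertools
from dataclasses import dataclass
from typing import Optional


@dataclass
class Job:
    done: bool = False
    resource_url: Optional[str] = None  # set once processing succeeds


class UploadService:
    """In-memory stand-in for the server; all paths are hypothetical."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.jobs = {}

    def post_upload(self, body):
        # Queue the work and answer right away with 303 See Other.
        job_id = next(self._ids)
        self.jobs[job_id] = Job()
        return 303, {"Location": f"/status/{job_id}"}, ""

    def finish(self, job_id, resource_url):
        # Called by the (not shown) background processor on success.
        job = self.jobs[job_id]
        job.done = True
        job.resource_url = resource_url

    def get_status(self, path):
        # GET on the status URI: 202 while pending, then the final response.
        job = self.jobs.get(int(path.rsplit("/", 1)[1]))
        if job is None:
            return 404, {}, "no such job"
        if not job.done:
            return 202, {}, "still processing"
        return 201, {"Location": job.resource_url}, "created"
```

A failed job would return whatever error status the synchronous POST would have produced; that branch is omitted here to keep the sketch short.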

Summarizing:
  • Any user agent knows to follow the 303 to the status page automatically. This is the URI giving "the response to the request", which is a status page. Any time you want the response to this POST request, go to that URI.

  • GET on that status URI, for some time, returns 202 and an entity giving "an indication of the request's current status".

  • Finally, at some time when you check that URI, you get the final "response to the request".

The surprising part might be encountering 202 or 201 in response to GET. But nothing says you can't, and in fact that is the "response to the request".
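From the client's side, the three steps summarized above might look like the loop below. It's only a sketch: post and get are hypothetical callables standing in for whatever HTTP library you use (with automatic redirect-following turned off, so the 303 is visible), and the polling interval and retry limit are arbitrary.

```python
import time


def await_result(post, get, poll_seconds=1.0, max_polls=60):
    """Send the POST, follow the 303, and poll until the status stops being 202.

    post() and get(url) are assumed to return (status, headers, body).
    """
    status, headers, body = post()
    if status != 303:
        return status, headers, body  # answered synchronously, nothing to poll

    status_url = headers["Location"]
    for _ in range(max_polls):
        status, headers, body = get(status_url)
        if status != 202:
            return status, headers, body  # the final "response to the request"
        time.sleep(poll_seconds)
    raise TimeoutError(f"{status_url} still returned 202 after {max_polls} polls")
```

Notice the client only looks at status codes and the Location header. It never parses the status entity, which is the whole point.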

(Ben Ramsey discusses returning 202 from the POST. It gets a little messy.)
