Hugh Winkler holding forth on computing and the Web

Tuesday, December 18, 2007

FIQL: Query language for feeds

My first reaction to Mark Nottingham's FIQL draft for querying Atom and RSS feeds was: Do we really need to bake in a prescription for constructing query URLs? Section 4 seems to suggest that. It seems to take control of the URL namespace away from the server.

Here's Mark's example HTTP FIQL (I'm already saying "fickle") query:
http://example.org/feed.rss?title==*great*;ex:rating=gt=4
Element names become the query keys. So if your feed language has element <ex:rating>, you can use that as a query key.
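To make the shape of that query concrete, here is a minimal sketch that splits it into constraints. It assumes ";" joins constraints that must all hold and ignores everything else in the draft (OR, grouping, escaping); the FiqlSketch class and its parse method are names made up for illustration, not anything from Mark's draft.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FiqlSketch {
    // selector, comparison operator, argument: e.g. "ex:rating", "=gt=", "4"
    private static final Pattern CONSTRAINT =
            Pattern.compile("^([^=!]+)(==|!=|=[a-z]+=)(.*)$");

    /** Splits an AND-only query on ';' and returns {selector, operator, argument} triples. */
    static List<String[]> parse(String query) {
        List<String[]> constraints = new ArrayList<>();
        for (String part : query.split(";")) {
            Matcher m = CONSTRAINT.matcher(part);
            if (!m.matches()) {
                throw new IllegalArgumentException("Not a constraint: " + part);
            }
            constraints.add(new String[] {m.group(1), m.group(2), m.group(3)});
        }
        return constraints;
    }

    public static void main(String[] args) {
        // The query string from the example URL above.
        for (String[] c : parse("title==*great*;ex:rating=gt=4")) {
            System.out.println(c[0] + " " + c[1] + " " + c[2]);
        }
    }
}
```

The element name lands in the first slot of each triple, which is exactly the part of the URI space the post goes on to worry about.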

FIQL describes rules for constructing URLs based on your content type. Contrast that with HTML forms, which prescribe how to construct URLs based on content.

Is that so bad? It's a little too API-like for me. Any Atom feed now has that query namespace imposed on it: If you want to honor queries, these terms become part of your URI space. Amazon, Blogger, you, I all have this URI subspace imposed on us.

If, instead, we define a FIQL content type, application/fiql, and you POST a FIQL document, the server can respond with the query result, or can construct a URI of its own design and redirect to it.

The server retains complete control of its URI namespace.
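A minimal sketch of the redirect half of that idea, kept free of the servlet API so it runs on its own. The QueryRegistry class and the "/queries/N" layout are assumptions for illustration; the point is only that the server, not the query language, picks the URI.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class QueryRegistry {
    // POSTed FIQL document -> URI the server minted for it
    private final Map<String, String> uriByQuery = new LinkedHashMap<>();

    /** Returns the URI chosen for this query, minting a new one the first time it is seen. */
    synchronized String locationFor(String fiqlDocument) {
        String key = fiqlDocument.trim();
        String uri = uriByQuery.get(key);
        if (uri == null) {
            // Hypothetical layout; any scheme the server likes would do.
            uri = "/queries/" + (uriByQuery.size() + 1);
            uriByQuery.put(key, uri);
        }
        return uri;
    }
}
```

A handler for the POST would put that value in a Location header and answer with a redirect such as 303 See Other, so the client ends up doing a plain GET on a URI of the server's own design.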

I'm interested in how we work through this issue because I just suggested a similar strategy for WITSML queries a few days ago.

Friday, November 16, 2007

What is wrong with this picture?

Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
content-disposition: attachment; filename=Hill%20Country%20water%20issues[1].ppt
Content-Type: application/unknown

IIS servers don't come provisioned with the .ppt extension mapped to application/vnd.ms-powerpoint? Microsoft server? Microsoft application?

What hope can there be for authoritative metadata?

My Firefox browser figures it out just fine, and launches OpenOffice. Presumably FF went through all this first, just to determine it to be application/octet-stream; then ran a further sniffer to identify it as a PowerPoint.
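For a feel of what such a sniffer has to do, here is a hedged sketch. It is not Firefox's actual algorithm; it only checks for the OLE2 compound file signature that legacy Office formats (.ppt, .doc, .xls) share, so telling a PowerPoint from a Word file would need a deeper look inside the container. The OfficeSniffer name is made up.

```java
final class OfficeSniffer {
    // Signature of the OLE2 compound file container used by legacy
    // Office formats such as .ppt, .doc and .xls.
    private static final int[] OLE2_MAGIC =
            {0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1};

    /** True if the first bytes of the download carry the OLE2 signature. */
    static boolean looksLikeOle2(byte[] head) {
        if (head.length < OLE2_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < OLE2_MAGIC.length; i++) {
            if ((head[i] & 0xFF) != OLE2_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }
}
```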

Tuesday, November 06, 2007

It's the hyperlinks, Stupid!

Henry Story thinks Echo2 is "Web 2.0 in Java", and they do have a killer demo. But it's yet another example of incredibly brilliant developers going to great lengths to bring the desktop app to the browser, and ignoring that the value of the web lies in hyperlinks.

So navigate that demo: Click the "Next" arrow. Your address bar doesn't change even though you have navigated to what most people would call a clearly identifiable different resource.

I can't give you the link to "page 2" because it isn't addressable on the web. Sorry. Go to page 1, then click to go to page 2.

Under the hood, it's the usual collection of RPCs masquerading as URLs -- some GET, some POST -- using port 80 since we know it's open, after all. Every single GET uses all the usual tricks to make sure that nothing -- not even JPEGs -- gets cached.

(Why don't we just give Javascript in browsers an API to open up a socket and execute any protocol you like? Seriously: Wouldn't that be better than abusing HTTP? If you don't want to use the HTTP protocol, you shouldn't have to.)

I love the look of that demo, and I think their technology is clever. With a little effort I am sure they can make the framework webby.

Saturday, November 03, 2007

I was just thinking that!

I've been working on the technology requirements for a "rich web application". All the pieces are in place for a cross platform solution. With Javascript, HTML, SVG, and CSS, you can build a pretty rich application without resorting to JavaFX, AIR, Flex, Silverlight, Click Once, or Web Start.

All the browsers support JS, HTML, SVG and CSS. Except one. Per Rob Sayre:
If Microsoft were really interested in making life easier for web developers, they could do so, without a standards committee. They would need to fix the (nasty) bugs in IE’s JScript engine, implement SVG, implement canvas, implement more of CSS, support a standard event model, and on and on. Then, the behavior of IE would be a lot closer to Firefox, Opera, and Safari.
It's no secret Microsoft doesn't see an advantage in a web built on cross platform technologies. I'm not very sure about Adobe, either.

Why should we, as a software company, invest in technology from companies that are actively working to subvert what we want to do?

Friday, October 26, 2007

Rich Web Applications

Here's the five-step test to determine whether your "rich internet application" is a rich web application:
  1. You click on a hyperlink in a web browser.
  2. Your RWA opens up, and updates its code if necessary
  3. Your RWA renders the document, and you edit it.
  4. You save the file -- to its URL
  5. You e-mail the hyperlink to someone; they open up the document and see your edits.
You should be able to click a link to a Word document, edit the document, and save it -- to the URL you got it from. The RWA is a rich, articulate way to render and edit particular kinds of documents.

If saving isn't a feature of your app, fine: But honor (1), (2), and (3).

You can do this with .Net Click Once, Java Web Start, and for all I know, Adobe AIR. But the design of the first two, at least originally, amounted to a way to install and launch your application by clicking a hyperlink, and update the code automatically. Worthy objectives. But no actual Web Start or Click Once apps I have used can do this: Click a document, edit it, save on server. You couldn't possibly do it with Web Start until JDK 1.5. And to do it with Click Once you have to sign the application and party in the registry.

The first thing is to register your RWA as the helper application for your document's mime type and file extension. Web Start enabled doing so when they added <association> to the JNLP 1.5 format. Web Start registers itself as the handler for your file extension; the browser launches javaws to open your file, and javaws looks at the file extension and invokes your application. Or: sign your Click Once application and you can set your application as the mime type handler in the registry.

That makes your application launchable when you click a link.

You will have to design the URL of the document into the document itself. Browsers download documents and save them to temp files; they don't tell you the URL of the document. When the browser launches your helper app, your app needs to know where to HTTP PUT updates, or to DELETE the resource. You can also design in other URLs your client understands: a base URL under which you can POST to implement "File/New", for example. Notice Atom has <link rel="self"> and Atompub service documents tell you where to POST things; you could build a good RWA around that scaffolding. Make your own documents describe how to navigate the states of your application by traversing hyperlinks.
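As a sketch of the helper app's side of this, the following pulls a link out of an Atom entry the browser has saved to a temp file. It uses the JDK's DOM parser; the DocumentLinks class is a made-up name, and the choice of which rel value to PUT to is an assumption (Atompub uses rel="edit" for the editable URI, alongside the rel="self" mentioned above).

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class DocumentLinks {
    private static final String ATOM_NS = "http://www.w3.org/2005/Atom";

    /** Returns the href of the first atom:link with the given rel, or null if there is none. */
    static String findLink(String atomXml, String rel) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().parse(
                    new ByteArrayInputStream(atomXml.getBytes(StandardCharsets.UTF_8)));
            NodeList links = doc.getElementsByTagNameNS(ATOM_NS, "link");
            for (int i = 0; i < links.getLength(); i++) {
                Element link = (Element) links.item(i);
                if (rel.equals(link.getAttribute("rel"))) {
                    return link.getAttribute("href");
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse document", e);
        }
    }
}
```

With that in hand, "Save" becomes an HTTP PUT to whatever URL the document itself names, rather than a write to the temp file.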

Your RWA should have an address bar. Show the users where they are. Let them type in any URL they want to. If they type in one for a mime-type you don't understand, hand off the download to the registered helper application, or back to the browser.

All those buttons desktop apps use to navigate? Menu commands? Keyboard shortcuts? Back them with URLs you GET from or POST to. Display each URL in the address bar as the user traverses your application. After clicking any button that does GET, whatever you see in your RWA should be what your friend will see when you email him the URL in the address bar.

Automatic code update (step 2) is a necessary feature for an RWA. It's not enough just to have a link-launchable application. You can evolve your schema, and change the meanings of elements in your document format, as long as you know that only the most modern code, which understands the new meanings, will be interpreting the document.

Make the extra effort to leverage the web architecture, and I won't blog about your beautiful but shitty application.

Tuesday, October 02, 2007

That killer web platform

Here we go again with this specious argument that the Web isn't rich enough. Joel thinks there's a new killer platform out there to be invented, that will seize control of web applications as Windows seized the desktop.

Ain't gonna happen. Or, if you prefer: Already happened.

Right there in his own essay is the reason.


And that’s exactly where we are with Ajax development today. Sure, yeah, the usability is much better than the first generation DOS apps, because we’ve learned some things since then. But Ajax apps can be inconsistent, and have a lot of trouble working together — you can’t really cut and paste objects from one Ajax app to another, for example, so I’m not sure how you get a picture from Gmail to Flickr. Come on guys, Cut and Paste was invented 25 years ago.


See, Ajax gives you the capability to turn a perfectly good hypertext application into a miserable facsimile of a 1980's PC. And you're not going to fix Ajax by adding a bunch of new APIs. Applications need more constraints, not fewer.

Think how absurd it is that you can't copy a picture from GMail to Flickr. The tools are right there, but the application designers do not leverage them. a) Right click photo in GMail. b) "Copy link location". c) Paste hyperlink into Flickr. d) Flickr either downloads photo from GMail or references it. No new APIs needed -- it's all just hyperlinks.

It's great, and necessary, to extend HTML with rich widgets. We'll never capture them all, declaratively, in a common HTML. I am, even as we speak, constructing a Flash widget. But the web is the platform. Any time I push information deep into my widget -- text that could be searchable, graphics that could be linkable -- and hide it from the web, I've failed to leverage the platform.

Thursday, September 13, 2007

Oh crap, I just invented Prolog

Following a link trail that started with a discussion of CouchDB, I just found this old comment posted by Bill de hÓra:


Perhaps we can go top down - write smart analysers to dynamically denorm data based on usage patterns; indeed database optimisation is an industry sector. But another, dumber, option is bottom up - avoid the initial structural 'typing' step and normalise where necessary....Normalisation can be done later on, based on demand.


I don't even want to design database schemas. To hell with modeling. I just want a system that takes a great, undifferentiated pile of facts, and infers entities based on the statistics, primarily of the content, but also on the access patterns. Actually I don't even care that it infer entities; that's an implementation detail I don't need to know about. It would just optimize for query time, or insert time, or some combination; inferring some table structure will probably help it do that.

Why do I need to tell the computer the relations in my data? All I should have to do is insert facts, as triples. When I ask it a question, I want those facts returned (or maybe other conclusions).
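The interface being asked for is tiny. A minimal sketch, with made-up names (FactPile, insert, ask) and none of the statistics-driven optimization the post wishes for; every query here is a linear scan, and null stands in for "don't care".

```java
import java.util.ArrayList;
import java.util.List;

final class FactPile {
    private final List<String[]> triples = new ArrayList<>();

    /** Adds one fact as a (subject, predicate, object) triple. No schema, no modeling. */
    void insert(String subject, String predicate, String object) {
        triples.add(new String[] {subject, predicate, object});
    }

    /** Returns every fact matching the pattern; a null argument acts as a wildcard. */
    List<String[]> ask(String subject, String predicate, String object) {
        List<String[]> hits = new ArrayList<>();
        for (String[] t : triples) {
            if (matches(subject, t[0]) && matches(predicate, t[1]) && matches(object, t[2])) {
                hits.add(t);
            }
        }
        return hits;
    }

    private static boolean matches(String pattern, String value) {
        return pattern == null || pattern.equals(value);
    }
}
```

Everything interesting (inferring entities, choosing indexes from access patterns, drawing other conclusions) would live behind ask without changing its signature.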

Is a really tweaked Prolog the ultimate DBMS?

Sunday, September 09, 2007

Using Atom to manage lists of Atom feeds

As Phil Wilson does, I have come to rely on Planet Intertwingly as a substitute, largely, for maintaining my own list of interesting feeds. But now Phil wants to filter Sam's picks; he wants to select favorites from the PI OPML.

That raises the interesting prospect that we now need Atom feeds for OPML updates. When Sam adds a new feed, I would want to check it out. Since OPML content is, by definition, a list, I guess Sam could mark up the feed with Simple List Extensions. I'm temporarily suspending my suspicion of SLE, because PI's growing and shrinking list of feeds is clearly unlike an ordinary, infinitely growing feed.

Using Atom to manage lists of Atom feeds demonstrates that Atom captures a powerful abstraction.

Wednesday, August 15, 2007

Acura ITX

A year and a half ago, I started following this (anonymous?) guy's blog about his project to install an ITX computer in his Acura. It covers everything: selecting hardware including touch screen and GPS, configuring software (Windows + misc. media toys), installation, and he's recently updated it with new photos.

I'd do things a little differently: Carbuntu (hey... carbuntu.org seems to have disappeared...), and I'd want an HSDPA or EVDO card and a dual core Intel... but there's little doubt mine would come out a mess. This fellow's really put in the polish.

Restful Servlets + JSP: My framework

Stefan and Bill asked, so here's how I do it (with a tip of the hat to Django).

It is a micro framework. You subclass RestServlet and declare some URL patterns to match, and handlers for them. The base class parses the URI, sets attributes in the ServletRequest object based on the URI pattern, and invokes your handlers.

So here's how a simple BlogServlet would look:

[updated: fixed path to JSP, added a note above about attributes].
[update: added Apache license]

Copyright 2007 Wellstorm Development, LLC

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.



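import java.io.IOException;

import javax.servlet.ServletException;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class BlogServlet extends RestServlet {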


/**
* GET an entry. The base class will populate the "entryId" attribute before calling invoke.
* We told it to do so below, when we defined entryIdentifier.
*/
RequestHandler entryGetHandler = new RequestHandler(){

public void invoke(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
try {
// The base class has parsed the URI and populated the request with
// the "entryId" attribute
String entryId = (String)request.getAttribute("entryId");
String forward = "/WEB-INF/entry.jsp";
request.setAttribute("entryHTML", getEntryHTML(entryId));
request.setAttribute("entryTitle", getEntryTitle(entryId));

request.getRequestDispatcher(forward).forward(request, response);
} catch (Exception e) {
response.sendError(404);
}
}
};


/**
* POST a new entry to the collection URI
*/
RequestHandler collectionPostHandler = new RequestHandler(){

public void invoke(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
try {
String entryId = saveEntry(request, response);

response.setHeader("Location", buildEntryUri(entryId));
response.setStatus(201);
//write out some HTML or maybe the entry itself.
String forward = "/WEB-INF/entry.jsp";
request.setAttribute("entryHTML", getEntryHTML(entryId));
request.setAttribute("entryTitle", getEntryTitle(entryId));
request.getRequestDispatcher(forward).forward(request, response);
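} catch (Exception e) {
// A failed save is a server-side problem, not a missing resource.
response.sendError(HttpServletResponse.SC_INTERNAL_SERVER_ERROR);
}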
}
};

//Stubbing these in...
RequestHandler entryPutHandler =null;
RequestHandler entryDeleteHandler = null;
RequestHandler collectionGetHandler = null;

//
// One ResourceIdentifier per URI pattern.
// Each one tells us how to parse the pattern into attributes, and handlers
// for HTTP methods on the resources it identifies.
//
ResourceIdentifier entryIdentifier = new ResourceIdentifier(
"^/(\\d+)$", // URI pattern
new String[] {"entryId"}, // match pattern and insert named request attributes
entryGetHandler, // GET handler for entry URIs
null, // no POST handler for entries
entryPutHandler, // PUT an entry
entryDeleteHandler); // DELETE an entry


ResourceIdentifier collectionIdentifier = new ResourceIdentifier(
"^/$", // URI pattern
collectionGetHandler, // GET handler for collection would list entries
collectionPostHandler); // POST handler for collection will add an entry



/**
* Here's how we tell our base class how to map URIs to handlers:
*/
@Override
protected ResourceIdentifier[] resourceIdentifiers() {
return new ResourceIdentifier[]{entryIdentifier, collectionIdentifier};
}

// these are just stubs... exercise for the reader.
private String buildEntryUri(String entryId) {
return null;
}

private String saveEntry(HttpServletRequest request,
HttpServletResponse response) {
return null;
}

private Object getEntryTitle(String entryId) {
return null;
}
private Object getEntryHTML(String entryId) {
return null;
}

}




Here's the base RestServlet class. In real life this class also has convenience methods to send redirects, and other standard HTTP stuff.



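import java.io.IOException;
import java.lang.reflect.Method;
import java.util.logging.Level;
import java.util.logging.Logger;

import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public abstract class RestServlet extends HttpServlet {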

private static final long serialVersionUID = 1L;

private static Logger logger = Logger
.getLogger(RestServlet.class.getName());

protected RestServlet() {
super();
}

protected abstract ResourceIdentifier[] resourceIdentifiers();

/** try calling doGet, doPost, or whatever, on each ResourceIdentifier, until one succeeds.
Uses reflection to reduce bloat.
*/

private void doMethod(String methodName, HttpServletRequest request,
HttpServletResponse response) throws ServletException, IOException {
boolean did = false;
try {
// The method must be like e.g.
// boolean doGet(HttpServletRequest request, HttpServletResponse response)
Method method = ResourceIdentifier.class.getMethod(methodName, new Class[]{HttpServletRequest.class, HttpServletResponse.class});
Object [] params = new Object[]{request, response};
for (ResourceIdentifier rid : resourceIdentifiers()) {
if (did = (Boolean)method.invoke(rid, params)) {
break;
}
}
if (!did) {
response.sendError(HttpServletResponse.SC_NOT_FOUND);
}
} catch (Exception e) {
logger.log(Level.SEVERE, "Exception in " + methodName, e);
throw new ServletException(e);
}

}

@Override
protected void doPost(HttpServletRequest request,
HttpServletResponse response) throws ServletException, IOException {
doMethod ("doPost", request, response);

}

@Override
protected void doGet(HttpServletRequest request,
HttpServletResponse response) throws ServletException, IOException {
doMethod ("doGet", request, response);
}

@Override
protected void doPut(HttpServletRequest request,
HttpServletResponse response) throws ServletException, IOException {
doMethod ("doPut", request, response);
}

@Override
protected void doDelete(HttpServletRequest request,
HttpServletResponse response) throws ServletException, IOException {
doMethod ("doDelete", request, response);
}

}



Here's the ResourceIdentifier class. It tells us how to map URI patterns to handlers, and maps matched pattern groups to attribute names.

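import java.util.regex.Matcher;
import java.util.regex.Pattern;

import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class ResourceIdentifier {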

private final Pattern pattern;
private final String[] attributeNames;
private final RequestHandler getHandler;
private final RequestHandler putHandler;
private final RequestHandler postHandler;
private final RequestHandler deleteHandler;

public ResourceIdentifier(String regex, String[] attributeNames, RequestHandler getHandler, RequestHandler postHandler,
RequestHandler putHandler, RequestHandler deleteHandler) {
this.pattern = Pattern.compile(regex);
this.attributeNames = attributeNames;
this.getHandler = getHandler;
this.postHandler = postHandler;
this.putHandler = putHandler;
this.deleteHandler = deleteHandler;
}


public ResourceIdentifier(String regex, RequestHandler supportsGet) {
this(regex, new String[] {}, supportsGet, null, null, null);
}

public ResourceIdentifier(String regex, String[] attributeNames, RequestHandler supportsGet) {
this(regex, attributeNames, supportsGet, null, null, null);
}

public ResourceIdentifier(String regex, RequestHandler supportsGet, RequestHandler supportsPost) {
this(regex, new String[] {}, supportsGet, supportsPost, null, null);
}

public ResourceIdentifier(String regex, String[] attributeNames, RequestHandler supportsGet, RequestHandler supportsPost) {
this(regex, attributeNames, supportsGet, supportsPost, null, null);
}

public boolean doGet(HttpServletRequest request, HttpServletResponse response)
throws Exception {

return doMethod(this.getHandler, request.getPathInfo(), request, response);
}

public boolean doPost(HttpServletRequest request, HttpServletResponse response)
throws Exception {

return doMethod(postHandler, request.getPathInfo(), request, response);
}

public boolean doPut (HttpServletRequest request, HttpServletResponse response)
throws Exception {

return doMethod(putHandler, request.getPathInfo(), request, response);
}

public boolean doDelete(HttpServletRequest request, HttpServletResponse response)
throws Exception {

return doMethod(deleteHandler, request.getPathInfo(), request, response);
}

/**
* Test the uri against our pattern. If matched, dispatch to the handler.
*/
private boolean doMethod(RequestHandler handler, String uri, HttpServletRequest request, HttpServletResponse response) throws Exception {
if (uri == null){
uri = "";
}
Matcher matcher = pattern.matcher(uri);
boolean bDid;
if (matcher.matches()) {
if (handler == null) {
response.sendError(HttpServletResponse.SC_METHOD_NOT_ALLOWED);
bDid = true;
} else {
dispatch(matcher, request, response, handler);
bDid = true;
}
} else {
bDid = false;
}
return bDid;
}


private void dispatch(Matcher matcher, HttpServletRequest request, HttpServletResponse response,
RequestHandler listener) throws Exception {

// The regex matched. Extract all the named attributes from the URL and
// set them as attributes on the request. Then invoke RequestHandler.
int n = matcher.groupCount() ;
if (n != attributeNames.length) {
throw new RuntimeException("must have same number of matches as attribute names");
}
for (int i = 0; i < n; i++) {
request.setAttribute(attributeNames[i], matcher.group(i + 1));
}
listener.invoke(request, response);
}
}


The RequestHandler interface is the merest slip of a thing:


import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public interface RequestHandler {
void invoke(HttpServletRequest request, HttpServletResponse response) throws Exception;
}
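
To see the pattern-to-attribute step on its own, without a servlet container, here is a small standalone sketch of what ResourceIdentifier.dispatch does with a matched URI. UriPatternDemo and extract are names invented for this illustration; the real class sets request attributes instead of returning a map.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class UriPatternDemo {
    /** Returns attribute name -> captured group, or null if the path does not match the pattern. */
    static Map<String, String> extract(String regex, String[] attributeNames, String path) {
        Matcher matcher = Pattern.compile(regex).matcher(path);
        if (!matcher.matches()) {
            return null;
        }
        if (matcher.groupCount() != attributeNames.length) {
            throw new IllegalArgumentException(
                    "must have same number of groups as attribute names");
        }
        Map<String, String> attributes = new LinkedHashMap<>();
        for (int i = 0; i < attributeNames.length; i++) {
            // Group 0 is the whole match, so captured groups start at 1.
            attributes.put(attributeNames[i], matcher.group(i + 1));
        }
        return attributes;
    }

    public static void main(String[] args) {
        // Same pattern and attribute name as entryIdentifier in BlogServlet.
        System.out.println(extract("^/(\\d+)$", new String[] {"entryId"}, "/42"));
    }
}
```

A path that matches no pattern yields null here, which corresponds to the base servlet falling through to a 404.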

Wednesday, August 08, 2007

We don't need PATCH

John Panzer asks:
....if PUT can be used to send just the part you want to change. This can be made to work but has some major problems that make it a poor general choice.
* A PUT to a resource generally means "replace", not "update", so it's semantically surprising.
* In theory it could break write-through caches. (This is probably equivalent to endangering unicorns.)
* It doesn't work for deleting optional fields or updating flexible lists such as Atom categories.


We could PUT an instance of a new MIME type, atom-update+xml, and the meaning of that document could be: Please selectively update fields of the entry resource, including optional fields and Atom categories.
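What the server would do on receiving such a document might look like the sketch below. The update format is entirely hypothetical: it assumes the document boils down to a set of fields to overwrite plus a set of optional fields to delete, and models the entry as a flat map. Nothing here is a defined atom-update+xml.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

final class EntryPatch {
    /**
     * Applies a selective update to an entry and returns the result.
     * The original entry map is left untouched.
     */
    static Map<String, String> apply(Map<String, String> entry,
                                     Map<String, String> fieldsToSet,
                                     Set<String> fieldsToRemove) {
        Map<String, String> updated = new LinkedHashMap<>(entry);
        updated.putAll(fieldsToSet);                 // overwrite or add fields
        updated.keySet().removeAll(fieldsToRemove);  // delete optional fields
        return updated;
    }
}
```

Because the media type, not the method, carries the "selective update" meaning, the explicit remove set covers the deleting-optional-fields objection in John's list.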

Tuesday, June 12, 2007

RIA can really suck

Rome Reborn is somebody's Flash wet dream. The content is spectacular (if you're into Roman history, as I am). But some lame brain has encapsulated all the content behind that one URL. I can't give you links to the interesting parts. I have to give you verbal instructions for navigating to different parts of the site. "Wave your mouse over the initial image until the Colosseum is mostly in view, then click, then...." In other words, it's like any desktop application -- which seems to be the idea behind rich internet applications -- "improve" the web until it sucks as much as Windows.

It's maddening, because the site contains tons of useful information -- but none of it is on the web. It's a black hole from which no information can escape.

If new RIA tools don't encourage authors to expose linkable resources, they're just going to continue to be irrelevant to the web.

Saturday, June 09, 2007

Microsoft vs TestDriven.Net Express

A sad Microsoft drone named Jason is getting famous for the worst reasons. He's a program manager for MS Visual Studio and he's hassling a well meaning fellow named Jamie who wrote a unit testing plugin for VS 2005.

Jamie wrote the plugin using VS 2005 Express, the "free" version, and got enough traction that MS made him an MVP. But then Jason noticed that the plugin supports VS Express. Bad Jamie -- no plugins allowed for Express.

Jamie has published the fatuous correspondence Jason directed his way. They took away his MVP, and MS lawyers are sending cease and desist letters.

Um, the guy is developing free software enhancing the value of all the VS products? Did I mention that part?

Great coverage from The Register.

Friday, June 08, 2007

Hyperbolism of the Month

...goes to (the envelope please)... Elliotte Rusty Harold!:
Java’s exception handling is the single best error handling and reporting mechanism ever built into a programming language.

I personally don't think you're going to beat the Common Lisp condition system. You can emulate Java-style exception handling, but it's so much richer to be able to restart in the context where the error occurred, without unwinding the stack.

Wednesday, June 06, 2007

What is Lisa Simpson Doing to that Poor Man?

Once someone has pointed out that interpretation of the London 2012 Olympics logo, it's forever ruined, isn't it? Sorry. (via DPF).


Monday, May 21, 2007

Fifty-four forty or fight!

Today I flunked this quiz question asked of registrants for a Dell sweepstakes:

Where in the U.S. did Dell recently build a Customer Contact Center?

  • Roseburg, Oregon
  • Oklahoma City, Oklahoma
  • Edmonton, Alberta
  • All of these
Yep -- I missed the correct answer: "All of these." (When did we get Alberta?)

Monday, May 07, 2007

Friday, May 04, 2007

RIA Not Advancing the Ball

Rich widgets obscure the semantics of hypertext. Only the code behind the form knows what the widget really does. Contrast to HTML 5, and in particular Web Forms 2. These guys are extending HTML to capture what we really do on the web. As a consequence, client programs can (or, have a chance to) understand the meanings of hypertext documents from the web.

Example: you want to write a script to automate some remote bookmark service, as part of your mashup. But this service, unlike del.icio.us, has no documented "API". So you have to download its form, complete it programmatically, and POST an entity.

Case 1: The form uses Plain Old HTML. You're golden. All the semantics are right there for you to parse, or read. You identify the name of the text box where you stick the URL, and the name of the text box where you add a description. You compose the URL encoded form data, and POST it to the action URI.

Case 2: The "form" uses Javascript to modify the DOM on the fly: the onload() method adds text boxes, and a submit button, to an empty DOM. In fact, it might not even use the submit button as a form element; when you press the button, its onclick() might send a custom XMLHttpRequest back to the server. Your code will never automate this interaction.

Case 3: The "form" uses XAML + Silverlight plugin. An exacerbated case of (2).
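
For Case 1, the scripted client's whole job fits in a few lines. A minimal sketch of composing the URL-encoded form data; the field names "url" and "description" are assumptions about what the hypothetical bookmark form calls its text boxes, and FormBody is a made-up class name.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

final class FormBody {
    /** Builds an application/x-www-form-urlencoded body from field name/value pairs. */
    static String encode(Map<String, String> fields) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            body.append(enc(field.getKey())).append('=').append(enc(field.getValue()));
        }
        return body.toString();
    }

    private static String enc(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("url", "http://example.org/");
        fields.put("description", "great post");
        System.out.println(encode(fields));
    }
}
```

The result is what you would POST to the form's action URI with Content-Type application/x-www-form-urlencoded. Cases 2 and 3 have no equivalent, because the field names never appear in anything the script can read.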

In contrast, Web Forms 2 attempts to capture the semantics of what we do with forms. Because browsers will understand more of the semantics of the form, we can do declaratively what we now have to do in Javascript. For example, lots of HTML forms now have to use script to add a row of controls to a form on the fly ("Click here to add another bookmark"). WF2 captures that as repeating control groups, and the browser can handle it.

(Then again, I am a documented forms nut.)

Mike Dierken justly analogizes: "RIA is to user interfaces as RPC is to messaging interfaces". And notice, it's Rich Internet, not Rich Web Applications. These technologies pay lip service to the web, but they're not advancing the ball toward building more and better links throughout the web information space.

P.S. Wonderful rant by Mark Pilgrim!

Update: fixed a link.

Thursday, May 03, 2007

RIA -- Fill 'er Up!

I'm having an ongoing email exchange with my friend Peter. He's convinced MS and Adobe herald a new age of Rich Internet Applications. He pointed me to this guy who's backed up a tanker to the Kool-Aid trough.

Sure, MS and Adobe have to sell something as the Next Thing -- what else have they got? But we've had RIA ever since Java 1.1 applets. We have Flash. We have <embed> and <object>. Do you really think what's been holding RIA back is the technology?

Users have voted with their mice, and they've voted for the web experience -- exploring the web information space using hyperlinks -- as far more important than whizzy UI. Ask eBay. Ask MySpace.

Flash, applets, Silverlight, Javascript -- the more you use them, the suckier your web apps are at exploring the web information space. I don't think it has to be this way, but it takes a design discipline few seem to have. These programming models are from the 80s. They have web APIs, but they're not web oriented. Programs end up as little desktop applications, not web apps. I don't see Silverlight changing that. It is good to have super expressive widgets -- hear hear. But if you're not pushing a bunch of hypertext down to my browser, you're not helping me explore the space.

Friday, April 27, 2007

The Penny Drops

It's enjoyable, and instructive, to watch the penny drop for venerable DCOMster/SOAPster Tim Ewald: I finally get REST. Wow.

Instructive, because coming from a strong RPC perspective, Tim illuminates the distributed application problem with slightly different shades. I like this graphical model:

Every communication protocol has a state machine. For some protocols they are very simple, for others they are more complex. When you implement a protocol via RPC, you build methods that modify the state of the communication. That state is maintained as a black box at the endpoint. Because the protocol state is hidden, it is easy to get things wrong. For instance, you might call Process before calling Init....The essence of REST is to make the states of the protocol explicit and addressible by URIs. The current state of the protocol state machine is represented by the URI you just operated on and the state representation you retrieved. You change state by operating on the URI of the state you're moving to, making that your new state. A state's representation includes the links (arcs in the graph) to the other states that you can move to from the current state.
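
Tim's graph can be sketched in a few lines: states are URIs, and each state's representation lists the URIs you may move to next. The LinkedStates class and the example URIs are assumptions for illustration, loosely modeled on his Init/Process example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class LinkedStates {
    // URI of a state -> URIs its representation links to (the arcs of the graph)
    private final Map<String, List<String>> linksByUri = new HashMap<>();

    void addState(String uri, String... linkedUris) {
        linksByUri.put(uri, Arrays.asList(linkedUris));
    }

    /** The links a client would find in the representation of this state. */
    List<String> linksFrom(String uri) {
        List<String> links = linksByUri.get(uri);
        return links == null ? Collections.<String>emptyList() : links;
    }

    /** A client may only move to a state that the current state links to. */
    boolean canMove(String fromUri, String toUri) {
        return linksFrom(fromUri).contains(toUri);
    }
}
```

If "/jobs/1" links only to "/jobs/1/init", a client holding that representation has no arc to "/jobs/1/process": the Process-before-Init mistake is visible in the document rather than hidden in a black box at the endpoint.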

Tuesday, April 17, 2007

Austin to Paris in 30 days

Since I'm traveling to Paris soon, thought I'd get directions from Google Maps. It's going to take 30 days, 9 hours. Notice item 28. (via Peter Flanagan).

Tuesday, April 10, 2007

Microsoft is Dead

Ha! I've been telling people Microsoft has become irrelevant. And now Paul Graham crystallizes the thought. Especially rich:
...I'm now surprised when I come across a computer running Windows. Nearly all the people we fund at Y Combinator use Apple laptops. It was the same in the audience at startup school. All the computer people use Macs or Linux now. Windows is for grandmas, like Macs used to be in the 90s. So not only does the desktop no longer matter, no one who cares about computers uses Microsoft's anyway.

An irascible colleague at a large software company used to say, "Hugh, you have to understand: XYZ isn't really a software company. It's an old folks home for software." XYZ had the same problem PG describes:
Microsoft's biggest weakness is that they still don't realize how much they suck. They still think they can write software in house. Maybe they can, by the standards of the desktop world. But that world ended a few years ago.

Tuesday, April 03, 2007

The cure is worse than the disease

This paper from Fortify makes the case that sending sensitive information using JSON exposes it to cross-site maliciousness. GMail sent your contact list down as JSON and evaled it. Turns out, any old site could do the same: just put a <script> tag referencing that contact list, and install some interceptor code that overloads setting e.g. the "email" property on any object: That enables the malicious code to see the values in the JSON.

Here are a couple of their proposed measures:

1. "Add the session cookie to the request as a parameter." Knee-slapper, that. See, the exploit only works because vulnerable sites put your identity into the cookie, and use a single URL for all users to download the object; the server uses the cookie to send you your personalized contact list. So the attacker just has to hardcode <script src="http://yoursite.com/contact-list">. The paper proposes uniquifying the URL. Here's an idea: design your app so that each user's info is at a unique URL in the first place!

2. Send all legitimate requests for JSON data using HTTP POST! That way you know any GET requests are malicious ones from <script> tags. They do concede that "The use of GET for better performance is encouraged by Web application experts from Sun and elsewhere". There's no use for this measure if you use unique URLs, of course.

So yeah, this is a serious problem, but not for apps using best web architecture practices. Millions of web developers read papers like that and then crap all over the web.
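For what it's worth, here is a rough Python sketch of one reading of "each user's info at a unique URL". The path layout, the class name, and especially the random token in the path are my assumptions for illustration; neither the paper nor the argument above prescribes them. The point is only that a hardcoded <script src> has nothing fixed to aim at.

```python
import secrets


class ContactListUrls:
    """Hands each user their own contact-list URL (hypothetical scheme)."""

    def __init__(self):
        self._user_by_path = {}

    def mint(self, user_id):
        # Assumption: an unguessable path segment, 16 random bytes, base64url-encoded.
        path = f"/users/{user_id}/contacts/{secrets.token_urlsafe(16)}"
        self._user_by_path[path] = user_id
        return path

    def resolve(self, path):
        # Unknown paths, including the old shared URL, map to no user at all.
        return self._user_by_path.get(path)


urls = ContactListUrls()
alice_url = urls.mint("alice")
bob_url = urls.mint("bob")
print(alice_url != bob_url)            # each user gets a distinct URL
print(urls.resolve("/contact-list"))   # the shared URL resolves to nobody
```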

Saturday, March 31, 2007

Life imitates art

During the Atom Publishing Protocol process, posting your cat pictures was a recurring use case. Now there's a whole site devoted to social cat picture publishing. John Panzer, is that you?

Update: I guess it's not John -- they only expose an RSS 2.0 feed.

Tuesday, February 27, 2007

Glitch Undercuts the Dow

The Dow fell over 400 points today. And some IT guy's ass is fired. From Stocks Have Worst Day Since 9/11 Attacks:

The Dow's decline accelerated at a faster than normal pace during the afternoon after a computer glitch kept some trades from being immediately reflected in the index of 30 blue chip stocks. Dow Jones & Co., the media company which manages the flagship index, said the problem occurred after it was discovered computers were not properly calculating trades, prompting a switch to a backup computer.

The result was a massive plunge in the average in the seconds it took Dow Jones to switch to its secondary computers.

Wednesday, February 14, 2007

SOA Facts

I came across this hilarious list of way more than ten SOA Facts, including

SOA is an anagram for OSA, which means female bear in spanish. It is a well-known fact in the spanish-speaking world that female bears are able to model business processes and optimize reusable IT assets better than any other hibernating animal

SOA actually stands for SOA Oriented Architecture

SOA is also a yoga posture that consists of performing all other yoga postures simultaneously


Submit your own fact. (This site really really needs a feed).

Friday, February 09, 2007

Tuesday, February 06, 2007

Forms Language Use Case

Tim Bray on the Atompub list:
If I fetch a service doc with a collection with no <app:categories>, does that mean the server is suggesting that I can post any category I want, or that I can't post any category at all?

This question would be a non-issue if APP used a forms language. A form for submitting an entry would have a) an enumerated list of choices, b) a free form text field, or c) no field at all to submit the category. No ambiguity.
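To make the three cases concrete, here is a small Python sketch of how a client might read the answer straight off the form. It assumes the form is well-formed XHTML without a default namespace and that the category field is tagged with a class of "atom:label"; that class name is a placeholder of mine, not anything Atompub defines.

```python
import xml.etree.ElementTree as ET


def category_policy(form_xml, category_class="atom:label"):
    """Return (policy, choices) describing what the form says about categories."""
    form = ET.fromstring(form_xml)
    for field in form.iter():
        if field.get("class") != category_class:
            continue
        if field.tag == "select":
            # a) the server enumerates the categories it will accept
            return "enumerated", [opt.get("value") for opt in field.findall("option")]
        # b) any other tagged field means free-form text
        return "free-form", None
    # c) no category field: the server is not asking for one
    return "none", None


ENUMERATED = """<form method="post" action="/feed/">
<select name="category" class="atom:label">
<option value="Programming"/><option value="Politics"/>
</select></form>"""
FREE_FORM = """<form method="post" action="/feed/">
<input type="text" name="category" class="atom:label"/></form>"""
NO_FIELD = """<form method="post" action="/feed/">
<input type="text" name="title" class="atom:title"/></form>"""

for form in (ENUMERATED, FREE_FORM, NO_FIELD):
    print(category_policy(form))
```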

Monday, February 05, 2007

Web 0.9

Sure took a long time to pay the car note this evening. I wish the developers had read Mark's caching tutorial (or rather that their J2EE framework developers had). Below are headers representative of about a hundred .gif, .css, and .js resources used on the page:


GET /navigation/images/global/company.gif HTTP/1.1
Host: www.financecompany.example.com

...

HTTP/1.x 200 OK
Server: IBM_HTTP_Server/2
Last-Modified: Tue, 20 Sep 2005 18:24:36 GMT
Etag: "31e467-450-2c11c100"
So far, so good

Accept-Ranges: bytes
Content-Length: 1104
Content-Type: image/gif
Expires: Mon, 05 Feb 2007 07:31:41 GMT
Cache-Control: max-age=0, no-cache, no-store
Pragma: no-cache
Date: Mon, 05 Feb 2007 07:31:41 GMT
Huh? It's a GIF that hasn't changed in a year.

Connection: keep-alive
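
To see why every one of those hundred resources gets refetched, here is a simplified Python sketch of the freshness calculation a cache might do. It is not a full implementation of the HTTP caching rules: it ignores s-maxage, private, and heuristic freshness based on Last-Modified, and it assumes header names arrive with exactly the capitalization shown. The function names and the one-day max-age in the second example are made up for illustration.

```python
from email.utils import parsedate_to_datetime


def cache_directives(value):
    """Parse a Cache-Control value into {directive: argument or None}."""
    directives = {}
    for part in value.split(","):
        name, _, arg = part.strip().partition("=")
        if name:
            directives[name.lower()] = arg.strip('"') or None
    return directives


def freshness_lifetime(headers):
    """Seconds a cache may reuse the response without revalidating (0 = never)."""
    cc = cache_directives(headers.get("Cache-Control", ""))
    if "no-store" in cc or "no-cache" in cc:
        return 0
    if cc.get("max-age") is not None:
        return max(0, int(cc["max-age"]))
    if "Expires" in headers and "Date" in headers:
        # Fall back to the gap between Expires and Date.
        delta = parsedate_to_datetime(headers["Expires"]) - parsedate_to_datetime(headers["Date"])
        return max(0, int(delta.total_seconds()))
    return 0  # simplification: real caches may apply a heuristic here


observed = {
    "Expires": "Mon, 05 Feb 2007 07:31:41 GMT",
    "Cache-Control": "max-age=0, no-cache, no-store",
    "Pragma": "no-cache",
    "Date": "Mon, 05 Feb 2007 07:31:41 GMT",
}
friendlier = {
    "Cache-Control": "max-age=86400",  # hypothetical one-day lifetime
    "Date": "Mon, 05 Feb 2007 07:31:41 GMT",
}
print(freshness_lifetime(observed), freshness_lifetime(friendlier))
```

With the headers as served, the GIF is never reusable; with even a modest max-age, the browser could skip the request entirely on the next page view.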

Tuesday, January 30, 2007

Floyd Fairness Fund

During the final week of the Tour de France, I published my initial doubts about the "science" accusing Floyd Landis of doping. All evidence since then reinforces my doubts.

Ultimately it all comes back to character: Do you believe the man or not? The evidence of doping that they have, alone, isn't convincing, but it would be consistent with doping. Lots of guys have gone to the mat lying. Floyd could just be one more.

I just don't think he is.

And now I know it. Because I don't think it's possible anyone, even Richard Virenque, would form a legal defense fund taking money from elderly cycling enthusiasts like me just to reclaim a ride in the pro peloton. If Floyd had doped and had created this fund... well, nobody is that low. If you doubt his word now, you are saying, Yes Hugh: He is THAT LOW.

I gave. I think it's important. It's really important that the innocent get off, if we're going to have faith in the convictions of the guilty.

Monday, January 29, 2007

Wait, *my* new publishing technique is unstoppable

[updated to correct an omission in POST description]

Bill outlines the conventional Atompub publishing pattern. I'm pitching this one:


GET Introspection URI

scan the list of workspaces for the collection you want to post the blogpost to. This list is not an Atom service document, but a microformat outline e.g. XOXO. (The Atompub WG considered and dropped using XOXO).

GET to Collection URI

read the nice atom feed

GET the "New Entry" URI from either the service doc or the feed doc.

Retrieve an XHTML microform describing how to construct a POST message

POST to the action URI of the form (not the blogpost's collection URI).

push a blogpost as URL-encoded form data (not formatted as a nice atom entry).

GET or HEAD to blogpost URI

grab the blogpost

PUT or DELETE to blogpost URI

change or delete the blogpost

Wednesday, January 24, 2007

Web Service of the Future: Comment Spam

Guess what web technology, right now, most successfully uses microforms a.k.a. forms-driven web services: Comment spam. Spambots crawl the web, looking for form fields they recognize, completing them, and submitting them. There's no one universal blog comment document format. Each site tells the bot, by using typical names in form fields, how to complete the form.

Imagine you had some service you actually want agents to be able to understand: a fare search site. Farebots would query your site for fares from Austin to Bangalore leaving Thursday. They'd know how to fill out the form field because you'd annotate each one with some agreed class attribute. Doesn't this sound a lot more realistic than getting everyone to agree on a common fare search document?

Thursday, January 18, 2007

Forms-driven Atom

Here's how forms-driven Atom publishing would work. Let's drop the XForms idea, and take a page from the smart microformats people: We'll overload XHTML forms with semantics, so that blogging clients can parse them, fill them in correctly, and then POST or PUT them. Call the idea "microforms".

Want an example? Your browser, right now, can auto fill lots of form fields based on the field name. And your browser retrieves that information from its own little database. Let's just generalize that idea, and furnish client programs with the ability to understand lots more about the services they're exercising.

Where do we get the semantics for Atom? The Atom syntax document goes a long way. Henry Story has constructed an OWL ontology for Atom. This is a good starting point. We can use those property and class names in XHTML class attributes.

Remember the motivation: We want web services to spread as powerfully as the real WWW has. The Atom group believes RESTful web services will do it. But little in Atom, or any other document driven protocol, resembles the actual web we experience. In Atom, they agree on a document format that carries semantics that all servers and clients have to understand, or at least handle gracefully. That concept has no analogy on the web. When you order airline tickets, do you complete the One Universal Reservation form? No. Each airline or agency has its own form, because each business has its own differentiators.

It's the same in web publishing. Consider Movable Type and Blogger. To create an entry for MT you fill out this form:

[Screen capture: the Movable Type entry form]

They have two places to enter content! One is the lead, the other is the "Read more...." section. There's no concept like that in Atom. I guess the MT guys will propose an extension. But then every other blogging service on the planet will have to handle the case where somebody submits an "mt:extendedContent" extension element. Blogger doesn't have that concept:

[Screen capture: the Blogger entry form]

I marked up those screen captures with tags grabbed loosely from the Atom ontology. We can make up more. Here's a microform MT could present (modulo GUI labels and layout):


<form method="post" action="/feed/">
<input type="text" name="title" class="atom:title"/>
<select name="category" class="atom:label">
<option value="Programming"/>
<option value="Politics"/>
<option value="Personal"/>
</select>

<textarea name="entry" class="atom:htmlContent"/>
<textarea name="extended" class="atom:htmlContent"/>
<input type="checkbox" name="draft" class="app:control" value="app:draft"/>
</form>



Later we can hash out the details of these class attributes. Maybe we do need an MT specific extension for that extended content. But only MT servers, or any service supporting extended content, will ever have to deal with clients that submit that field.

So the microforms idea is: You describe the semantics of the form elements in their class attributes. You program clients to GET forms, understand form elements, and fill them out correctly, getting the information from some kind of database, or from a live user.
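
Here is a rough Python sketch of what such a client might do with a microform like the one above. It is a sketch under assumptions, not a worked-out design: it assumes the form is well-formed XHTML with no default namespace, it fills fields that share a class in document order, and the function name and the sample values are made up for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

MICROFORM = """
<form method="post" action="/feed/">
<input type="text" name="title" class="atom:title"/>
<select name="category" class="atom:label">
<option value="Programming"/>
<option value="Politics"/>
<option value="Personal"/>
</select>
<textarea name="entry" class="atom:htmlContent"/>
<textarea name="extended" class="atom:htmlContent"/>
<input type="checkbox" name="draft" class="app:control" value="app:draft"/>
</form>
"""


def fill_microform(form_xml, values):
    """Match fields by class attribute; return (method, action, encoded body).

    values maps a class name to a list of values, consumed in document order.
    """
    form = ET.fromstring(form_xml)
    remaining = {cls: list(vals) for cls, vals in values.items()}
    pairs = []
    for field in form.iter():
        name, cls = field.get("name"), field.get("class")
        if name is None or not remaining.get(cls):
            continue  # not a field, or nothing we know how to supply
        value = remaining[cls].pop(0)
        if field.tag == "select":
            allowed = [opt.get("value") for opt in field.findall("option")]
            if value not in allowed:
                raise ValueError(f"{value!r} is not an offered choice for {name}")
        pairs.append((name, value))
    return form.get("method", "get").upper(), form.get("action"), urlencode(pairs)


method, action, body = fill_microform(MICROFORM, {
    "atom:title": ["Hello"],
    "atom:label": ["Politics"],
    "atom:htmlContent": ["Lead", "More"],
})
print(method, action, body)
```

A client that knows nothing about extended content would supply a single atom:htmlContent value and leave the second textarea unsubmitted, which is the point: only the servers that offer the field ever see it.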

If we do it well enough, we can build agents that syndicate stuff to all kinds of servers automatically.

(updated: fixed omitted class attribute in form example).

Wednesday, January 17, 2007

We're Writing Hardware

Using statically typed languages, we're constructing hardware, not software, says Steve. A great rant. Let me extend the analogy. Think about the bullet points they used to sell object oriented programming back in the 80s: "Hardened components", "Reusable". These qualities are the opposite of dynamic.

Wait! There's an alternative!

Common Lisp is in many ways really ideal: it's a dynamically typed language with optional type annotations (i.e. you build software, then selectively turn it into hardware), lots of great tools and documentation, all of the essential features of living software I've enumerated in this essay... However, it has stopped growing, and programmers can sense momentum like a shark senses blood.


Oh man. Don't give up on the old girl. The Lisp Renaissance is happening all around you. We have the popular book; now we just need the Django/Rails thing.

Thursday, January 11, 2007

The Volkswagen Lisp

I've been working in Lisp for about six weeks now, and I am far enough along to get this.
