Hugh Winkler holding forth on computing and the Web

Tuesday, June 28, 2005

Tagging is easy. Semweb is hard.

Clay Shirky's "Ontology is Overrated" post is a great survey of why tagging is hot, semweb is not.

But there is a bit of a strawman in the argument. Clay sets up ontologies to be hierarchical organizations of concepts having firm inside/outside boundaries, and easily shows how inadequate that scheme is to describe the web.

But the semweb people actually designed in many of the features Clay likes about tagging. Anyone can make a new OWL ontology describing resources in an idiosyncratic way. OWL concepts overlap: a resource can belong to thousands of concepts. And most definitely, the link topologies are not restricted to hierarchies: an RDF graph looks just like the web.

Tagging is not taking off because it describes the web better than semweb. Tagging is taking off because it's easy, and semweb is hard. del.icio.us wouldn't have gotten very far if you had to define your own classes and properties.

But everything you do in del.icio.us could be done using semweb. A tag is an OWL class with a name and that's about it: no other properties, but the members of the class (your tagged URIs) imply something about the class. You could take each user's tags and create his own personal ontology. No need to adopt anyone else's ontology. But to expand on Clay's "mind reading" analogy: if I determine that another user says "movies" to mean the same thing I do when I say "cinema", I could make that mapping through an OWL equivalence. Then all that user's "movies" tags become trusted indicators for movies in my searches.
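To make the tag-as-class idea concrete, here is a minimal sketch in plain Python rather than real OWL. It treats each (user, tag) pair as a class whose members are the tagged URIs, and models the equivalence (owl:equivalentClass in OWL terms) as a symmetric link between two such classes. The TagStore class, its method names, and the example.org URIs are all made up for illustration; a real system would more likely store this as RDF.

```python
from collections import defaultdict


class TagStore:
    """Toy model: each (user, tag) pair is a class whose members are URIs."""

    def __init__(self):
        self.members = defaultdict(set)     # (user, tag) -> set of URIs
        self.equivalent = defaultdict(set)  # (user, tag) -> equivalent (user, tag) pairs

    def tag(self, user, tag, uri):
        self.members[(user, tag)].add(uri)

    def declare_equivalent(self, a, b):
        # Equivalence is symmetric, so record the link in both directions.
        self.equivalent[a].add(b)
        self.equivalent[b].add(a)

    def lookup(self, user, tag):
        """Return URIs in this class and in every class reachable via equivalences."""
        seen, stack, uris = set(), [(user, tag)], set()
        while stack:
            key = stack.pop()
            if key in seen:
                continue
            seen.add(key)
            uris |= self.members.get(key, set())
            stack.extend(self.equivalent.get(key, ()))
        return uris


store = TagStore()
store.tag("hugh", "cinema", "http://example.org/film-a")
store.tag("other", "movies", "http://example.org/film-b")
store.declare_equivalent(("hugh", "cinema"), ("other", "movies"))
print(sorted(store.lookup("hugh", "cinema")))
```

After the equivalence is declared, a search on my "cinema" also returns the other user's "movies" URIs, and nobody had to adopt anybody else's vocabulary.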

Clay is on the money with the observation that the real meanings of the terms are emerging from the statistical fog. You can make observations about the relations between tags based on the URLs that share them. "This URL probably represents a movie, as you think of 'movie'" is the function I need.
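One crude way to sketch that function is to measure how much two tags overlap in the URLs they have been applied to, then score a new URL by the best-overlapping tag it carries. The sketch below uses Jaccard similarity for the overlap. That choice, the function names, and the sample data are my assumptions for illustration, not anything del.icio.us or Clay's post specifies.

```python
def jaccard(a, b):
    """Overlap between two sets of URLs: |A & B| / |A | B|, or 0.0 if both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def score_url(url, my_urls, other_tags):
    """Guess how likely `url` belongs under my tag.

    my_urls:    set of URLs I have given my tag (e.g. my "cinema").
    other_tags: dict mapping someone else's tag name -> set of URLs.
    The score is the best similarity among the other tags that contain `url`.
    """
    return max(
        (jaccard(my_urls, urls) for urls in other_tags.values() if url in urls),
        default=0.0,
    )


my_cinema = {"u1", "u2", "u3"}
others = {"movies": {"u2", "u3", "u4"}, "recipes": {"u4", "u5"}}
print(score_url("u4", my_cinema, others))  # scored via the "movies" overlap
print(score_url("u5", my_cinema, others))  # only tagged "recipes"
```

A real system would want something more robust than a single best match, such as weighting by how many users agree, but the shape of the computation is the same.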

Monday, June 27, 2005

Service specific operations and machines

Service independent operations are valuable when the agent invoking them is the kind of agent that talks to lots of different services: a web browser + a human to make sense of what he browses and make choices accordingly.

If the agent is service specific, then heck, just design in all the specific operations you want.

Machine to machine conversations are almost always service specific. You have to program the client to understand how to proceed through the legal application states. The guy programming the client needs to know... First you do this to get this result, then you use that result to make a second query, and so on. It's no help to have a service independent operation set if you're using it that way.
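A small sketch of what that hard-wired knowledge looks like in a client. Everything here is hypothetical: FakeWellService stands in for a remote service (no network involved), and the two operations and their data are invented for illustration.

```python
class FakeWellService:
    """Hypothetical stand-in for a remote service."""

    def find_well(self, name):
        # Returns a well id, or None if the name is unknown.
        return {"North-1": "w-42"}.get(name)

    def list_logs(self, well_id):
        return {"w-42": ["log-a", "log-b"]}.get(well_id, [])


def fetch_logs_for(service, well_name):
    # Step 1: the programmer had to know that the id lookup comes first.
    well_id = service.find_well(well_name)
    if well_id is None:
        return []
    # Step 2: the result of step 1 feeds the second query.
    return service.list_logs(well_id)


print(fetch_logs_for(FakeWellService(), "North-1"))
```

The ordering lives in the client code, not in anything the service returns. Swapping the two named operations for a uniform operation set would not remove that coupling.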

Service independent operations would be valuable in the machine to machine case if you could invent a surrogate for the human: an intelligent machine agent able to make choices, given some output from the last operation. I've said that before, I know. Just thought it would be useful to state it a little differently.

(Caveat: written after a 16-hour day constructing SOAP services for machine to machine cases.)

Tuesday, June 21, 2005

WSDL 2.0

Dave Orchard's remarks on WSDL 2 encourage me to replace my own cooked up SDL with WSDL 2. Sounds a little daunting though: "WSD WG decided that the specs were for toolkit authors not wsdl document authors."

I will report experimental results here.

RESTful Web Service Descriptions

I've fleshed out some of the details I left dangling in my previous post about WITSML. The WITSML stuff is so specialized, I've created a separate blog for it.


But if you are interested in the web description language (web-http-desc) discussion there is some meat here for you as well. This draft documents RESTful access to WITSML services, and writing it was a great exercise for the description discussion. (I didn't write it as an exercise; our product implements some of that protocol right now; but I had never written it down in one place).

WITSML is the "Wellsite Information Transfer Standard Markup Language". It defines not only document formats, but a SOAP API that has the following operations:

AddToStore
GetFromStore
UpdateInStore
DeleteFromStore

Look familiar? Well, GetFromStore is a query, and has a query string parameter, but generally, you can do 90% of WITSML just by GET, PUT, and DELETE on objects, and the other 10% are vanilla things you could do with POST.
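As a rough sketch of that correspondence, the snippet below maps each of the four operations to an HTTP method and builds a request line. The mapping is only one plausible reading and the draft itself may differ. In particular, sending both AddToStore and UpdateInStore through PUT is my assumption, and the base URL and object paths are made up.

```python
from urllib.parse import urlencode

# Assumed mapping, for illustration only.
ASSUMED_METHOD = {
    "AddToStore": "PUT",
    "GetFromStore": "GET",
    "UpdateInStore": "PUT",
    "DeleteFromStore": "DELETE",
}

BASE_URL = "http://example.org/witsml/"  # hypothetical service root


def to_http(operation, object_path, query=None):
    """Return (method, url) for a store operation on the given object path."""
    method = ASSUMED_METHOD[operation]
    url = BASE_URL + object_path.lstrip("/")
    # Only the query operation carries a query string.
    if operation == "GetFromStore" and query:
        url += "?" + urlencode(query)
    return method, url


print(to_http("GetFromStore", "wells", {"name": "A-1"}))
print(to_http("DeleteFromStore", "wells/w1"))
```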

If you are interested in the web description language discussion, I invite you to have a look at the WITSML+REST draft and comment here or on the list.