Tuesday, June 28, 2005

Tagging is easy. Semweb is hard.

Clay Shirky's "Ontology is Overrated" post is a great survey of why tagging is hot and semweb is not.

But there is a bit of a strawman in the argument. Clay sets up ontologies to be hierarchical organizations of concepts with firm inside/outside boundaries, and easily shows how inadequate that scheme is for describing the web.

But the semweb people actually designed in many of the features Clay likes about tagging. Anyone can make a new OWL ontology describing resources in an idiosyncratic way. OWL concepts overlap: a resource can belong to thousands of concepts. And most definitely, the link topologies are not restricted to hierarchies: an RDF graph looks just like the web.

Tagging is not taking off because it describes the web better than semweb. Tagging is taking off because it's easy, and semweb is hard. A tagging service wouldn't have gotten very far if you had to define your own classes and properties.

But everything you do with tags could be done using semweb. A tag is an OWL class with a name and that's about it: no other properties, but the members of the class (your tagged URIs) imply something about the class. You could take each user's tags and create his own personal ontology. No need to adopt anyone else's ontology. But to expand on Clay's "mind reading" analogy: if I determine that another user says "movies" to mean the same thing I do when I say "cinema", I could make that mapping through an OWL equivalence. Then all that user's "movies" tags become trusted indicators for movies in my searches.
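
Here is a rough sketch of that idea in plain Python rather than real OWL. It treats each (user, tag) pair as a class whose members are the tagged URLs, and an equivalence between two classes stands in for what owl:equivalentClass would express. The TagStore class, its method names, and the example.org URLs are all made up for illustration, not taken from any actual tagging service or semweb toolkit.

```python
from collections import defaultdict


class TagStore:
    """Toy stand-in for per-user ontologies: each (user, tag) pair is a class
    whose members are the URLs that user tagged. All names are hypothetical."""

    def __init__(self):
        self.members = defaultdict(set)     # (user, tag) -> set of URLs
        self.equivalent = defaultdict(set)  # (user, tag) -> equivalent classes

    def tag(self, user, tag, url):
        self.members[(user, tag)].add(url)

    def declare_equivalent(self, cls_a, cls_b):
        # Equivalence is symmetric, so record it in both directions.
        self.equivalent[cls_a].add(cls_b)
        self.equivalent[cls_b].add(cls_a)

    def search(self, user, tag):
        # Walk the equivalence links transitively, then union the members.
        seen, stack = set(), [(user, tag)]
        while stack:
            cls = stack.pop()
            if cls in seen:
                continue
            seen.add(cls)
            stack.extend(self.equivalent.get(cls, ()))
        urls = set()
        for cls in seen:
            urls |= self.members.get(cls, set())
        return urls


store = TagStore()
store.tag("me", "cinema", "http://example.org/a")
store.tag("other", "movies", "http://example.org/b")
store.tag("other", "music", "http://example.org/c")
store.declare_equivalent(("me", "cinema"), ("other", "movies"))
results = store.search("me", "cinema")  # my URLs plus the other user's "movies" URLs
```

Note that nobody had to agree on a shared vocabulary here: the only commitment is the one mapping I chose to make, and the other user's "music" tags stay out of my results.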

Clay is on the money with the observation that the real meanings of the terms are emerging from the statistical fog. You can make observations about the relations between tags based on the URLs that share them. "This URL probably represents a movie, as you think of 'movie'" is the function I need.
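
One crude way to get at that function, sketched in Python: measure how much two tags overlap in the URLs they share (Jaccard overlap is my choice here, not something Clay proposes), then score a URL by the best overlap between my tag and any tag already on it. The function names and the tiny data set are invented for the example, and a real system would want something statistically sturdier.

```python
def jaccard(a, b):
    """Overlap between two sets of URLs, from 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def score_url(url, my_tag, tag_urls):
    """Guess how well `url` fits `my_tag`: the best overlap between my tag
    and any tag already applied to that URL."""
    mine = tag_urls.get(my_tag, set())
    scores = [jaccard(mine, urls) for urls in tag_urls.values() if url in urls]
    return max(scores, default=0.0)


# Hypothetical data: tag -> set of URLs carrying that tag.
tag_urls = {
    "cinema": {"u1", "u2", "u3"},
    "movies": {"u2", "u3", "u4"},
    "recipes": {"u5"},
}

movie_like = score_url("u4", "cinema", tag_urls)  # tagged "movies", which overlaps "cinema"
unrelated = score_url("u5", "cinema", tag_urls)   # tagged only "recipes", no overlap
```

With this data, "u4" scores well above "u5" for my "cinema" tag even though I never tagged either one myself.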

