At the Web 2.0 conference, I was able to sit down with the leading advocates for two very different approaches to enriching the information embedded in blogs: Tantek Çelik of Technorati and Bob Wyman of PubSub.
Let me explain what I mean with an example. Say you write a blog entry with information related to an upcoming event, like this:
Symposium on Social Architecture
Just a reminder that space is limited at the upcoming Symposium on Social Architecture scheduled for 14-15 November 2005 at the Harvard Law School, Cambridge MA, held in collaboration with the Berkman Center.
In this example, I have provided certain information -- name of the event, date and location -- but readers who want to use that information, for example to add the event to their personal calendar, would have to cut and paste it. Imagine if there were some way for the author of the post to add meta-information about the content of the post, so that the embedded information could be extracted automatically by tools, and/or presented in a distinctive way. In this case, the appropriate tool would 'know' about embedded calendar information, and the display might somehow indicate that the post held calendar information. The same arguments hold if the post were a movie review, or contact information.
I first sat down with Tantek, who walked me through the microformat approach to this problem. This approach is based on adding specific class names (the same class attribute that CSS uses) to the XHTML (Extensible HTML) elements that wrap the embedded information. In the case of adding event info to the post, it would be annotated like so:
<span class="vevent">Symposium on Social Architecture
Just a reminder that space is limited at the upcoming <a class="url" href="http://www.corante.com/events/ssa"><span class="summary">Symposium on Social Architecture</span></a> scheduled for <abbr class="dtstart" title="2005-11-14">14</abbr>-<abbr class="dtend" title="2005-11-16">15 November 2005</abbr> at the <span class="location">Harvard Law School, Cambridge MA</span>, held in collaboration with the Berkman Center.</span>
The class names are derived from the property names of the iCalendar format, so class="summary" indicates the name of the event, class="dtstart" the starting date, class="dtend" the ending date, and class="location" the location.
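To make the "extracted automatically by tools" idea concrete, here is a minimal sketch in Python of how a tool might pull those properties back out of an annotated post. It is only an illustration under stated assumptions: the `VEventParser` class and `parse_vevent` function are hypothetical names of my own, the sample markup is a trimmed copy of the post above, only four properties are handled, and the markup is assumed to be well-formed with no void tags such as `<br>`. It is not the XSLT-based converter mentioned in this post.

```python
from html.parser import HTMLParser

# Subset of hCalendar property class names handled by this sketch.
PROPERTIES = {"summary", "dtstart", "dtend", "location"}

# Trimmed copy of the annotated post (well-formed, no void tags).
SAMPLE = (
    '<span class="vevent">Just a reminder about the '
    '<span class="summary">Symposium on Social Architecture</span> '
    'scheduled for <abbr class="dtstart" title="2005-11-14">14</abbr>-'
    '<abbr class="dtend" title="2005-11-16">15 November 2005</abbr> at the '
    '<span class="location">Harvard Law School, Cambridge MA</span>.</span>'
)


class VEventParser(HTMLParser):
    """Hypothetical helper: collects class-annotated event properties."""

    def __init__(self):
        super().__init__()
        self.event = {}
        self._open = []  # one entry per open tag: property name or None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        prop = next((c for c in classes if c in PROPERTIES), None)
        if prop and tag == "abbr" and attrs.get("title"):
            # The machine-readable value lives in title; skip the visible text.
            self.event[prop] = attrs["title"]
            prop = None
        elif prop:
            self.event[prop] = ""
        self._open.append(prop)

    def handle_endtag(self, tag):
        if self._open:
            self._open.pop()

    def handle_data(self, data):
        # Text belongs to the innermost open element, if it is a property.
        prop = self._open[-1] if self._open else None
        if prop:
            self.event[prop] += data


def parse_vevent(markup):
    """Return a dict of the event properties found in the markup."""
    parser = VEventParser()
    parser.feed(markup)
    parser.close()
    return parser.event


event = parse_vevent(SAMPLE)
print(event)
```

The point of the sketch is that the dates come from the title attributes of the abbr elements, while the visible text ("14", "15 November 2005") stays human-friendly. A real converter would need to cope with messier markup and with the full set of properties.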
Note: I tried to create a calendar of events in the left margin using this approach, and I had a few problems importing the entries into my iCal. They import, but the dates don't always seem to work. The culprit may be the specific XSLT (Extensible Stylesheet Language Transformations) file that is being used to convert the XHTML-encoded calendar information into an iCal file. This is not a flaw of the specification, or the approach, but of the particular implementation available at this point. Update: Turns out I just had a conceptual problem: I was thinking that dtend worked differently. If an event is intended to run through 15 Nov, without a stipulated hour of ending, then you should encode dtend as 16 Nov!
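The dtend rule is easy to get wrong, so here is a small Python sketch of it, assuming an all-day event with no ending hour: the encoded end date is the day after the last day of the event. The function name `exclusive_dtend` is my own, invented for illustration.

```python
from datetime import date, timedelta


def exclusive_dtend(last_day: date) -> str:
    """Return the dtend value for an all-day event whose final day is last_day."""
    # The end date is exclusive, so it is one day past the last day.
    return (last_day + timedelta(days=1)).isoformat()


dtend = exclusive_dtend(date(2005, 11, 15))
print(dtend)  # 2005-11-16
```

That is why the sample markup shows "15 November 2005" to the reader but carries 2005-11-16 in the title attribute.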
My natural inclination is to adopt the microformat model, perhaps because I have already delved deep into hacking my MT templates, and manually encoding Technorati tags on posts. Those who are less comfortable with the technical details of XHTML may find this microformatting intimidating, however. In the future, blog tools that create microformatted information using forms or other user-friendly approaches will reduce the complexity involved in microformats, and some of these are now becoming available.
Bob Wyman wants us to go a different way, avoiding the microformat embedding of information into XHTML classes, and instead relying on an XML-based approach called Structured Blogging. Unlike microformats, structured blogging relies on blog plugins, which makes it easier to use, but limits its application to things that blogging tools support, like the creation of blog posts.
I don't use WordPress -- the only platform for which a structured blogging plugin is available -- but the website demonstrates the neat look of book and music reviews encoded by structured blogging. Note the 'four out of five' graphics.
Structured blogging relies on the specification of an XML layout for each of the associated forms of posts: reviews, calendar entries, and the like. These correspond, more or less, to the reapplication of calendar and address book standards in the microformats approach.
Tantek's arguments for microformats include the adoption of the approach by a bunch of different companies and individuals: an argument for openness. Bob suggests that structured blogging is just as open, since others can collaborate in the process. My viewpoint is that it is almost impossible to disassociate the interests of these individuals and their respective companies from the discussion of the pros and cons of these approaches. Tantek and Technorati have been very successful in getting adoption of the Technorati tag format, which is a microformat based on the use of the rel="tag" attribute as a means to indicate that a URI is a tag reference. Technorati now has one of the largest tagspaces in the world, if not the largest. Perhaps they would like in the future to develop databases bursting with contact and event information, too. Bob and PubSub would like to get people creating structured blog posts so that PubSub can more easily make sense of reviews: for example, determining the average review of "Understanding Comics". As a result of this conflict of interest, we need to discount the arguments of the proponents.
My gut feeling is that structured blogging requires too much formalization of what people do on their blogs, and microformatting tools are more likely to be adopted in a dynamic, bottom-up, changing, and innovative environment. However, adoption of structured blogging will certainly be accelerated by the roll-out of plugins for other blogging tools, which are in the works. It may come down to a battle of the tools -- who creates a better set of tools for authors -- rather than the pros and cons of the models themselves. My bet is on microformats, but there is definitely an important footrace going on in this corner of the blogosphere.
Update: I just noted that Upcoming.org provides microformatted calendar information for all its events.