Archive for February 2010
I’ve just been looking at the new Google Buzz API. What struck me first was that it is built up completely from other established protocols: Atom extended with Atom Threading Extensions, MediaRSS, Activity Streams, and PubSubHubbub. Notes on future directions include support for Atom Publishing Protocol for posting and editing messages, Salmon for upstream aggregation of comments, WebFinger for contact information resolution, and OAuth for delegating authorization. Not to mention that all of this is made possible by yet another layer of underlying open technologies such as XML (XML Namespaces in particular) and HTTP.
Google’s GData APIs are mostly built on Atom, with some custom extensions to support data elements that aren’t native to Atom. In this way, any client that understands how to deal with Atom data can get at least basic data from the Atom feed. This is an example of graceful degradation of data. I wrote several days ago about needing a data API for the web, but really we may just need common building blocks for creating our web APIs. User authentication is a case in point: many web services that don’t use basic HTTP authentication roll their own complicated token-based authentication schemes that take time to figure out before you even start getting data from the API. Standardizing on OAuth would not only ensure a basic level of security throughout the web, but it would ease the development of new services that consume data from many sources.
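That graceful degradation can be sketched with Python's standard library: a client that only understands plain Atom still gets the basic data out of a feed that carries an unfamiliar extension. (The `ex:rating` element and its namespace URI below are made up for illustration.)

```python
import xml.etree.ElementTree as ET

# An Atom feed carrying a hypothetical extension element in its own namespace.
FEED = """<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:ex="http://example.com/extension/1.0">
  <title>Example Feed</title>
  <entry>
    <title>First post</title>
    <ex:rating value="5"/>
  </entry>
</feed>"""

ATOM = "{http://www.w3.org/2005/Atom}"

def entry_titles(feed_xml):
    """Extract entry titles, silently ignoring any extension elements."""
    root = ET.fromstring(feed_xml)
    return [entry.findtext(ATOM + "title")
            for entry in root.findall(ATOM + "entry")]

print(entry_titles(FEED))  # → ['First post']
```

The plain-Atom client never has to know the extension exists; it just skips the elements it doesn't recognize.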
Everything that I have been talking about here is composition of technologies. I think that we are going to start seeing more composition of services that actually implement those technologies. There are risks in deciding to leverage a particular technology in a solution, such as the cost of re-engineering the product to use a new technology later on if needed. However, this risk at least doubles in the case of lock-in, where a third-party service is used along with a proprietary protocol: not only do you have to redesign some part of the product to use a new protocol, you also have to migrate your data. In order for computing as a utility to work, we have to be able to plug things together efficiently.
HTML won for human-readable data. We can point our browsers in any direction we please and expect to get something that is at least partially decipherable as human-readable content. The same is not yet true for machine-readable data.
What is the equivalent for data? It is not XML, since bare XML carries no semantics without a schema. So is it XML plus XML Schema? The closest thing is Atom or RSS. What about RDF? I think it is languishing due to its complexity.
Besides, things like Atom and RDF would just be the data format, not a protocol or a full API. SOAP is the closest thing we have there, but again it is too complicated: we would need a verbose WSDL document just to understand how to call the SOAP API.
We need something that:
- Is simple
- Degrades gracefully
- Is general
Does it need to be able to transfer something complex like genome data? I don’t think so. Most of what gets shuttled around is remarkably similar in structure: how different is an email from a tweet or a blog post? And even if we did need to transfer something like genome data, Google showed with GData that, by extending an open standard (Atom), many different types of APIs can be created despite the simplicity of the base format.
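A minimal sketch of that extension pattern, assuming a made-up genome vocabulary and namespace URI: the domain-specific elements ride alongside standard Atom ones, and a plain Atom client simply ignores what it doesn't recognize.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
# Hypothetical extension vocabulary; the URI and element names are made up.
GENOME_NS = "http://example.org/genome/1.0"

# Map prefixes used during serialization (Atom as the default namespace).
ET.register_namespace("", ATOM_NS)
ET.register_namespace("g", GENOME_NS)

# A standard Atom entry...
entry = ET.Element("{%s}entry" % ATOM_NS)
ET.SubElement(entry, "{%s}title" % ATOM_NS).text = "Sample sequence record"

# ...extended with domain-specific data in its own namespace.
seq = ET.SubElement(entry, "{%s}sequence" % GENOME_NS, {"length": "7"})
seq.text = "GATTACA"

xml_text = ET.tostring(entry, encoding="unicode")
print(xml_text)
```

The base format stays simple; all of the domain complexity lives in the extension namespace, exactly the trade GData makes.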
RSS proved that a relatively simple data format could be incredibly useful if it was universally supported. I think that a simple data format along with a simple HTTP-based API could be just as useful.
The age of mashups is just the beginning of the composable web. Current mashups are mostly aggregators. If the cloud keeps going as it has, most web applications will themselves be composed entirely of smaller services.
Cloud services at the storage and operating system level, and self-contained environments such as Google App Engine and Microsoft Azure, are just the beginning.