New directions in web architecture. Again.

A focus on data is finishing the revolution that started with Ajax.

In 2005, Jesse James Garrett at Adaptive Path published the seminal blog post “Ajax: A New Approach to Web Applications” and ushered in a new age of web architecture. Ajax meant using the possibilities latent in JavaScript (specifically, the XMLHttpRequest object) so that a web page could contact the server asynchronously and request new data.

This was revolutionary; within months, we were seeing pages that were more dynamic and interactive. Ajax short-circuited the submit/response loop that dominated web applications up to that time. Instead of making an HTTP request, receiving an entire web page, and rendering that page as a replacement for the current page, the browser requested a chunk of data. It used that chunk of data to interact with the DOM and rewrite the page it was displaying on the fly.
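
The contrast between the two loops can be sketched in a few lines. This is a minimal illustration in Python rather than browser JavaScript, with a plain dict standing in for the DOM; the names `full_page_response`, `data_response`, and `apply_update` are hypothetical and chosen only for this example.

```python
import json

# A toy "page": region ids mapped to their current contents.
# This dict is a stand-in for the browser's DOM, not a real DOM API.
page = {"header": "My Inbox", "unread-count": "3", "footer": "(c) Example"}


def full_page_response(unread: int) -> dict:
    """Pre-Ajax style: the server re-sends every region of the page."""
    return {"header": "My Inbox", "unread-count": str(unread), "footer": "(c) Example"}


def data_response(unread: int) -> str:
    """Ajax style: the server sends only the data that changed, as JSON."""
    return json.dumps({"unread-count": unread})


def apply_update(dom: dict, payload: str) -> None:
    """Client-side step: parse the chunk and rewrite only the affected regions."""
    for region, value in json.loads(payload).items():
        dom[region] = str(value)


# Old loop: the whole page is replaced by the response.
page = full_page_response(4)

# Ajax loop: only one region is touched; the rest of the page is left alone.
apply_update(page, data_response(5))
print(page["unread-count"])
```

In a real browser the second path would be an asynchronous XMLHttpRequest followed by DOM calls; the sketch only shows why the payload shrinks from a whole page to a chunk of data.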

Around the same time, the RESTful paradigm started taking hold. REST represented a much simpler, web-oriented way for servers to interact with their clients. As Roy Fielding pointed out in his dissertation, the basic operations of the HTTP protocol were capable of providing general access to data; stateful applications could be built upon stateless protocols; hypermedia could be used to maintain application state. Although Fielding’s dissertation dates back to 2000, it took a few years of bad experience with SOAP and its heirs to realize its importance. With REST, it becomes easier for a website to see itself as a source of data for machines to process, rather than as a source of content for humans to read. Websites become data servers.
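
As a rough illustration of how the basic HTTP operations can provide general access to data, here is a small sketch in Python. The `handle` function, the in-memory `store`, and the `/tweets/1` path are assumptions made for the example, not part of any real framework, and the hypermedia side of REST is not shown.

```python
from typing import Optional, Tuple

# In-memory "database" keyed by resource path; purely illustrative.
store = {}


def handle(method: str, path: str, body: Optional[dict] = None) -> Tuple[int, Optional[dict]]:
    """Map an HTTP method onto a data operation and return (status, body).

    Each call carries everything needed to serve it: no session state
    is kept between requests.
    """
    if method == "GET":
        return (200, store[path]) if path in store else (404, None)
    if method == "PUT":
        created = path not in store
        store[path] = body
        return (201 if created else 200, body)
    if method == "DELETE":
        if path in store:
            del store[path]
            return (204, None)
        return (404, None)
    return (405, None)  # method not allowed


print(handle("PUT", "/tweets/1", {"text": "hello"}))
print(handle("GET", "/tweets/1"))
print(handle("DELETE", "/tweets/1"))
print(handle("GET", "/tweets/1"))
```

The same four lines of client code would work against any resource the server exposes, which is what makes the site usable as a data server for machines.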

Important as Ajax and REST have been to the history of the web, each only represents half of a larger revolution. And in the past few months, we’ve seen some new sites that have taken the revolution to its logical conclusion. Specifically: take a look at the new Twitter. It’s a nice web application, sure — but look at the HTML. There’s not much there. The HTML page you get from Twitter is largely a bunch of empty divs, with a big wad of JavaScript. What’s happening? The JavaScript is the entire application; the divs exist only to provide tags so the JavaScript can rewrite the DOM at will. In turn, the JavaScript is constantly (and asynchronously) making requests from the Twitter site, which is just returning data from its API. In fact, the Twitter site is returning the same data for its web page that it would return for its mobile app, for TweetDeck, or for any of the apps in the Twitter ecosystem.

This design isn’t particularly new; we’ve seen it ever since developers started reverse-engineering Gmail and Google Maps to get ideas for their own projects. Those big Google apps may have been the first examples of this architectural trend. They were certainly among the first to use JavaScript as a full-fledged client programming language. But we’re seeing many more sites built along these lines. Why now, what does this shift mean, and why is it important?

“Why now” is perhaps the easiest question to answer. A few short years ago, web developers only had one platform to support, and that was the “browser.” Granted, there were a dozen or so browsers of significance, and the browser world was riddled with incompatibilities. We’re in a different world now. Browser incompatibilities have been ironed out, to some extent (though conscientious developers still support “legacy” browsers, all the way back to IE6 or even IE5). But it’s no news that the most important new apps these days run on devices ranging from phones (iPhone, Android, BlackBerry, Windows Phone) to tablets (iPad, Android/ChromeOS devices) and potentially ebook readers and other new devices. With these new devices on the table, browser incompatibilities pale in significance. It’s another sign of the times that I can’t conceive of an interesting application that doesn’t access data across the network. A static application that never accesses remote data? That’s so 1990s.

So, while it’s tempting to say that the new age is characterized by the browser as platform, and that applications running in the browser can do anything that native code can do, that’s looking in the wrong place. HTML5 certainly ups the ante as far as browser capabilities go, and it is supported to some extent by all of the other devices we’re concerned with. But the real meaning and importance of this architectural shift is on the server side, driven both by the need to support many heterogeneous device and application types, and by the primacy of live data in modern applications.

In the browser-dominated world, static content and data were inevitably mixed. Yes, we had templating systems that let developers separate static content and design elements from data. But once the application server did its magic, what was delivered to the browser was HTML pages mixing data with other content. Browsers were similar enough that, with some browser detection hacks on both the server and client side, it was relatively easy (though a pain) to generate pages that would run anywhere. That doesn’t work any more. It’s naive to think that you can wrap some HTML around data and be done with the job; the chances are that you’re leaving a huge chunk of your human audience behind, and making things more difficult for another audience: machines that just want to consume your data. To build a modern application, developers must focus on the data: they must see themselves as data providers, and they must develop documented and stable public APIs for accessing their data. Over the past few years, we’ve realized the importance of data. What’s the value of Google without the data behind it? Or Facebook? Or, going back 15 or so years, GNN? It took a long time for us to understand the importance of data, as opposed to “content.” But once you’ve learned that lesson, your design goals change: designing and publishing a stable API to a data service becomes the highest priority.

That’s the driving force behind this architectural shift. Front ends, user interfaces, clients, apps, whatever you decide to call them, don’t disappear. But we have learned how important it is to keep the data interface separate from the user interface. Your next project will probably have multiple front ends, some delivered through HTML5, and some delivered through native code. Building them on a common data API is going to be much cleaner and simpler. In addition, third parties can build their own apps on top of your API. An important component of Twitter’s success has been the ecosystem of applications that have built on their data: TweetDeck, TweetGrid, Tweetie, etc.; Twitdom, the Twitter applications directory, lists more than 1,800 apps. Until the “new Twitter” went live, the Twitter website was significantly less capable than most of the third-party apps.
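
One way to picture a common data API serving several front ends is the sketch below. It is an illustration under stated assumptions, not a description of Twitter’s actual implementation: `timeline_api`, `render_html`, `render_text`, and the sample data are all hypothetical.

```python
import html
import json


def timeline_api(user: str) -> str:
    """Hypothetical data endpoint: returns JSON only, no presentation."""
    tweets = [
        {"author": user, "text": "Data first <then> markup"},
        {"author": user, "text": "One API, many clients"},
    ]
    return json.dumps({"user": user, "tweets": tweets})


def render_html(payload: str) -> str:
    """Browser front end: wraps the data in markup, escaping user text."""
    data = json.loads(payload)
    items = "".join(f"<li>{html.escape(t['text'])}</li>" for t in data["tweets"])
    return f"<ul>{items}</ul>"


def render_text(payload: str) -> str:
    """Stand-in for a native or third-party client: same data, plain text."""
    data = json.loads(payload)
    return "\n".join(f"@{t['author']}: {t['text']}" for t in data["tweets"])


# Both front ends consume exactly the same payload from the data API.
payload = timeline_api("example_user")
print(render_html(payload))
print(render_text(payload))
```

Because neither renderer knows how the data was produced, a third party could add another front end without any change on the server side.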

Although it has been a long time coming, we’re finally finishing the revolution that started with Ajax. Get data that users care about, make it available via an API, provide a data presence that’s distinct from your HTML-based web presence, and build multiple front ends to serve your customers, on whatever platforms they care about. Your value is in the data.


  • Good article, but I don’t think we’re at the end of the revolution. I don’t even believe we’re at the end of the beginning, let alone the beginning of the end.

    The world you describe is predominantly read-only and is useful primarily for creating mash-ups. Yes, you can post to Twitter or Facebook through their respective APIs, but you must use a specific data format for each API. There is no normalization, no indirection. Copies of the same data abound, locked in different formats, frequently out of the true owner’s control.

    We can start talking about having truly arrived in a world of free-flowing data when users will be able to host their information only once in a place of their choosing, with full access permissions, and selectively grant other users (regardless of which API they employ) granular access to this information.

    So, I need to update my mobile phone number exactly once. People with access to this information (regardless of what address book, social network, or any other platform they use) will not hold a copy, they will hold a pointer to the one single instance of this data – which is controlled by me.

    And each API will be able to discover in which format the number is stored, and adapt accordingly.

    Loosely-coupled sets containing data in heterogeneous formats, controlled by users and merely relayed by APIs.

    That sounds a bit more revolutionary to me.

    • Excellent points. I agree, we’re at the beginning, rather than the end.

  • Alexandre Gandra

    You can deliver HTML, XML, or JSON by content negotiation, because it is part of the REST approach. In this sense, even HTML is data, if that is what you ask for.

    I totally agree that data and REST are involved in the next big steps of web apps. And I would love to see a combination of hypermedia and adaptive software. That is REST in all its glory!

  • Mmmm…. seminal!

  • This is the future of web based architecture, and as the number of client devices and client apps increases, it will be more and more important to have separate data and ui tiers.

  • I couldn’t agree more on the need to separate the data interface from the user interface in the new web ecosystem, where a wider set of devices act as platforms with many more senses than in the past (see, hear, talk, touch, geolocation, orientation-equilibrium, connection…).

    Sometimes, even data providers seem to forget this when they are focused on web publishing. Working on official statistics myself, I have many times stressed that APIs = freedom = wider dissemination (“Statistical dissemination 2.0?” 2009) and that providing “a data presence that’s distinct from your HTML-based web presence” (as you put it) is not only the path to open data but the only way to have a role in this complex ecosystem (“La difusión estadística y la apertura de datos gubernamentales” 2010 presentation with some text in Spanish but perfectly understandable).

  • Microsoft invented AJAX in 1999. Microsoft and many other sites were already using JavaScript and XHR to create “AJAX” apps long before it had this name. In fact, I seem to remember several other names being used at the time, before AJAX stuck.

    In the same way, Twitter (and hello, Google Instant) and many other sites have been sending more JSON data than markup for the past several years. Eventually this style of programming will get a catchy phrase too, but hopefully the credit won’t go to the person who merely coins the term.

  • We used to build everything on the server side and send it to a client. It merely had to render a big slab of HTML.

    Now the stack is splitting: thicker clients and thinner servers. Business logic and user interaction can now be done on the client. (Witness the rise of client-side MVC frameworks).

    And with localStorage etc, devices & browsers can go offline, but still let the user interact with the app – perhaps changing information that, upon connection, will need to be reflected back on the server.

    This will increasingly make the hardest part of web development that of data synchronization. And it could be very hard!

  • This will help many people understand how ajax functions.
    This helped me understand how newTwitter works.

    I agree with David Semeria about the lifecycle of this transformation and would like to state that inherently it will never end. First compatibility issues were related to browsers, then plugins, now API and tomorrow it will be something else.

    PHP, CSS and, in turn, XHTML are great, but they have a long way to go. I guess Java and HTML5 are the future. Is everything turning into C?

  • Love this post – I’ve been trying to articulate this for quite some time but you nail it.

    Delivering layout/structure/style+data in large chunks together made sense for where the web was at in 95-2000, then it made sense to decouple style from structure (CSS), then decouple structure from data on the server side (XSLT, well, never mind), and now we are at total separation of data from structure and style, plus breaking down those large chunks into small pieces.

  • I remember the nightmare that SOAP was. Brrr. What a horrible fad.

    What we need now is a little more competition for JavaScript. I think that a new programming language for the browser would be good for competition and innovation.