Can We Do Better Than XML and JSON?

FtanML looks for the best of both

Today’s Balisage conference got off to a great start. After years of discussing the pros and cons of XML, HTML, JSON, SGML, and more, it was great to see Michael Kay (creator of the Saxon processor for XSLT and XQuery) take a fresh look at what a markup language should be.

Many recent efforts have been reductions. JSON was extracted from JavaScript. XML was a simplification of SGML. MicroXML pushes simplification much further. Reductions are great for cleaning up past practice and (usually) making tools more accessible, but genuinely new features come later, if at all. The JSON and XML camps mostly stare at each other warily, and though people mix the two, there’s little real “best of both worlds.”

Kay, with the benefit of time in Ftan, Switzerland, and a group of students, looked at reconciling the two approaches. Compatibility with the past was unimportant, but the result had to be as good as JSON at typed data and as good as XML at documents. The (relative) human-readability of XML remained important, as did JSON’s ease of yielding data structures.

This is, of course, risky:

…the implicit decision that we would not compromise technical quality in the interests of market acceptance. The aim was to do it right, and we would not measure success by the level of adoption.

The result was three pieces: a markup language, a schema language, and a scripting/transformation language. The markup language supports a few key types: boolean, number (always decimal), string, list, element, null, and rich text (also known as mixed content). The syntax uses a mix of JSON and XML symbols and conventions, offering more flexibility than JSON while staying much more concise than XML. There are no namespaces, something that cheered me immensely. One of Kay’s examples gives a good sense of the syntax:

<purchaseOrder 
   orderDate="1999-10-20" 
   shipTo = <country="US" [
      <name "Alice Smith">
      <street "123 Maple Street">
      <city "Mill Valley">
      <state "CA">
      <zip 90952>
   ]>
   billTo = <country="US" [
      <name "Robert Smith">
      <street "8 Oak Avenue">
      <city "Old Town">
      <state "PA">
      <zip 95819>
   ]>
   comment = |<emph |Hurry|>, my lawn is going wild|
   items = [
      <  partNum="872-AA"
         productName="Lawnmower"
         quantity=1
         USPrice=148.95
         comment=|Confirm this is |
      >
      <  partNum="926-AA"
         productName="Baby Monitor"
         quantity=1
         USPrice=39.98
         shipDate="1999-05-21"
      >
   ]
>
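
For readers who think in code, here is a rough sketch of how the types listed above might map onto Python values, using a trimmed-down version of the purchase order. It is an illustration only, not anything from Kay's paper: the class names `Element` and `RichText`, their fields, and the helper `order_total` are all hypothetical, and the mapping of FtanML numbers to `Decimal` is an assumption based on the "all decimal" description.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import Optional, Union


@dataclass
class RichText:
    """Mixed content: an ordered run of strings and elements (hypothetical model)."""
    parts: list


@dataclass
class Element:
    """An element with an optional name, attributes of any value type, and content."""
    name: Optional[str] = None
    attributes: dict = field(default_factory=dict)
    content: "Value" = None


# The seven value kinds named in the text: null, boolean, number, string,
# list, element, rich text. Decimal stands in for "all decimal" numbers.
Value = Union[None, bool, Decimal, str, list, Element, RichText]

order = Element(
    "purchaseOrder",
    attributes={
        "orderDate": "1999-10-20",
        # Attribute values can be elements, not only strings.
        "shipTo": Element(
            attributes={"country": "US"},
            content=[
                Element("name", content="Alice Smith"),
                Element("zip", content=Decimal("90952")),
            ],
        ),
        # Rich text mixes elements and plain strings.
        "comment": RichText(
            [Element("emph", content=RichText(["Hurry"])), ", my lawn is going wild"]
        ),
        # A list of anonymous elements.
        "items": [
            Element(attributes={"partNum": "872-AA", "quantity": Decimal("1"),
                                "USPrice": Decimal("148.95")}),
            Element(attributes={"partNum": "926-AA", "quantity": Decimal("1"),
                                "USPrice": Decimal("39.98")}),
        ],
    },
)


def order_total(po: Element) -> Decimal:
    """Sum price times quantity over the items list (hypothetical helper)."""
    return sum(
        (item.attributes["USPrice"] * item.attributes["quantity"]
         for item in po.attributes["items"]),
        Decimal("0"),
    )
```

The point of the sketch is that attributes holding elements and lists fall out naturally as nested values, which is much of what makes the syntax feel closer to JSON's data structures than to XML's string-only attributes.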

FtanGram is the schema language, if you’re into such things. It is one of the simplest I’ve seen, and relatively easy for humans to read directly. The scripting language, FtanSkrit, was exciting in a different way, as it relies on an extra type in FtanML: functions. I keep having conversation after conversation about data flows through functions replacing object hierarchies, so FtanSkrit seems to be in exactly the right place.

(Later in the day, Dimitre Novatchev looked at programming, especially functional programming, in XPath 3.0. Functional programming is a constant and rising theme in my technical life.)

Is it too late for something new in markup? Can a study project make a difference? I’m not sure, but I’m cheering for FtanML anyway.
