"json" entries

Squeaky Clean Ajax and Comet with Lift

Focus on application development, not the plumbing

Lift is a web framework for Scala, and is probably best known for having great Comet and Ajax support.

I’ve been touring the features of Lift that I find appealing. Initially I looked at designer-friendly templates and REST services. Recently, I highlighted the great features for organising and controlling access to content.

Let’s now take a look at the Ajax and Comet features of Lift.

Ajax

When I think of Ajax, I think of interacting with a web server, but avoiding page reloads.

An example: suppose we had a site that allowed you to write poems, and one feature of that site was a button you could click to get an associated word from a thesaurus. The thesaurus is large and we don’t want it loaded in the browser, so we need to call the server to use it.

The HTML template would be a field and a button:

<div data-lift="Associate">
  <input id="word" type="text" placeholder="Type a word">
  <input type="submit" name="submit">
</div>

So far, that’s probably similar across many web frameworks. What you might now expect is for us to define a REST endpoint and do some jQuery wiring to connect the service to the button.

In Lift, we can do that, but often the Ajax goodness comes via this kind of Scala code:

package code.snippet

import net.liftweb.util.Helpers._
import net.liftweb.http.SHtml.ajaxCall
import net.liftweb.http.js.JE.ValById
import net.liftweb.http.js.JsCmds.SetValById

class Associate {
  def render =
    "@submit [onclick]" #> ajaxCall(
      ValById("word"),
      w => SetValById("word", Thesaurus.association(w))
    )
}

This Associate snippet binds the submit button’s click event to a Lift Ajax call. That is, the left side of the replace function (#>) is a CSS selector targeting the button’s “onclick” attribute, and the right side arranges for the Ajax call to hit the server.

The first parameter to ajaxCall is the value passed from the browser to the server. We’ve asked for the value of the input field with an ID of “word”. The second parameter is the function to run when the button is pressed. Here we’re taking the word we’ve been sent, and updating the field with whatever our Thesaurus.association function gives us back.
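The snippet assumes a server-side Thesaurus object. Here’s a minimal sketch of what it might look like, with a purely illustrative word map (the point being that this is ordinary Scala, so it could just as well wrap a database or an external service):

object Thesaurus {
  // Illustrative associations only; a real thesaurus would be far larger.
  private val associations = Map(
    "rain"  -> "drizzle",
    "love"  -> "devotion",
    "grave" -> "solemn"
  )

  // Return an associated word, or echo the input when we have no match.
  def association(word: String): String =
    associations.getOrElse(word.trim.toLowerCase, word)
}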
Read more…

Scaling People, Process, and Technology with Python

OSCON 2013 Speaker Series

NOTE: If you are interested in attending OSCON to check out Dave’s talk or the many other cool sessions, click over to the OSCON website where you can use the discount code OS13PROG to get 20% off your registration fee.

Since 2009, I’ve been leading the optimization team at AppNexus, a real-time advertising exchange. On this exchange, advertisers participate in real-time auctions to bid on individual ad impressions. The highest bid wins the auction, and that advertiser gets to show an ad. This allows advertisers to carefully target where they advertise—maximizing the effectiveness of their advertising budget—and lets websites maximize their ad revenue.

We do these auctions often (~50 billion a day) and fast (<100 milliseconds). Not surprisingly, this creates a lot of technical challenges. One of those challenges is how to automatically maximize the value advertisers get for their marketing budgets—systematically driving consumer engagement through ad placements on particular websites, times of day, etc.—and we call this process “optimization.” The volume of data is large, and the algorithms and strategies aren’t trivial.

In order to win clients and build our business to the scale we have today, it was crucial that we build a world-class optimization system. But when I started, we didn’t have a scalable tech stack to process the terabytes of data flowing through our systems every day, and we didn’t have the team to do any of the required data modeling.

People

So, we needed to hire great people fast. However, there aren’t many veterans in the advertising optimization space, and because of that, we couldn’t afford to narrow our search to only experts in Java or R or Matlab. To give us the largest possible talent pool to recruit from, we had to choose a tech stack that was both powerful and accessible to people with diverse experience and backgrounds. So we chose Python.

Python is easy to learn. We found that people coding in R, Matlab, Java, PHP, and even those who had never programmed before could quickly get up to speed with Python. This opened us up to a tremendous pool of talent we could train in Python once they joined AppNexus. To top it off, the Python community is a great place to recruit engineers, and the PyData community is full of programmers who specialize in modeling and automation.

Additionally, Python has great libraries for data modeling. It offers strong analytical tools for analysts and quants; combined, Pandas, IPython, and Matplotlib give you much of the functionality of Matlab or R. This made it easy to hire and onboard quants and analysts who were familiar with those technologies. Even better, analysts and quants can share their analysis through the browser with IPython.

Process

Now that we had all of these wonderful employees, we needed a way to cut down the time to get them ramped up and pushing code to production.

First, we wanted to get our analysts and quants looking at and modeling data as soon as possible. We didn’t want them worrying about writing database connector code, or figuring out how to turn a cursor into a data frame. To tackle this, we built a project called Link.

Imagine you have a MySQL database. You don’t want to hardcode all of your connection information, because you want a different config for different users or for different environments. Link allows you to define your “environment” in a JSON config file, and then reference it in code as if it were a Python object.

 { "dbs":{
  "my_db": {
   "wrapper": "MysqlDB",
   "host": "mysql-master.123fakestreet.net",
   "password": "",
   "user": "",
   "database": ""
  }
 }}

Now, with only three lines of code you have a database connection and a data frame straight from your MySQL database. This same methodology works for Vertica, Netezza, Postgres, SQLite, etc. New “wrappers” can be added to accommodate new technologies, allowing team members to focus on modeling the data, not how to connect to all these weird data sources.

In [1]: from link import lnk

In [2]: my_db = lnk.dbs.my_db

In [3]: df = my_db.select('select * from my_table').as_dataframe()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 325 entries, 0 to 324
Data columns:
id         325 non-null values
user_id    323 non-null values
app_id     325 non-null values
name       325 non-null values
body       325 non-null values
created    324 non-null values
By having the flexibility to easily connect to new data sources and APIs, our quants were able to adapt to the evolving architectures around us, and stay focused on modeling data and creating algorithms.

Second, we wanted to minimize the amount of work it took to take an algorithm from research/prototype phase to full production scale. Luckily, with everyone working in Python, our quants, analysts, and engineers are using the same language and data processing libraries. There was no need to re-implement an R script in Java to get it out across the platform.
Read more…

The Appeal of the Lift Web Framework

The extreme end of weird (as far as web frameworks go)

Lift is one of the better-known web frameworks for Scala. Version 2.5 has just been released, so it seems like a good time to show features of Lift that I particularly like.

Lift is different from other web frameworks (in fact, I labeled it at the extreme end of weird in the first presentation I gave about it), but people who get into Lift seem to love the approach it takes. It’s productive and enjoyable, which goes well with Scala.

I’ll keep this post short. Just two things:

Transforms

You might be familiar with an MVC approach to the Web, where you have code that forwards to a view, and in that view you might use a little bit of markup to loop or display values. That’s not how it goes in Lift.

Instead, you start with the view first, and use HTML5 attributes to mark the parts of the view that need transforming. Here’s an example:

<p data-lift="Poems">
  The work of <span class="author">someone</span> includes
  <span class="title">something</span>.
</p>

That’s valid HTML5. You can view it in your browser, or edit it in Adobe Fireworks, or whatever tool you want. The only part of it that looks a little strange is the data-lift attribute. What that’s doing is naming a Lift snippet, and a snippet is just a class. It might look like this:

package code.snippet

import net.liftweb.util.Helpers._

class Poems {
  def render =
    ".author *" #> "Philip Larkin" &
    ".title *" #> "This Be The Verse"
}

What’s going on here? We’re using a nifty DSL in Lift to transform the <p> tag so that it contains the content we want. We’re saying…
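To make that concrete: with those two transforms applied, the markup served to the browser would look roughly like this (Lift consumes the data-lift attribute during rendering):

<p>
  The work of <span class="author">Philip Larkin</span> includes
  <span class="title">This Be The Verse</span>.
</p>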

Read more…

Four short links: 4 July 2013

Model-Driven Configuration, 1,000 RSS Readers Bloom, JSON Query Language, and Doug Engelbart’s Vision

  1. Ansible — model-driven configuration management, multi-node deployment/orchestration, and remote task execution system. Uses SSH by default, so no special software has to be installed on the nodes you manage. Ansible can be extended in any language.
  2. The Golden Age of RSS — one of the things I expected least to see in 2013 was that this year would mark the greatest flourishing of RSS reader applications in the decade since it first came to prominence on the web.
  3. JSONiq: the JSON Query Language — expressive and highly optimizable language to query and update NoSQL stores. It enables developers to leverage the same productive high-level language across a variety of NoSQL products. Implemented in Zorba, an Apache-licensed virtual machine for JSONiq and XQuery queries.
  4. Bret Victor on Doug Engelbart — if you attempt to make sense of Engelbart’s design by drawing correspondences to our present-day systems, you will miss the point, because our present-day systems do not embody Engelbart’s intent. Engelbart hated our present-day systems. Poetic, articulate, and bang on the money.

A Matter of Semantics

Structuring client-server communications with hypermedia messages

Messages on the Web carry three levels of information: Structure Semantics, Protocol Semantics, and Application Semantics. No matter the implementation style, all three of these are needed for any successful communication between client and server. This threesome (S-P-A) forms the essentials of communication over distributed networks.

Most of the time, though, these levels are obscured or muddled at implementation time. For example, both Protocol Semantics (how we create valid network requests) and Application Semantics (domain-specific names like users, customers, orders, etc.) are often mixed together in conversation ("You POST new users to this URL"), and both are usually defined only in human-readable documentation and implemented in the source code of the client application itself. In other words, the protocol-level and application-level semantics are tightly coupled. An easy way to discover this is to see whether you can take the same message format and implement your API using a protocol other than HTTP (e.g. WebSockets or FTP). I illustrated this "protocol-agnostic" design pattern back in 2010 ("A RESTful Hypermedia API in Three Easy Steps").
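To see the three levels side by side, consider a tiny, invented JSON message: its well-formed syntax is the Structure Semantics, the rel/href/method triple spells out the Protocol Semantics, and names like "user" and "orders" carry the Application Semantics.

{
  "user": {
    "name": "Ann",
    "links": [
      { "rel": "orders", "href": "/users/ann/orders", "method": "GET" }
    ]
  }
}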

But there is a way to keep these separate from each other and view each of these aspects in their own light. In doing so, you’ll strengthen the quality and value of your message design while increasing flexibility and choices.

Structure Semantics

Structure Semantics provides the set of rules regarding how to create a well-formed message. XML has rather simple structure semantics. JSON's rules for well-formedness are a bit more vague but reachable since JSON.parse(…) turns out to be the ultimate arbiter of such things. Determining well-formedness of other, more complex media types (HTML, Atom, HAL, Collection+JSON) is tougher, but do-able even if external validators are not always available.

Building a successful web implementation that does not contain structure semantics is difficult—and that's a good thing.

Read more…

What Kind of JavaScript Developer Are You?

Fault lines make conversation difficult

“JavaScript developer” is a description that hides tremendous diversity. While every language has a range of user skill levels, JavaScript has a remarkably fragmented community. People come to JavaScript for different reasons from different places, and this can make communication difficult. Sometimes it’s worse than that—not everyone likes everyone else.

“JavaScript developer” used to mean “web developer,” specifically a developer who spends a lot of time working in the browser. Even as JavaScript became a specialty of its own, most JavaScript developers came through a broader web practice first, learning HTML and CSS before tackling the DOM. This was my path, and is still a common one. It was reasonably easy to absorb JavaScript by example, using it as an object manipulation language before pushing into the harder corners.

Many programmers, however, started on the server-side, building code that filled templates. Server-side JavaScript existed, in the Netscape Enterprise Server, for example, but was a tiny fraction in a world dominated by Perl, Java, Python, Ruby, and ASP’s languages. Developers who spent most of their time writing code that ran on the server worked with a different set of tools and expectations, and had to shift gears as JavaScript became a more critical part of web applications. (Some of these developers generate a lot of JavaScript—JSON is, after all, JavaScript—but would prefer to think of JavaScript’s role in that as an accident.)

Another group of developers came from desktop applications, and expected more direct control over the interface. Many of these developers now understand JavaScript because the browser forces them to, not because they want to.

Involuntary JavaScript creates a tremendous amount of tension around the language. Normally, the people who dislike a programming language can just work with something else. JavaScript’s browser dominance makes that hard.

Other developers, though, are much, much happier and much deeper into JavaScript. The JavaScript revival that became visible with the rise of Ajax in 2005 gave the language greater credibility. Douglas Crockford’s JavaScript: The Good Parts, subtitled “Unearthing the Excellence in JavaScript,” demonstrated a powerful language lurking in a toolset many had considered trivial.

Over the past decade, JavaScript’s power and reach have grown thanks largely to a core group of developers whose focus on the language has created best practices and frameworks, while driving browser vendors to improve their implementations. Node.js emerged from this work, giving JavaScript a unique server-side framework that deeply reflects the language.

Read more…

Four short links: 22 October 2012

JSON Tool, Technology Arts, Pentesting Kit, and Open Access Week

  1. jq — command-line tool for JSON data.
  2. GAFFTA — Gray Area Foundation For The Arts. Non-profit running workshops and building projects around technology-driven arts. (via Roger Dennis)
  3. Power Pwn — looks like a power strip, is actually chock-full of pen-testing tools, WiFi, bluetooth, and GSM. Beautifully evil. (via Jim Stogdill)
  4. Open Access Week — this week is Open Access week, raising awareness of the value of ubiquitous access to scientific publishing. (via Fabiana Kubke)

Shrinking and stretching the boundaries of markup

Doing less and more than XML.

It’s easy to forget that XML started out as a simplification process, trimming SGML into a more manageable and more parseable specification. Once XML reached a broad audience, of course, new specifications piled on top of it to create an ever-growing stack.

That stack, despite solving many problems, brings two new issues: it’s bulky, and there are a lot of problems that even that bulk can’t solve.

Two proposals at last week’s Balisage markup conference examined different approaches to working outside of the stack, though both were clearly capable of working with that stack when appropriate.

Read more…

Applying markup to complexity

The blurry line between markup and programming.

When XML exploded onto the scene, it ignited visions of magical communications, simplified document storage, and a whole new wave of application capabilities. Reality has proved calmer, with competition from JSON and other formats tackling a wide variety of problems, while the biggest of the big data problems have such volume that adding markup seems likely to create new problems.

However, at the in-progress Balisage conference, it’s clear that markup remains really good at solving a middle category of problems, where its richer structures can shine without creating headaches of volume or complication. In the past, Balisage often focused on hard problems most people didn’t yet have, but this year’s program tackles challenges that more developers are encountering as their projects grow in complexity.

Read more…

Visualizing structural change

When information has structure we can use it to see change more clearly.

Think about the records that describe the status of your health, finances, insurance policies, vehicles, and computers. If the systems that manage these records could produce timestamped JSON snapshots when indicators change, it would be much easier to find out what changed, and when.
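As a purely illustrative sketch (not any particular system’s format), such a snapshot might be as simple as:

{
  "timestamp": "2013-07-04T12:00:00Z",
  "record": "auto-policy-1234",
  "changed": {
    "deductible": { "from": 500, "to": 1000 }
  }
}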