Toward Explicit State

Minimize assumptions, maximize testability

Model the flow of data instead of the logic of the program? That’s crazy! How can you encapsulate anything that way?

My piece on flow-based programming set off a lot of conversations, notably at Slashdot and Reddit. Many of them were hostile, wondering how such a different model could possibly work. Not all of that hostility was useful, but a lot of it was the right kind of productive skepticism.

The reason this makes sense to me isn’t my time programming. It’s my time in markup that makes it seem sensible. While there are ways to incorporate content by reference (entities) in markup, parsers typically flatten those references into a single tree that is “the document”. Markup processing certainly can combine document transformations with data picked up from some other aspect of the program, but clean transformations, in which one document goes in and another comes out, are pretty ordinary there.

In my broader programming experience, clean transformations are rare. Most of the code that I’ve written is all about creating logical components (often mixed with data) that communicate with other components. Data flows in fragments, and comes and goes from databases, other processing, and whatever structures seem convenient. Testing pieces in isolation is difficult, and worse, not always meaningful. If tests only apply when a program or process is in one state but not another, gaps will grow and test suites will expand perpetually.

The path I see forward – whether in functional programming, flow-based programming, or other attempts at sanity – is making state explicit at every step. This means structuring flows of information so that pure transformations are separated from side effects. Transformations take all the data they need directly as input, and generate only resulting data as output. Other functions (or processes, or whatever name they have in your system) handle all the interactions that have side effects. “Side effects” includes a lot:

  • Waiting for and receiving input
  • Putting credentials together with input
  • Reading and writing databases and files
  • Multiplexing results to more than one other process
  • Everything else I’m forgetting
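One way to picture that separation is a minimal Python sketch. Everything named here (`Order`, `Invoice`, `price_order`, `run`) is hypothetical, invented for illustration rather than taken from any framework. The transformation receives all of its data as arguments and returns only a result, while a thin outer function owns the reading and writing.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Order:
    # Hypothetical input record, invented for this sketch.
    item: str
    quantity: int
    unit_price_cents: int


@dataclass(frozen=True)
class Invoice:
    # Hypothetical output record, invented for this sketch.
    item: str
    total_cents: int


def price_order(order: Order, discount_percent: int) -> Invoice:
    """Pure transformation: everything it needs arrives as arguments,
    and the only thing it produces is the returned Invoice."""
    gross = order.quantity * order.unit_price_cents
    total = gross * (100 - discount_percent) // 100
    return Invoice(item=order.item, total_cents=total)


def run(read_orders: Callable[[], Iterable[Order]],
        write_invoice: Callable[[Invoice], None],
        discount_percent: int) -> None:
    """Side-effecting shell: input is read and output is written here,
    and nowhere else."""
    for order in read_orders():
        write_invoice(price_order(order, discount_percent))
```

Because `price_order` touches nothing outside its arguments, a test can call it directly with no setup. The shell can be exercised with stand-ins, for example `run(lambda: [Order("a", 1, 100)], results.append, 0)` with `results` as a plain list, so no database or file is needed.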

This is a huge step away from the MVC implementation I’ve grown used to in Rails, but I suspect that it will let me test components more thoroughly. Perhaps it has less use in my JavaScript, which tends to be a pure response to user events, generating side effects that drive the interface.

Though it’s different, it’s not hard to imagine how this works in a synchronous environment, in which errors are (fairly) easily transmitted back to the originator of the data that didn’t work. I need to spend some time playing with asynchronous approaches, distributed and local, to see what it might feel like in those less tightly wound environments.
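The synchronous case might look something like the following sketch, again in Python and again under assumptions: `run_pipeline`, `parse_quantity`, and `add_total` are hypothetical names, and reporting failure as an `("error", message)` tuple is just one convention among several. The caller that supplied the data gets the failure back directly, along with the name of the step that rejected it.

```python
from typing import Callable, List, Tuple, Union

# A step takes a dict of data and returns a new dict (hypothetical convention).
Step = Callable[[dict], dict]


def run_pipeline(data: dict, steps: List[Step]) -> Tuple[str, Union[dict, str]]:
    """Apply each step in order. On failure, stop and hand the error
    straight back to the caller that supplied the data."""
    current = data
    for step in steps:
        try:
            current = step(current)
        except ValueError as err:
            return ("error", f"{step.__name__}: {err}")
    return ("ok", current)


def parse_quantity(data: dict) -> dict:
    """Turn the textual quantity into an integer, or reject the data."""
    raw = data["quantity"]
    if not raw.isdigit():
        raise ValueError(f"quantity is not a number: {raw!r}")
    return {**data, "quantity": int(raw)}


def add_total(data: dict) -> dict:
    """Add a total computed only from fields already present in the data."""
    return {**data, "total": data["quantity"] * data["unit_price"]}
```

Each step builds a new dict rather than modifying its input, so the originator's data is unchanged whether the pipeline succeeds or fails. In an asynchronous or distributed setting there may be no waiting caller to return to, which is the part this sketch does not address.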

Time to write some code!

