The principle of indirection

We all need to know the difference between pass by value and pass by reference.

Programmers learn, early on, that there’s a difference between values stored in memory and pointers to (or references to, or addresses of) values stored in memory. The key distinction emerges when these things move around within programs, and it is captured by a pair of phrases: pass by value and pass by reference.

Suppose the value stored in some memory location represents the number 6. In a pass-by-value regime, 6 is copied from one part of the program to another part of the program. If the value stored in the original memory location then becomes 8, those parts of the program that got copies of 6 still represent 6. The 6 was “passed by value.”

In a pass-by-reference regime, though, what’s copied from one part of the program to another isn’t the value, 6, but rather a reference, or pointer, to the memory location where 6 is stored. In this case, if the value stored in that memory location becomes 8, the parts of the program that received references to 6’s memory location now represent 8 too. The 6 was “passed by reference” and, by way of that reference, has become 8.
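
For readers who think in code, here is a minimal Python sketch of the two regimes. The one-slot list named cell is my stand-in for a memory location, and the names copied_value and shared_ref are illustrative. Python's own calling convention is subtler than either regime, so treat this as a model of the idea and not a statement about the language.

```python
# A one-slot list stands in for a memory location that holds the number 6.
cell = [6]

# Pass by value: the number itself is copied to another part of the program.
copied_value = cell[0]

# Pass by reference: only a reference to the location is handed over.
shared_ref = cell

# The value stored in the original location becomes 8.
cell[0] = 8

print(copied_value)   # 6: the copy still represents the old value
print(shared_ref[0])  # 8: the reference sees the change
```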

It used to be that nobody except programmers had to appreciate this subtle distinction. But along came the web, and now everybody does. Why? Another name for a pointer, or a reference, or an address, is a hyperlink. We use hyperlinks every day. But most people don’t use them as well as they could, because most people don’t see “pass by value” and “pass by reference” at work in our everyday online discourse.

Here’s a quiz that looks easy, and should be, but turns out to be quite hard for most people. Somebody asks you: “What information do you have about topic X?” It’s a multiple-choice quiz. There are two ways to answer:

1. Make a list of things, and send a copy of the list.

2. Make a list of things, and send a reference to the list.

Most people choose 1 — that is, pass by value. The value they send, in this case, is a list of things we can describe using words, phrases, sentences, paragraphs, URLs. The list can be printed on paper and delivered by hand. Or it can be typed and sent as an email or text message. Either way, what’s sent is a copy of the list. The original list remains in situ. When it changes, over time, those changes don’t propagate through the network of copies.

The minority who choose 2 — that is, to pass by reference — achieve the same two goals as do the pass-by-value majority. One goal is to send a social signal: “Here is information I want to give you.” The other is to convey the actual information. You get these same two effects no matter whether you pass by value or pass by reference.

When I send you a link to the list, though, instead of a copy of the list, I connect you to a live list that provides four extra benefits:

1. I am the authoritative source for the list. It lives at a location in memory (that is, at a URL in the cloud) that’s under my control, and is bound to my identity.

2. The list is always up-to-date. When I add items, you (and everyone else) will see a freshly-updated list when you follow the link I sent you.

3. The list is social. If other people cite my link, I can find their citations and connect with them.

4. The list is collaborative. Suppose you want to extend my list. In a pass-by-value world, the best you can do is add to the copy I sent you. I won’t see what you’ve added, and neither will anybody else. In a pass-by-reference world, though, we can both keep our own lists, publish references to them, and then produce a merged list by combining the referents.
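
The second and fourth benefits can be sketched in a few lines of Python. This is only a model under stated assumptions: the dictionary named web stands in for the real web, the example.org URLs are hypothetical, and fetch and merged are helper names I made up for the illustration.

```python
# A stand-in for the web: hypothetical URLs mapped to the live lists published there.
web = {
    "https://example.org/jon/topic-x": ["item A", "item B"],
    "https://example.org/you/topic-x": ["item C"],
}

def fetch(url):
    """Dereference a link: return the list as it stands right now."""
    return web[url]

def merged(urls):
    """Combine the referents of several links, keeping order and dropping duplicates."""
    seen, result = set(), []
    for url in urls:
        for item in fetch(url):
            if item not in seen:
                seen.add(item)
                result.append(item)
    return result

my_link = "https://example.org/jon/topic-x"
your_link = "https://example.org/you/topic-x"

# Pass by value: a snapshot of my list, as it would travel in an email.
emailed_copy = list(fetch(my_link))

# Later, I add an item to the original.
web[my_link].append("item D")

print(emailed_copy)                  # the snapshot does not include "item D"
print(merged([my_link, your_link]))  # the merged view does
```

Neither of us has to give up our own list to produce the merged view. It is computed from the references each time somebody asks for it.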

(Of course there’s no free lunch. If you depend on the link and it fails, we’re out of luck. This week’s companion piece at answers.oreilly.com explores one way to handle transient failure.)

The fourth benefit, the collaborative one, is rather abstract. So let’s nail it down to a common real-world scenario. Suppose you’re running a newspaper, or a hyperlocal website, or some other nexus for community information. And suppose I am a source for that information. Almost always, as things stand today, you’ll ask me to pass information to you by value. If I’m promoting a council meeting, or a church supper, or a riverside cleanup, or an open mic night, you’ll expect me to inform you about my event’s date, time, and description by sending you an email, or by visiting your website and typing the data into a form. Either way, it boils down to: “Give me a copy of your information.”

Before 1994 there was no alternative. My original, whether it was a piece of paper in my drawer or a file on my computer’s hard drive, wasn’t immediately available to you. It could only be passed by value. Since 1994 we’ve had an exciting new option, albeit one the world mostly hasn’t yet caught up to. Now the original can reside on the web, at a permanent and well-known address within its vast memory. And it can be passed by reference.

So providers of information about community events — the city government, the church, the environmental group, the musicians — can post references to information about their events. Those references can appear wherever the providers choose to establish their online identities: on conventional websites, on blogs, on Twitter, on Facebook. Purveyors of that information — newspapers, hyperlocal websites, other nexuses — can use those references to create views that join many sources, from many perspectives, for many purposes.

That’s still a notch too abstract, so let’s make it even more concrete. City governments provide calendars of council and committee meetings. Local newspapers purvey those calendars. Citizens use them. In the prevailing pass-by-value model, the city gives copies of its event information to the newspaper, which in turn makes copies to give to citizens, who in turn may need to make more copies to pass around. Where’s the original? In a document on a computer at city hall.

In a pass-by-reference world, the original resides in the cloud at a unique URL. That URL refers to a list of events. And each item on the list — each event on the calendar — has its own URL. The city publishes its calendar on its own website, in HTML, so citizens can read it there. But instead of giving the newspaper copies of event information, it gives the newspaper a link to the calendar’s feed. The newspaper, by subscribing to the link, ensures that the information it receives from the city is as timely, accurate, and complete as the city cares to make it. Of course the newspaper still has to make copies for its print version. But online, along with the subset of facts about each event that it chooses to relay, it provides the event’s URL. Citizens can click through the event URL to see the whole description, and to check for updates. Citizens can also subscribe directly to the city’s calendar URL, and thus merge its stream of civic event data with their own streams of personal event data.
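
Here is a rough Python sketch of that arrangement. Everything specific in it is an assumption made for illustration: the Event fields, the city.example.gov URLs, the dates, and the helper names newspaper_listing and merged_calendar. A real implementation would fetch and parse the feed from the calendar's URL; here the feeds are plain in-memory lists.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class Event:
    url: str           # each event on the calendar has its own address
    title: str
    start: datetime
    description: str

# Hypothetical contents of the city's calendar feed.
city_feed = [
    Event("https://city.example.gov/events/1", "Council meeting",
          datetime(2010, 9, 14, 19, 0), "Full agenda, as maintained by the city."),
    Event("https://city.example.gov/events/2", "Planning committee",
          datetime(2010, 9, 21, 18, 30), "Committee agenda, as maintained by the city."),
]

# Hypothetical personal calendar belonging to a citizen.
personal_feed = [
    Event("https://example.org/me/events/1", "Dentist",
          datetime(2010, 9, 16, 9, 0), "Checkup."),
]

def newspaper_listing(feed):
    """Relay a subset of facts about each event, plus the event's own URL."""
    return [{"title": e.title, "start": e.start, "url": e.url} for e in feed]

def merged_calendar(*feeds):
    """Merge several event streams into one, ordered by start time."""
    return sorted((e for feed in feeds for e in feed), key=lambda e: e.start)

for row in newspaper_listing(city_feed):
    print(row["start"], row["title"], row["url"])

for event in merged_calendar(city_feed, personal_feed):
    print(event.start, event.title)
```

The newspaper's listing omits the description on purpose. The event URL it carries is the reference that lets a reader click through to the city's authoritative, current version.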

I’ve yet to convince a local newspaper to adopt this model. It could be that they fear disintermediation. After all, if citizens can subscribe directly to calendar feeds, why will they need the newspaper to tell them about what’s going on? But I don’t think that’s the real problem. There will always be community attention hubs. Newspapers, or whatever they evolve into, will continue to occupy that niche. In their role as purveyors of community information, though, pass-by-value makes them less effective than pass-by-reference could.

The real problem, I think, is that if you’re a newspaper editor, or a city official, or a citizen, pass-by-reference just isn’t part of your mental toolkit. We teach the principle of indirection to programmers. But until recently there was no obvious need to teach it to everybody else, so we don’t.

I’ve noticed that educators do, nowadays, talk a lot about systems thinking and digital literacy and 21st-century skills. Good! Now let’s codify what we mean. Networks of people and data are governed by principles as basic as the commutative law of addition and multiplication. Indirection is one of those principles. Others include pub/sub syndication, universal naming, and data structure. First we need to write them down. Then we need to figure out how to teach them.

  • Matt Rose

    Just being pedantic, but Unique URL is redundant, as the “U” in URL stands for Unique. It’s like saying PIN number or NIC card.

  • Dan

    Umm, the U in URL stands for uniform, not unique.

  • Alex Tolley

    There are other reasons that could be relevant. If a newspaper massages the city calendar data, it is clearly easier to do that on an ad hoc basis starting with pass by content, than it is to transform that data from the URL. That doesn’t preclude giving the city URL too, but it could explain why content producers don’t do that as the default.

    In another context the analogy is data->model->view. How many programmers go to the trouble of creating the model?

    • http://radar.oreilly.com/jonu Jon Udell

      @Alex: “If a newspaper massages the city calendar data, it is clearly easier to do that on an ad hoc basis starting with pass by content, than it is to transform that data from the URL.”

      Not clear to me. The ease will depend on how the content is structured, not how it is passed. And that’s another principle to explore in the series!

      “How many programmers go to the trouble of creating the model?”

      Great point. Invent from scratch: few. Inherit from a framework: many. I’m suggesting that we build — and teach — a conceptual framework.

  • http://schuetzengasse24.de Vasily

    Thank you very much for the great article. It helps me a lot in my studies.

  • Rob Grondin

    Thanks for these insightful articles, Jon. For quite a while now I have been using URLs linking to documents and information for precisely the reasons you outline in the article, but I had never thought of this as ‘pass by reference’. Thanks for the mind-flip.

    Cheers,

    Rob

  • http://www.linkedin.com/in/robertdavidson Robert Davidson

    Perhaps newspapers and others are concerned about permanence vs. impermanence. If I pass by value, I “own” the information and it only changes when I change it. If I pass by reference, I don’t necessarily have control over change. This also brings up the need, I think, for “snapshots” of the data so I can look at change as a function of time. It’s this recording aspect that is also an important part of what newspapers are for.