Andy Oram

Being online: Your identity online--getting down to basics

by @praxagora  | +Andy Oram  | Comments: 3  | 20 December 2009

What men daily do, not knowing what they do!

(This post is the third in a series called "Being online: identity, anonymity, and all things in between.")

Previous posts in this series explored the various identities that track you in real life. Now we can look at the traits that constitute your identity online. A little case study may show how fluid these are.

One day I drove from the Boston area a hundred miles west and logged into the wireless network provided by an Amherst coffee shop in Western Massachusetts. I visited the Yahoo! home page and noticed that I was being served news headlines from my home town. This was a bit disconcerting because I had a Yahoo! account but I wasn't logged into it. Clearly, Yahoo! still knew quite a bit about me, thanks to a cookie it had placed on my browser from previous visits.

[A cookie, in generic computer jargon, is a small piece of data that a program leaves on a system as a marker. The cookie has a special meaning that only the program understands, and can be retrieved later by the program to recall what was done earlier on the system. Browsers allow web sites to leave cookies, and preserve security by serving each cookie only to the web site that left it (we'll see in a later section how this limitation can be subverted by data gatherers).]
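The scoping rule in the note above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any real browser stores cookies: the `ToyBrowser` class, the host names, and the cookie value are all invented for illustration, and real browsers apply further rules (paths, expiry dates, subdomains).

```python
class ToyBrowser:
    """Toy model of a browser's cookie store (hypothetical, for illustration)."""

    def __init__(self):
        # Maps a site's host name to the cookies that site has left.
        self._jar = {}

    def store(self, host, name, value):
        """Record a cookie left by `host`."""
        self._jar.setdefault(host, {})[name] = value

    def cookies_for(self, host):
        """Return only the cookies that `host` itself left earlier."""
        return dict(self._jar.get(host, {}))

    def clear(self, host):
        """Forget everything `host` left, as when a user deletes its cookies."""
        self._jar.pop(host, None)


browser = ToyBrowser()
browser.store("portal.example", "hometown", "example-town")  # made-up values

print(browser.cookies_for("portal.example"))  # the site that left the cookie gets it back
print(browser.cookies_for("news.example"))    # a different site gets nothing

browser.clear("portal.example")
print(browser.cookies_for("portal.example"))  # after deletion the site no longer recognizes the visitor
```

Deleting the entry for one host is the toy equivalent of removing a site's cookie through the browser's preferences.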

Among the ads I saw was one for the local newspaper in my town. Technically, it would be possible for Yahoo! to pass my name to the newspaper so it could check whether I was already a subscriber. However, the Yahoo! privacy policy promises not to do this and I'm sure they don't.

As an experiment, I removed the Yahoo! cookie (it's easy to do if you hunt around in your browser's Options or Preferences menu) and revisited the Yahoo! home page. This time, news headlines for Western Massachusetts were displayed. Yahoo! had no idea who I was, but knew I was logging in from an Internet service provider (ISP) in or near Amherst.

What Yahoo! had on me was a minimal Internet identity: an IP address provided by the Internet Protocol. These addresses, which usually appear in human-readable form as four numbers like 150.0.20.1, bear no intrinsic geographic association. But they are handed out in a hierarchical fashion, which allows a pretty good match-up with location. At the top of the address allocation system stand five registries that cover areas the size of continents. These give out huge blocks of addresses to smaller regions, which further subdivide the blocks of addresses and give them out on a smaller and smaller scale, until local organizations get ranges of addresses for their own use.
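To make the hierarchy concrete, here is a minimal Python sketch that uses the standard `ipaddress` module to find which blocks contain an address. The allocation table is entirely hypothetical: the block sizes and holder names are assumptions chosen to match the sample address 150.0.20.1, not real registry data.

```python
import ipaddress

# Hypothetical allocation table: each block is carved out of the one above it.
ALLOCATIONS = [
    ("150.0.0.0/8", "continental registry"),
    ("150.0.0.0/16", "regional ISP"),
    ("150.0.20.0/24", "local organization"),
]


def allocation_chain(address):
    """Return the holders of every block containing `address`, widest block first."""
    ip = ipaddress.ip_address(address)
    matches = [
        (ipaddress.ip_network(block), holder)
        for block, holder in ALLOCATIONS
        if ip in ipaddress.ip_network(block)
    ]
    # A shorter prefix length means a larger block, so sort ascending.
    matches.sort(key=lambda pair: pair[0].prefixlen)
    return [holder for _, holder in matches]


def most_specific_holder(address):
    """Return the holder of the smallest block containing `address`, or None."""
    chain = allocation_chain(address)
    return chain[-1] if chain else None


print(allocation_chain("150.0.20.1"))
print(most_specific_holder("150.0.99.7"))
```

A geolocation service works from a far larger table of the same shape: the smallest block that contains an address points to the organization, and usually the region, that the address was handed to.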

Yahoo! simply had to look up the ISP associated with my particular IP address to determine I was in Western Massachusetts. But the technology is a bit more complicated than that. I was actually associated with three IP addresses--a complexity that shows how the fuzziness of identity on the Internet extends even to the lowest technological levels.

First, when I logged in to the coffee shop's wireless hub, it gave me a randomly chosen IP address that was meaningful only on its own local network. In other words, this IP address could be used only by the hub and anyone logged in to the hub.

The hub used an aged but still vigorous technology known as Network Address Translation to send data from my system out to its ISP. As my traffic emanated from the coffee shop, it bore a new address associated with the coffee shop's wireless hub, not with me personally. All the people in the coffee shop can share a single address, because the hub associates other unique identifiers--port numbers--with our different streams of traffic.

But the ISP treats the coffee shop as the coffee shop treats me. The coffee shop's own address is itself a temporary address that is meaningful to the local network run by the ISP. A second translation occurs to give my traffic an identity associated with the ISP. This third address, finally, is meaningful on a world scale. It is the only one of the three addresses seen by Yahoo!.
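The two translations described above can be modeled with a toy translator that hands out a fresh outside port for each inside stream. This is a sketch under stated assumptions, not a real Network Address Translation implementation: the `Nat` class, the port range, and every address except the sample 150.0.20.1 are made up.

```python
class Nat:
    """Toy port-based address translator (hypothetical, for illustration)."""

    def __init__(self, outside_address, first_port=40000):
        self.outside_address = outside_address
        self._next_port = first_port
        self._table = {}    # (inside address, inside port) -> outside port
        self._reverse = {}  # outside port -> (inside address, inside port)

    def translate_out(self, inside_address, inside_port):
        """Rewrite an outgoing stream so it appears to come from this translator."""
        key = (inside_address, inside_port)
        if key not in self._table:
            self._table[key] = self._next_port
            self._reverse[self._next_port] = key
            self._next_port += 1
        return self.outside_address, self._table[key]

    def translate_in(self, outside_port):
        """Find which inside stream a reply on `outside_port` belongs to."""
        return self._reverse[outside_port]


hub = Nat("10.20.0.7")    # the coffee shop's address on the ISP's local network
isp = Nat("150.0.20.1")   # the address that is meaningful on a world scale

laptop = ("192.168.1.23", 51000)     # first address: meaningful only inside the shop
neighbor = ("192.168.1.24", 51000)   # another customer at the next table

at_isp = hub.translate_out(*laptop)       # second address: the hub's
on_internet = isp.translate_out(*at_isp)  # third address: the only one a web site sees
print(laptop, "->", at_isp, "->", on_internet)
print(neighbor, "->", hub.translate_out(*neighbor))  # same hub address, different port

# Replies retrace the path by looking up the port at each translator.
back_at_hub = isp.translate_in(on_internet[1])
print(hub.translate_in(back_at_hub[1]))
```

The port number is what lets many customers share one outside address, and it is also why a port has to accompany the address when someone later tries to trace a particular stream.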

However, an investigator (hopefully after getting a subpoena) could ask an ISP for the identity of any of its customers, submitting the global IP address and port numbers along with the date and time of access. The coffee shop didn't require any personal information before logging me in and therefore could not fulfill an investigator's request, but a person doing illegal file transfers or other socially disapproved activity from a home or office would be known to the hub system and could therefore be identified--so long as logfiles with this information had not been deleted from the hub.

The combination of IP address, port numbers, and date and time allows the Recording Industry Association of America to catch people who offer copyrighted music without authorization. And this technological mechanism underlies the European Union requirement for ISPs to keep the information they log about customer use, as mentioned in the first section of this article.
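A lookup of this kind amounts to a search over logged records. The sketch below assumes a hypothetical log format (outside address, port, start time, end time, customer); real ISP logs differ, and the entries here are invented.

```python
from datetime import datetime

# Hypothetical log: which customer held which outside address and port, and when.
LOG = [
    ("150.0.20.1", 40000, datetime(2009, 12, 1, 9, 0), datetime(2009, 12, 1, 9, 30), "customer A"),
    ("150.0.20.1", 40001, datetime(2009, 12, 1, 9, 10), datetime(2009, 12, 1, 9, 45), "customer B"),
    ("150.0.20.1", 40000, datetime(2009, 12, 1, 14, 0), datetime(2009, 12, 1, 14, 20), "customer C"),
]


def who_was(address, port, when):
    """Return the customer who held `address` and `port` at time `when`, or None."""
    for log_address, log_port, start, end, customer in LOG:
        if log_address == address and log_port == port and start <= when <= end:
            return customer
    return None  # no matching record, for instance because the log was deleted


print(who_was("150.0.20.1", 40000, datetime(2009, 12, 1, 9, 15)))
print(who_was("150.0.20.1", 40000, datetime(2009, 12, 1, 14, 10)))
print(who_was("150.0.20.1", 40000, datetime(2009, 12, 1, 12, 0)))
```

The same address and port point to different customers at different times, which is why the date and time are an essential part of the request.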

If I want to hide this minimal Internet identity--the IP address--I have to use another Internet account as a proxy. In the case of my visit to Western Massachusetts, I was protected by logging in anonymously to a coffee shop, but in some countries I'd be required to use a credit card to gain access, and therefore to bind all my web surfing to a strong real-world identity. Many European countries require this form of identification, outlawing open wireless networks.

To generalize from my Amherst experiment, the information we provide as we use the Internet is very limited, and can be limited even further through simple measures such as removing cookies (a topic covered further in a later section of this article). But what the Internet still allows can be used in a supple manner to respond instantly with ads and other material--such as the nearest coffee shop or geographically relevant weather reports--that are hopefully of greater value than the corresponding material in print publications we peruse.

This post has explored the use of IP addresses metaphorically, as well as illustratively, to show how our Internet identity is context-sensitive and can change utterly from one setting to another. Usually, we provide more of a handle to the people we communicate with over email, instant messaging, forums, and so forth. Here too we have multiple identities and spend hours collecting each other's handles.

Email, the oldest form of personal online communication, ironically has one of the better hacks for combining identities. Your email accounts can be set up to forward mail, so that mail to the address you kept from your alma mater goes automatically to your work address.

In contrast, you can't use your AIM instant message account to contact someone on MSN, so you need a separate account on each IM service and no one will know they all represent you unless you tell them. Twitter is experimenting with ways to assure users that accounts with well-known names are truly associated with the people after whom they're named.

If IM services all agreed to use XMPP (or some other protocol) you could reduce all your IM accounts to one. And if every social network supported OpenSocial, you could do a lot of networking while maintaining an account on just one service.

A widely adopted protocol called OpenID allows one identity to support another: if you have an account on Yahoo! or Blogger you can use it to back up your assertion of identity on another site that accepts their OpenID tokens. OpenID and related technologies such as Information Card don't validate your existence or authenticate the personal traits you have outside the Internet, but allow the identity you've built up on one site to be transferable.

My next post shows how the minimal elements of online identity have been expanded by advertisers and other companies, who combine the various retrievable polyps of our identity. Following that, we'll see how we ourselves manipulate our identities and forge new ones.

The posts in "Being online: identity, anonymity, and all things in between" are:

  1. Introduction
  2. Being online: Your identity in real life--what people know
  3. Your identity online: getting down to basics (this post)
  4. Your identity to advertisers: it's not all about you
  5. What you say about yourself, or selves
  6. Forged identities and non-identities
  7. Group identities and social network identities
  8. Conclusion: identity narratives

Comments: 3

Brian Kissel [20 December 2009 07:32 PM]

Hopefully when you cover OpenID in more detail, you'll have a look at the 2009 year in review summary at http://openid.net/2009/12/16/openid-2009-year-in-review/

Also, for any of your readers wanting to deploy OpenID (Google, Yahoo, MySpace, AOL, etc.), Facebook, Twitter, and Windows LiveID for registration and login on their websites, they may want to check out http://rpxnow.com

Keith Dennis [21 December 2009 09:56 AM]

If your readers are interested in a low-friction alternative for online identity verification - not a competitor to OpenID but a simple solution to the challenge users of social sites such as Facebook face: how to know (with confidence) who you are really interacting with, they may want to try the free service offered by my company AssertID.

www.assertid.com

Keith

Gracey [ 2 April 2010 01:09 PM]

I myself do not feel too comfortable divulging unknowingly my personal information all over the Internet every time I go online so I use Anonymizer. Call me paranoid but I don't want to take any chances. I'll check out OpenID and AssertID and see how I can utilize these to protect my online identity.