iPhone tracking: The day after

Analysis and criticism came in the wake of our iPhone tracking story.

Update, 4/27/11 — Apple has posted a response to questions raised in this report and others.

By Alasdair Allan and Pete Warden

I don’t think either of us was expecting to see this story strike such a nerve. There’s been some amazing detective work from researchers across the web, so here’s a selection of the most interesting immediate reactions.

Alex Levinson — Right from launch, we had an FAQ pointing to articles by people like Ryan Neal and Paul Courbis who had found this file (consolidated.db) before, but hadn’t understood or been able to communicate its significance. The main reason we went public with this was exactly because it already seemed to be an open secret among people who make their living doing forensic phone analysis, but not among the general public — even pretty geeky people like Alasdair and me. We were freaked out by the implications of this data and how unprotected it was, but most of the forensics community seemed to miss quite how creepy ordinary people would find it.

I do appreciate how frustrating this must be for Alex though, and would like to apologize personally to him that we didn’t include his article among the prior research we cited. Unlike the others, it didn’t show up in web searches or the books we referenced. It also didn’t help that most of the follow-up articles by other people left out the details that we’d tried to make clear about who found it first. We obviously didn’t communicate it as well as we thought we had, which is completely our fault.

My Life According to the iPhone’s Secret Tracking Log — Alexis Madrigal has a far more interesting life than me, judging by his map. I especially like the points from a flight with Jim Fallows somewhere over West Virginia. As he says, this data can be incredibly interesting, and as data geeks we were just as fascinated as he is. I actually look forward to a future where we can use this sort of information, but with the user’s permission.

Apple is not “recording your moves” — Both of us have been following Will Clarke’s blog for a while and we liked this article. It’s good to look skeptically at the accuracy of the data both in space and time. We do disagree about one of the conclusions though: that the points are just the locations of cell towers. That was one of our first thoughts when we saw the data. But the fact that there’s thousands of different points scattered across small areas, all in slightly different places, seems like pretty strong evidence that they’re not just the locations of cell towers. Another way of putting that is that there’s a lot more points than there are towers. There’s also lots of points with the same tower ID code that are in different locations. That all led to our conclusion that it was trying to figure out the device’s position, even if it wasn’t very good at it.
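The reasoning in that paragraph can be sketched as a small check: group the logged points by tower ID and count how many distinct positions each ID carries. This is only an illustration under stated assumptions. The rows, the tower IDs, and the `(tower_id, latitude, longitude)` layout are hypothetical and are not taken from the actual file.

```python
from collections import defaultdict

# Hypothetical sample rows: (tower_id, latitude, longitude).
# Real data would come from the logged file; these values are invented.
rows = [
    ("tower-A", 51.5010, -0.1240),
    ("tower-A", 51.5032, -0.1198),
    ("tower-A", 51.4987, -0.1275),
    ("tower-B", 51.5101, -0.1340),
    ("tower-B", 51.5101, -0.1340),
]


def locations_per_tower(rows, precision=4):
    """Map each tower ID to the set of distinct (rounded) coordinates seen for it."""
    seen = defaultdict(set)
    for tower_id, lat, lon in rows:
        seen[tower_id].add((round(lat, precision), round(lon, precision)))
    return dict(seen)


def towers_with_multiple_locations(rows, precision=4):
    """Return the tower IDs that appear at more than one distinct position."""
    per_tower = locations_per_tower(rows, precision)
    return sorted(t for t, locs in per_tower.items() if len(locs) > 1)


if __name__ == "__main__":
    for tower_id, locs in sorted(locations_per_tower(rows).items()):
        print(tower_id, "distinct positions:", len(locs))
    print("IDs at several positions:", towers_with_multiple_locations(rows))
```

If every point were simply a tower's fixed position, each ID would map to exactly one coordinate pair. IDs that map to several pairs are what suggest the points are position estimates rather than tower locations.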

Until we get a deeper analysis, that’s just a provisional conclusion of course. But getting smart folks like Will to dig into this and correct anything we’ve got wrong is exactly why we open-sourced it. He also picks up on the Las Vegas Anomaly. Multiple people have reported seeing a phantom trip to the city show up, and one theory (other than a lot of lost weekends) is that Apple has an unpacking or testing facility there. Alasdair’s phone that was shipped with iOS4 shows this, whereas my older device that originally had iOS3 doesn’t, which was suggestive. I wonder if Will’s device is a newer one, too?

OpenStreetMap — The application we released relies on this volunteer-run site to render the background map tiles. We ended up tripling their usual load, according to a team member. They actually fired up extra servers to cope, so I made sure to add a link to their donation page from our main site. If you got something out of the application, please do consider giving something to them, or even getting involved. It’s a fantastic team and community. How many other organizations would have responded to heavy usage by a free client by paying for more servers themselves? I even messed up their credit text on the initial version of the application, but they were very understanding about that too.

  • Erech Apayim

    Do you know if anyone is porting the app to MS Windows? Thanks.

  • http://radar.oreilly.com/aallan/index.html Alasdair Allan

    There have actually been a couple of ports to MS Windows that have shown up in the tweet stream over the last couple of days. However, while I’ve got absolutely no reason to suspect that the authors of the apps are doing anything nefarious with the data, I haven’t seen the source code for these ports. I therefore can’t speak to what they’re doing with their users’ data, so I’m not going to point people at them. Sorry about that.

  • blugger

    Did you ever have the thought that not only cell towers transmit signals, but also WIFI towers and local routers do? POI also has a location for sure, so all that is recorded is probably the traffic based on where the device is (including Mr. Average Joe’s wifi router in his home/office/apartment complex).

    And it would make total sense, considering what you do with the phone when you use maps and have WIFI and 3G active at the same time, or when you use any of the other services that require you to be connected to the network.

    Now if this file was encrypted and nobody could see what is inside, the problem would not even exist…maybe every single device has been doing the same for years and we never realized it :)

  • Ian

    I’ve been trying to find out what the spots mean but not sure if I’ll get a reply on Twitter as there is a lot of attention around this. I read the application FAQ, but the best I can find is a reporter’s assumption that bigger and darker means a location has been visited more.

    I don’t live in London and visit only every few months and I think I’ve never been to East London, but by far the biggest spot is London. Birmingham, where I also do not live and have not visited more than once since the tracking started, has the second largest concentration of spots. There are also smaller spots suggesting I’ve visited places that I think I have never been to, definitely not in the past few years at least.

    Thanks for your help.

  • Gabriel Lee

    “most of the forensics community seemed to miss quite how creepy ordinary people would find it.”

    Maybe because you made the FAQ and original story sound SO MUCH MORE CREEPY than it actually is? For what, less than a day worth of “fame”?

    I’m actually ashamed to be from the same country as you.

    I’m also extremely disappointed at O’Reilly (which you seem to represent) at how this was reported. I regularly review books for the academic market, but from now on I’ll just tell ORA not to bother sending me more.

    The quality was already going downhill anyway when they set full sail onwards into the Web 2.0 marketing rubbish.

  • Jim

    Thanks for making a mountain out of a mole hill guys. I hope both you and O’Reilly are pleased with the website traffic you generated. The timing was perfect too – the day of an Apple earnings call. Security sensationalists. I too am done with O’Reilly books.

  • coldbrew

    Apparently, you made some people mad, and they’ve decided you wanted publicity and/or pageviews.

    Please don’t believe most of *us* feel that way. I, and many others I know, appreciate the light you’ve shed on this very important topic. These companies need to be more transparent about the data they are tracking and where they are storing it.

    Keep up the great work. Thanks again.

  • BillD

    It’s difficult to believe that you searched for prior work on this, and didn’t find a book titled “iOS Forensic Analysis”. It suggests that your methods aren’t comprehensive.

    Of course now, after all the media attention, even Googling (iphone+forensics) shows O’Reilly listed or linked in 4 of the 13 results listed on the first page. Including the seminar O’Reilly is producing on this topic for $3500. Good job with the link juice!

    About forensics experts not appreciating “just how creepy normal people find this”. That seemed a bit of a bait, like criticizing a surgeon for not being emotional while working on car accident victims. It’s not their job and indeed gets in the way of performing, any more than it is yours to color the data you are collecting. If you want to draw conclusions, carry them through to the end. Is it creepy that companies store location data? Or is it REALLY creepy that they turn over that data to authorities at a simple request and without warrant, and have in fact built admin front ends to do so for direct authorities access?

    Most people are aware that every credit card purchase they make, every cell phone they use, every Google Earth picture they look at involves sharing of “data”. Everyone seems to be comfortable with that. It helps the good guys track down terrorists and pedophiles and drug cartels, after all. Understandably, people start to get nervous when they see themselves in that lineup. (That was why they were told that they shouldn’t allow the government to have the power at all back when the Patriot Act was introduced, that it would be expanded and that they would lie about it like they always do. It didn’t bother them then. Oh well. )

    Since we’re on the topic of creepy collecting of data, can I presume that you will not be storing the IP address (traceable) of the machine that I typed this comment from? Just in case the DOJ requests your traffic records?

  • ath0

    Sure they made some people mad, by publishing what amounted to near lies just to get their agenda out.

    So you couldn’t find cache files in Androids? Looks like you removed that from the page now.

    You couldn’t understand that a file inside a directory called Cache can very well be a CACHE file and not part of some devious plan? Where’s the technical discussion of that?

    This isn’t research, it’s weekend curiosity or popular journalism at best.

    Security and privacy research should be left to those who can have a rational discussion about it, not these two who probably never even picked up a security book in their lives.

    Otherwise we’ll just end up with a situation like the TSA paranoia.

  • http://HockeyBias.com HockeyBias

    It will be interesting to see how Apple finally speaks to this.

  • Tom

    If you had a huge ‘cloud’ of memory, and saved all your digital Sat data…+ your iphone file and presto there you are! And we paid for it! Smile all…

  • Joliy

    It seems pretty obvious that a location-using device needs to know its location, and quickly too, so that e.g. the camera application doesn’t have to wait for GPS satellite acquisition. Anyone who has used a hand-held GPS device will know that it can take minutes for a first acquisition, especially in built-up areas where the sky is not clear (or it may not happen at all inside a building).

    Lots of posts going back to 2007 at least discuss triangulation of phone towers when GPS is unreliable or to increase the speed of a location fix. Obviously this data might be useful in the future when you return to that location, so why not store the correlations of GPS and triangulation for later.

    I agree that this data ought to be secured and time-limited, after all who wants to know where it was packed, tested, etc. but then who might use this?
    Apple offer the “Find My iphone” app which claims to find the phone when it’s lost and tell you where it might be. The easy way to do that is to ask the iphone for its last location in its database! Of course you don’t HAVE to lose the phone to use this…

    Expect to be located.

    Now – how about someone out there doing an iphone app to display this data on the iphone? Or is that too silly?

  • http://www.willclarke.net Will Clarke

    That’s interesting about entries with the same tower ID and different locations. Have you looked at the horizontal accuracy of those points? I’ll do some more analysis of the data tonight. Meanwhile, take a look at these:
    http://www.willclarke.net/?p=278
    http://www.willclarke.net/?p=309

    About the Las Vegas anomaly – I use an iPhone 4 that I got on launch day, which was June 24. The timestamp on the Las Vegas data is June 25. Then again it is the first set of data written to the log since June 17th (I was using the GM of iOS 4 on my 3GS at the time). The next entry in my log isn’t until July 29th. Which means I ran around DC for five days with a brand new iPhone 4 and it didn’t log any data at all.

    To people attacking Pete and Alasdair – they are pointing out some pretty serious security implications of this consolidated.db file. While they and I disagree about what the data in the file is supposed to be, we all agree that it can be used to determine what neighborhood you are in near a certain time.

    Yes, forensics people knew about this a long time ago, but they weren’t talking about it because they were too busy trying to exploit the data. They hate that this story has gotten out (and that there is now an open source program to browse it) because they were making bank with their own proprietary software. So keep that in mind.

  • LP

    How can iPhone obtain cell position without using a gsm data or wifi connection to perform a location lookup? I have such data from a recent trip when I had data roaming off.

  • jhn

    Apple is, indeed, recording your moves. This has been documented, and it will be interesting to see if the various “debunkers” update their stories in light of this evidence. Will Clarke, for instance, has mentioned in passing that he is now aware that Apple collects location data, without admitting he was wrong to begin with.

    (The fact that data has been “anonymized” is meaningless. Anonymized data sets can generally be reverse-engineered back to individual users. In any event, I would say that an anonymous data set that showed an iPhone in my house every night and at my job every day is probably somehow connected to me.)

  • Tom

    Will Clarke, are you suggesting that there is a ‘Dome of Silence’ for DC?

  • John Dingler

    Do Nokia’s and now Microsoft’s mapping features depend on location tracking, that is, the tracking of where the phone is currently located, thus also tracking the person, and perhaps recording the person’s private info?

    If true, then Nokia and Microsoft must have paid off American legislators and perhaps convinced them to go after Apple only, not themselves.

  • ath0

    @Will Clarke

    So first you confirm there is a “Las Vegas anomaly” but then go on to say: “we all agree that it can be used to determine what neighborhood you are in near a certain time.”

    I’m sorry but if my phone is recording locations in Vegas (and others, I’m in Europe and see error locations in France) where I wasn’t at all this means the info can’t be trusted to determine the neighborhoods I’m in.

    So no I don’t agree with that statement. It just means it’s possible, not certain, you were in that area at some time, which is a completely different thing.

  • Franz

    @ath0

    While errors make it hard to ask “Where were you at 3PM?”, it’s really unlikely that outliers will yield a false positive when you ask, “Were you at 123 Sesame St. at 3PM?”

    Absolutely disagree with you when you say that security and privacy research should be left to the pros. You’re wrong. Security and privacy research are the responsibility of anyone living in a free society. Previous researchers were clearly incapable of broadcasting their findings to consumers, the group most affected by this problem.

    “You couldn’t understand that a file inside a directory called Cache can very well be a CACHE file and not part of some devious plan? Where’s the technical discussion of that?”

    No technical discussion necessary. I do not want, nor do I expect the location-aware device that I carry to maintain an extensive history of my whereabouts. Tens of thousands of records! That’s not a cache, it’s a log. There exists no reasonable explanation.
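The distinction drawn above between "Where were you at 3PM?" and "Were you at this place at 3PM?" can be made concrete with a targeted query. The sketch below is an illustration only: the points, the timestamps, the `(timestamp, latitude, longitude)` layout, and the radius and time-window defaults are all hypothetical.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def was_near(points, lat, lon, when, radius_km=1.0, window_s=1800):
    """True if any logged point lies within radius_km of (lat, lon)
    and within window_s seconds of the time `when`."""
    return any(
        abs(ts - when) <= window_s and haversine_km(p_lat, p_lon, lat, lon) <= radius_km
        for ts, p_lat, p_lon in points
    )


# Hypothetical log: (unix-style timestamp, latitude, longitude).
# Two points in Washington, DC plus one stray point in Las Vegas.
points = [
    (1000, 38.9010, -77.0310),
    (1600, 38.8995, -77.0290),
    (1300, 36.1700, -115.1400),  # outlier the user never visited
]

if __name__ == "__main__":
    print(was_near(points, 38.9000, -77.0300, when=1200))  # DC
    print(was_near(points, 41.8800, -87.6300, when=1200))  # Chicago
```

A stray point only produces a wrong "yes" when the question happens to be about the stray location itself. A question about any other place is unaffected by it.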

  • ath0

    @Franz
    You are the perfect illustration of why this research should not be discussed by people who don’t understand it fully, or at least the basics.

    First no, based on this data it’s not even possible to talk about precise locations. So you’ll never be able to answer either way if you were at 123 Sesame St. based on this data. FYI, a single GSM cell tower can cover up to 50 miles.

    So it’s nonsense to even suggest one can ask that.

    On your second point, as soon as you have to cache data – and in this case the device has to in order to provide quick location and not waste the battery doing it – the question of how long to cache appears.

    For the purpose of providing this location service, the longer the cache the better: it means the device has to contact Apple much less often and saves battery too.

    Question: is old location data that much more of a problem than recent data? Let’s have an example:

    Imagine I’ve gone to the gym every morning for the past 3 months. In Apple’s file this may record the cell towers near the gym and with a timestamp weeks or months ago, because the data doesn’t get refreshed very often.

    If the cache was made shorter (say limited to 24 hours) it would possibly appear as an entry showing I was there today.

    Is this actually any better?

    Setting the cache to very small times, e.g. 1 hour, would make the cache useless. It might as well not exist. (And going back to my original claim, it would waste battery and contact Apple too often.)

    It’s exactly this discussion that has to be made in proper research, and which the authors, or people unaware of the details like you, completely missed.

  • ipodder

    Hello.

    You might want to know that any iPod Touch is doing the very same thing.
    Locations are stored in the same database, different table ‘WifiLocations’.

    Adds a couple’o’million records I’d guess.

    Also relevant to iPhones somehow ?
    I’d love to see a diff: Cell- vs. Wifi-location / same phone.

  • John Thomas

    Ok well how about a web based version so the many non-mac owners i.e. 85% of the IT market can also see whats going on? Was soo excited until i saw it was osx only, wtf is with that? hipster douches use macs. Geeks use either Nix, Unix, and MS O/S’es.

  • John Thomas Allcock

    “hipster douches use macs. Geeks use either Nix, Unix, and MS O/S’es.”

    You seem to be unaware that Mac is Unix. Sigh, beginners.

  • Gobbeldee Gooke

    Curious. This hits on earnings day for AAPL. Cue the Gizmodo halfwits on MSNBC giving jailbreak instructions to a clueless news anchor and the viewing public. Senators lose their minds and demand answers to questions that were already answered a year ago. Apple haters start foaming at the mouth and giggling like hyenas. Meanwhile, Apple is silent, having detailed this system in July for Congress. Most people click the links and stop channel surfing for a fleeting moment because they’ve seen the words Apple and iPhone but quickly move on having seen nothing exciting or nearly as creepy as the headline suggests. Life goes on. The Apple machine moves ahead, vacuuming up some more billions. People still line up for iPad 2. O’Reilly moves up in the Googles. Apple competitors slowly trudge back to the task of finding a viable competitive angle.

    I’m quite stunned that an org as respected as O’Reilly would go this route. I’ve enjoyed many books and paid dearly for conferences and will probably continue to do so. I’ve just lost a ton of trust and respect for what I thought was a trustworthy group of professionals. This was something a blogger would do and I can’t help but to stop and consider all of the coincidences in the timing and the media push that followed. Paranoia, I guess. That would just be creepy.

  • Franz

    @ath0

    I think you’re branding my disagreement as a lack of understanding.

    I was directly addressing your argument that data collection errors make location data completely untrustworthy. They don’t–errors change the kinds of questions one can answer. (A street address was indeed too fine a point to use as an example; a city block or sporting complex would’ve been better.)

    You’re now suggesting that the data collected is just cell tower location info, but that was called into question in the article itself:

    “We do disagree about one of the conclusions though: that the points are just the locations of cell towers. That was one of our first thoughts when we saw the data. But the fact that there’s thousands of different points scattered across small areas, all in slightly different places, seems like pretty strong evidence that they’re not just the locations of cell towers. … There’s also lots of points with the same tower ID code that are in different locations. That all led to our conclusion that it was trying to figure out the device’s position, even if it wasn’t very good at it.”

    From a device location privacy standpoint, the age of location data is less important than the quantity. A long capture period is going to paint a more complete picture of the user’s activity and afford less privacy. So yes, reducing a device location cache’s age by several hundred times would be much better for consumer privacy.

    From a cell tower cache standpoint, storing a very long history of infrequently used towers provides a tiny chance of user experience improvement. Competitor Google decided on a rolling cache of the last 50 cell towers as part of Android’s opt-in coarse location services (source excerpt at https://github.com/packetlss/android-locdump). I’d argue that this is far, far shorter than Apple’s cache in the best case and the same length in the worst case. Given that Android stands up technically against its competitors, I’d also say that a shorter cache strategy still gets the job done.

    I don’t think my level of understanding is the problem here.
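A rolling cache of the kind described in the comment above can be sketched as a bounded, ordered map that drops the oldest tower whenever a new one arrives. This is a generic illustration and makes no claim to match Android's or Apple's code. The class name, its methods, and the stored fields are hypothetical, and the default size of 50 is simply the figure quoted in the comment.

```python
from collections import OrderedDict


class RollingTowerCache:
    """Keep only the most recently seen cell towers, up to a fixed count."""

    def __init__(self, max_entries=50):
        self.max_entries = max_entries
        self._entries = OrderedDict()  # tower_id -> (lat, lon), oldest first

    def record(self, tower_id, lat, lon):
        # A tower seen again moves to the newest position.
        self._entries.pop(tower_id, None)
        self._entries[tower_id] = (lat, lon)
        # Evict the oldest entries once the cache is over its limit.
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)

    def lookup(self, tower_id):
        """Return the cached (lat, lon) for a tower, or None if it was evicted."""
        return self._entries.get(tower_id)

    def __len__(self):
        return len(self._entries)


if __name__ == "__main__":
    cache = RollingTowerCache(max_entries=3)
    for i in range(5):
        cache.record(f"t{i}", float(i), float(i))
    print(len(cache), cache.lookup("t0"), cache.lookup("t4"))
```

With a fixed entry count, the amount of history the cache can reveal is bounded no matter how long the device has been in use, which is the privacy trade-off the comment argues for.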

  • http://shaunmackey.com/articles/mlsp/what-is-mlsp-my-lead-system-pro-review/ mlsp

    I’m sure iOS 5 will have retina scanning and built in RFID implant technology. One step closer (to 1984, that is).