<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>O&#039;Reilly Radar &#187; Alasdair Allan</title>
	<atom:link href="http://radar.oreilly.com/aallan/feed" rel="self" type="application/rss+xml" />
	<link>http://radar.oreilly.com</link>
	<description>Insight, analysis, and research about emerging technologies</description>
	<lastBuildDate>Fri, 17 May 2013 16:29:56 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
		<item>
		<title>3D printing from your fingertips</title>
		<link>http://radar.oreilly.com/2013/02/3d-printing-from-your-fingertips.html</link>
		<comments>http://radar.oreilly.com/2013/02/3d-printing-from-your-fingertips.html#comments</comments>
		<pubDate>Thu, 21 Feb 2013 15:00:10 +0000</pubDate>
		<dc:creator>Alasdair Allan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[3d printers]]></category>
		<category><![CDATA[3d printing]]></category>
		<category><![CDATA[@home]]></category>
		<category><![CDATA[emerging technology]]></category>
		<category><![CDATA[hardware startups]]></category>
		<category><![CDATA[Kickstarter]]></category>
		<category><![CDATA[maker]]></category>

		<guid isPermaLink="false">http://radar.oreilly.com/?p=55891</guid>
		<description><![CDATA[The 3Doodler is a 3D printer, but it&#8217;s a pen. This takes 3D printing and turns it on its head. In fact the 3Doodler rejects quite a lot of what most people would consider necessary for it to be called &#8230; ]]></description>
				<content:encoded><![CDATA[<p>The <a title="3Doodler" href="http://www.kickstarter.com/projects/1351910088/3doodler-the-worlds-first-3d-printing-pen">3Doodler</a> is a 3D printer, but it&#8217;s a pen. This takes 3D printing and turns it on its head.</p>
<p><iframe width="620" height="349" src="http://www.youtube.com/embed/DQWyhezIze4?feature=oembed" frameborder="0" allowfullscreen></iframe></p>
<p>In fact the 3Doodler rejects quite a lot of what most people would consider necessary for it to be called a 3D printer. There is no three-axis control. There is no software. You can&#8217;t <a href="http://www.thingiverse.com">download a design</a> and print an object. It strips 3D printing back to basics.</p>
<p>What there is, what it allows you to do, is make things. This is the history of printing going in reverse. It&#8217;s as if Gutenberg&#8217;s press was invented first, and then somebody came along afterwards and invented the fountain pen.<span id="more-55891"></span></p>
<p><iframe width="620" height="349" src="http://www.youtube.com/embed/K3QWRI29qCc?feature=oembed" frameborder="0" allowfullscreen></iframe></p>
<p>While the 3Doodler looks simple, the creators have obviously overcome some serious technological difficulties to get it working. One of the things that&#8217;s hard to do on 3D printers, at least hard to do well, is unsupported structures.</p>
<p>As anyone that owns a 3D printer will tell you, the cooling time for the plastic as it leaves the print head is crucial to allow you to print unsupported structures. Too hot and it doesn&#8217;t work, the structure sags and runs. Too cold and it just plain doesn&#8217;t work at all. From their <a href="http://www.youtube.com/watch?v=DQWyhezIze4">videos</a>, the 3Doodler inventors seem to have cracked the problem. Building a free-standing structure appears to be easy and well within the capabilities of the pen.</p>
<p>It also takes 3mm ABS and PLA as its &#8220;ink,&#8221; the same stuff used by most hobbyist 3D printers. I&#8217;ve got spools of this stuff hanging around my house, which I use in <a href="https://plus.google.com/u/0/117841261693434574785/posts/WB2dLsnmrYx">my own printer</a>. But unlike my printer, which cost just under a thousand dollars, the 3Doodler costs just $75.</p>
<p>It doesn&#8217;t have the same capabilities, but that&#8217;s the difference between a printing press and a pen. It has different capabilities, ones a &#8220;normal&#8221; 3D printer doesn&#8217;t have. It&#8217;s not a cheap alternative, it&#8217;s a different thing entirely.</p>
<p>I&#8217;m currently watching the 3Doodler climb past its first million dollars <a href="http://www.kickstarter.com/projects/1351910088/3doodler-the-worlds-first-3d-printing-pen">on Kickstarter</a>. When I say its &#8220;first&#8221; million I mean that. The project has more than 30 days left on its campaign and already it&#8217;s gone viral. This is the next <a href="http://www.kickstarter.com/projects/597507018/pebble-e-paper-watch-for-iphone-and-android">Pebble</a>. The next Kickstarter success story.</p>
<p>The creators have tapped into a previously untappable market: People who wanted a 3D printer but couldn&#8217;t afford one, and people who see the obvious potential of a fountain pen over a printing press, for both art and engineering.</p>
<p>The guys behind the <a href="http://www.the3doodler.com">3Doodler</a> made $60,000 while I wrote this post. My hat is off to them. It&#8217;s not often someone comes up with an idea this good.</p>
<p>I&#8217;m going to be writing a series of posts on <a href="http://radar.oreilly.com/tag/hardware-startups">hardware startups</a> over the course of the next few months, and rest assured I&#8217;ll come back to the 3Doodler. But not until they can type faster than they can make money.</p>
]]></content:encoded>
			<wfw:commentRss>http://radar.oreilly.com/2013/02/3d-printing-from-your-fingertips.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The inevitability of smart dust</title>
		<link>http://radar.oreilly.com/2013/01/the-inevitability-of-smart-dust.html</link>
		<comments>http://radar.oreilly.com/2013/01/the-inevitability-of-smart-dust.html#comments</comments>
		<pubDate>Tue, 08 Jan 2013 14:00:52 +0000</pubDate>
		<dc:creator>Alasdair Allan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[@home]]></category>
		<category><![CDATA[Future of Technology]]></category>
		<category><![CDATA[Internet of Things]]></category>
		<category><![CDATA[sensor networks]]></category>
		<category><![CDATA[sensors]]></category>
		<category><![CDATA[Smart Dust]]></category>
		<category><![CDATA[ubiquitous computing]]></category>

		<guid isPermaLink="false">http://radar.oreilly.com/?p=54191</guid>
		<description><![CDATA[I&#8217;ve put forward my opinion that desktop computing is dead on more than one occasion, and been soundly put in my place as a result almost every time. &#8220;Of course desktop computing isn&#8217;t dead &#8212; look at the analogy you&#8217;re drawing &#8230; ]]></description>
				<content:encoded><![CDATA[<p><img src="http://s.radar.oreilly.com/wp-files/2/2013/01/0113-dust.jpg" alt="it&#039;s not fog... it&#039;s smoke... by Guilherme Jófili, on Flickr" width="300" height="264" class="alignright size-full wp-image-55030" />I&#8217;ve <a href="http://radar.oreilly.com/2011/05/next-big-thing-web-mobile-data-ubiquitious-computing.html">put forward my opinion that desktop computing is dead</a> on more than one occasion, and been soundly put in my place as a result almost every time. <em>&#8220;Of course desktop computing isn&#8217;t dead &mdash; look at the analogy you&#8217;re drawing between the so called death of the mainframe and the death of the desktop. Mainframes aren&#8217;t dead, there are still plenty of them around!&#8221;</em></p>
<p>Well, yes, that&#8217;s arguable. But most people, everyday people, don&#8217;t know that. It doesn&#8217;t matter if the paradigm survives if it&#8217;s not culturally acknowledged. Mainframe computing lives on, buried behind the scenes, backstage. As a platform it performs well, in its own niche. No doubt desktop computing is destined to live on, but similarly behind the scenes, and it&#8217;s already fading into the background.</p>
<p>The desktop will increasingly belong to niche users. Developers need them, at least for now and for the foreseeable future. But despite the prevalent view in Silicon Valley, the world does not consist of developers. Designers need screen real estate, but buttons and the entire desktop paradigm are a hack; I can foresee the day when the computing designers use will not even vaguely resemble today&#8217;s desktop machines.</p>
<p>For the rest of the world? Computing will almost inevitably diffuse out into our environment. Today&#8217;s mobile devices are transition devices, artifacts of our stage of technology progress. They too will eventually fade into their own niche. Replacement technologies, or rather user interfaces, like Google&#8217;s <a href="https://plus.google.com/+projectglass/posts">Project Glass</a> are already on the horizon, and that&#8217;s just the beginning.</p>
<p>People never wanted computers; they wanted what computers could do for them. Almost inevitably the amount computers can do for us on their own, behind our backs, is increasing. But to do that, they need data, and to get data they need sensors. So the diffusion of general purpose computing out into our environment is inevitable.<span id="more-54191"></span></p>
<p>Everyday objects are already becoming smarter. But in 10 years&#8217; time, every piece of clothing you own, every piece of jewelry, and every thing you carry with you will be measuring, weighing and calculating. In 10 years, the world &mdash; your world &mdash; will be full of sensors.</p>
<p><iframe width="620" height="349" src="http://www.youtube.com/embed/9yv1_ooM-pI?feature=oembed" frameborder="0" allowfullscreen></iframe></p>
<p>The sensors you carry with you may well generate more data every second, both for you and about you, than previous generations did about themselves during the course of their entire lives. We will be surrounded by a cloud of data. While the phrase &#8220;data exhaust&#8221; has already entered the lexicon, we&#8217;re still essentially at the <a href="http://bit.ly/senselab">banging-the-rocks-together stage</a>. You haven&#8217;t seen anything yet &#8230;</p>
<p><iframe width="620" height="349" src="http://www.youtube.com/embed/n4tGWCiaFNs?feature=oembed" frameborder="0" allowfullscreen></iframe></p>
<p>The end point of this evolution is already clear: it&#8217;s called <a href="http://en.wikipedia.org/wiki/Smartdust">smart dust</a>. General purpose computing, sensors, and wireless networking, all bundled up in millimeter-scale sensor motes drifting in the air currents, flecks of computing power, settling on your skin, ingested, will be <a href="http://bits.blogs.nytimes.com/2012/09/07/big-data-in-your-blood/">monitoring you inside and out</a>, sensing and reporting &mdash; both for you and about you.</p>
<p>Almost inevitably the amount of data that this sort of technology will generate will vastly exceed anything that can be filtered, and distilled, into a remote database. The phrase &#8220;data exhaust&#8221; will no longer be a figure of speech; it&#8217;ll be a literal statement. Your data will exist in a cloud, a halo of devices, tasked to provide you with sensor and computing support as you walk along, calculating constantly, consulting with each other, predicting, anticipating your needs. You&#8217;ll be surrounded by a web of distributed sensors and computing.</p>
<p>Makes desktop computing look sort of dull, doesn&#8217;t it?</p>
<p><em>Photo: <a href="http://www.flickr.com/photos/gjofili/3110044273/" title="it's not fog... it's smoke... by Guilherme Jófili, on Flickr">it&#8217;s not fog&#8230; it&#8217;s smoke&#8230; by Guilherme Jófili, on Flickr</a></em></p>
<p><strong>Related:</strong></p>
<ul>
<li> <a href="http://radar.oreilly.com/2012/08/flying-cars-build-future-make-diy.html">They promised us flying cars</a></li>
<li> <a href="http://radar.oreilly.com/2011/05/next-big-thing-web-mobile-data-ubiquitious-computing.html">The next, next big thing</a></li>
<li> <a href="http://radar.oreilly.com/2011/05/arduino-open-hardware-movement.html">The secret is to bang the rocks together</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://radar.oreilly.com/2013/01/the-inevitability-of-smart-dust.html/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Digging into the UDID data</title>
		<link>http://radar.oreilly.com/2012/09/udid-data-analysis.html</link>
		<comments>http://radar.oreilly.com/2012/09/udid-data-analysis.html#comments</comments>
		<pubDate>Thu, 06 Sep 2012 12:54:03 +0000</pubDate>
		<dc:creator>Alasdair Allan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[@editpick]]></category>
		<category><![CDATA[@home]]></category>
		<category><![CDATA[analysis]]></category>
		<category><![CDATA[Big Data]]></category>
		<category><![CDATA[fbi]]></category>
		<category><![CDATA[hackers]]></category>
		<category><![CDATA[iPhone]]></category>
		<category><![CDATA[Leak]]></category>
		<category><![CDATA[statistics]]></category>
		<category><![CDATA[UDID]]></category>

		<guid isPermaLink="false">http://radar.oreilly.com/?p=51919</guid>
		<description><![CDATA[Over the weekend the hacker group Antisec released one million UDID records that they claim to have obtained from an FBI laptop using a Java vulnerability. In reply the FBI stated: The FBI is aware of published reports alleging that &#8230; ]]></description>
				<content:encoded><![CDATA[<p>Over the weekend the hacker group <a href="http://en.wikipedia.org/wiki/Operation_AntiSec">Antisec</a> <a href="http://www.wired.com/threatlevel/2012/09/hackers-release-1-million-apple-device-ids-allegedly-stolen-from-fbi-laptop/">released one million UDID records</a> that they claim to have obtained from an FBI laptop using a Java vulnerability. In reply <a href="http://www.wired.com/threatlevel/2012/09/fbi-says-laptop-wasnt-hacked-never-possessed-file-of-apple-device-ids">the FBI stated</a>:</p>
<blockquote><p>The FBI is aware of published reports alleging that an FBI laptop was compromised and private data regarding Apple UDIDs was exposed. At this time there is no evidence indicating that an FBI laptop was compromised or that the FBI either sought or obtained this data.</p></blockquote>
<p>Of course that statement leaves a lot of leeway. It could be the agent&#8217;s personal laptop, and the data may well have been &#8220;property&#8221; of <a href="http://www.ncfta.net">another agency</a>. The wording doesn&#8217;t even explicitly rule out the possibility that this was an agency laptop; they just say that right now they don&#8217;t have any evidence to suggest that it was.</p>
<p>This limited data release doesn&#8217;t have much impact, but the possible release of the full dataset, which is <a href="http://www.wired.com/threatlevel/2012/09/hackers-release-1-million-apple-device-ids-allegedly-stolen-from-fbi-laptop/">claimed to include</a> names, addresses, phone numbers and other identifying information, is far more worrying.</p>
<p>While there are some almost <a href="http://www.bbc.co.uk/news/technology-19491422">dismissing the issue out of hand</a>, the real issues here are: Where did the data originate? Which devices did it come from and what kind of users does this data represent? Is this data from a cross-section of the population, or a specifically targeted demographic? Does it originate within the law enforcement community, or from an external developer? What was the purpose of the data, and why was it collected?</p>
<p>With conflicting stories from all sides, the only thing we can believe is the data itself. The 40-character strings in the release <a href="http://theiphonewiki.com/wiki/index.php?title=UDID">at least look like UDID</a> numbers, and anecdotally at least <a href="https://twitter.com/peterkruse/status/242936275420717056">we have a third-party confirmation</a> that this really is valid UDID data. We therefore have to proceed at this point as if this is real data. While there is a possibility that some, most, or all of the data is falsified, that&#8217;s looking unlikely from where we&#8217;re standing at the moment.</p>
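<p>As a rough illustration of what &#8220;looks like a UDID&#8221; means, the Python sketch below checks whether a string is 40 characters long and made up only of hexadecimal digits. It assumes the commonly documented 40-hex-digit form of the identifier. The function name and the sample strings are invented for this example, and passing the check says nothing about whether a value belongs to a real device.</p>

```python
import string

HEX_DIGITS = set(string.hexdigits)  # 0-9, a-f, A-F


def looks_like_udid(value: str) -> bool:
    """Return True if value has the 40-hex-character shape assumed above."""
    candidate = value.strip()
    return len(candidate) == 40 and all(ch in HEX_DIGITS for ch in candidate)


# Hypothetical sample values, not taken from the leaked data.
print(looks_like_udid("0123456789abcdef0123456789abcdef01234567"))  # True
print(looks_like_udid("not-a-udid"))  # False
```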
<p><span id="more-51919"></span>With that as the backdrop, the first action I took was to check the released data for my own devices and those of family members. Of the nine iPhones, iPads and iPod Touch devices kicking around my house, none of the UDIDs are in the leaked database. Of course there isn&#8217;t anything to say that they aren&#8217;t amongst the other 11 million UDIDs that haven&#8217;t been released.</p>
<p>With that done, I broke down the distribution of leaked UDID numbers by device type. Interestingly, considering the number of iPhones in circulation compared to the number of iPads, the bulk of the UDIDs were self-identified as originating on an iPad.</p>
<div id="attachment_51920" class="wp-caption aligncenter" style="width: 568px"><a href="http://radar.oreilly.com/2012/09/udid-data-analysis.html/screen-shot-2012-09-05-at-15-29-23" rel="attachment wp-att-51920"><img class="wp-image-51920 " src="http://s.radar.oreilly.com/wp-files/2/2012/09/Screen-Shot-2012-09-05-at-15.29.23-620x341.png" alt="" width="558" height="307" /></a><p class="wp-caption-text">Distribution of UDID by device type</p></div>
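<p>A breakdown like the one charted above could be produced along these lines. This is a minimal sketch: the three-field record layout and the sample rows are hypothetical stand-ins, and the real file layout may differ.</p>

```python
from collections import Counter

# Hypothetical records: (udid, device_name, device_type). Not real leaked data.
records = [
    ("udid-1", "Aaron's iPhone", "iPhone"),
    ("udid-2", "iPad", "iPad"),
    ("udid-3", "Lisa's iPad", "iPad"),
    ("udid-4", "iPod touch", "iPod touch"),
]


def device_type_shares(rows):
    """Return each device type's share of the rows as a fraction of the total."""
    counts = Counter(device_type for _, _, device_type in rows)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {device_type: count / total for device_type, count in counts.items()}


for device_type, share in sorted(device_type_shares(records).items()):
    print(f"{device_type}: {share:.0%}")
```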
<p>What does that mean? Here&#8217;s one theory: If the leak originated from a developer <a href="http://arstechnica.com/apple/2012/09/apple-denies-giving-ios-device-identifier-list-to-fbi/">rather than directly from Apple</a>, and assuming that this subset of data is a good cross-section of the total population, and assuming that the leaked data originated with a single application &#8230; then the app that harvested the data is likely a Universal application (one that runs on both the iPhone and the iPad) that is mostly used on the iPad rather than on the iPhone.</p>
<p>The very low number of iPod Touch users might reflect demographics, namely that the application is not widely used by the younger users who are the target market for the iPod Touch, or alternatively that the application is most useful when a cellular data connection is present.</p>
<p>The next thing to look at, as the only field with unconstrained text, was the Device Name data. That particular field contains a lot of first names, e.g. &#8220;Aaron&#8217;s iPhone,&#8221; so roughly speaking the distribution of first letters in this field should give a decent clue as to the geographical region of origin of the leaked list of UDIDs. This distribution is of course going to be different depending on the predominant language in the region.</p>
<div id="attachment_51951" class="wp-caption aligncenter" style="width: 568px"><a href="http://radar.oreilly.com/2012/09/udid-data-analysis.html/screen-shot-2012-09-05-at-16-41-08-2" rel="attachment wp-att-51951"><img class=" wp-image-51951 " src="http://s.radar.oreilly.com/wp-files/2/2012/09/Screen-Shot-2012-09-05-at-16.41.081-620x319.png" alt="" width="558" height="287" /></a><p class="wp-caption-text">Distribution of UDID by the first letter of the &#8220;Device Name&#8221; field</p></div>
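<p>The first-letter distribution can be sketched in a few lines of Python. The function name and the sample device names are illustrative only. The sketch lowercases the first character of each name and ignores names that start with anything other than a letter, which is one possible choice rather than necessarily the one used for the chart.</p>

```python
from collections import Counter


def first_letter_distribution(names):
    """Return the relative frequency of the first letter of each name, lowercased."""
    letters = [name.strip()[0].lower() for name in names if name.strip()]
    letters = [ch for ch in letters if ch.isalpha()]
    counts = Counter(letters)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {letter: count / total for letter, count in sorted(counts.items())}


# Hypothetical device names, not taken from the leaked data.
sample = ["Aaron's iPhone", "iPad", "iPhone", "John's iPad"]
print(first_letter_distribution(sample))
```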
<p>The immediate stand out from this distribution is the predominance of device name strings starting with the letter &#8220;i.&#8221; This can be ascribed to people who don&#8217;t have their own name prepended to the Device Name string, and have named their device &#8220;iPhone,&#8221; &#8220;iPad&#8221; or &#8220;iPod Touch.&#8221;</p>
<p>The obvious next step was to compare this distribution with the relative frequency of first letters in words in the English language.</p>
<div id="attachment_51956" class="wp-caption aligncenter" style="width: 568px"><a href="http://radar.oreilly.com/2012/09/udid-data-analysis.html/screen-shot-2012-09-05-at-17-43-36" rel="attachment wp-att-51956"><img class=" wp-image-51956 " src="http://s.radar.oreilly.com/wp-files/2/2012/09/Screen-Shot-2012-09-05-at-17.43.36-620x339.png" alt="" width="558" height="305" /></a><p class="wp-caption-text">Comparing the distribution of UDID by first letter of the &#8220;Device Name&#8221; field against the relative frequencies of the first letters of a word in the English language</p></div>
<p>The spike for the letter &#8220;i&#8221; dominated the data, so the next step was to do some rough and ready data cleaning.</p>
<p>I dropped all the Device Name strings that started with the string &#8220;iP.&#8221; That cleaned out all those devices named &#8220;iPhone,&#8221; &#8220;iPad&#8221; and &#8220;iPod Touch.&#8221; Doing that brought the number of device names starting with an &#8220;i&#8221; down from 159,925 to just 13,337. That&#8217;s a bit more reasonable.</p>
<div id="attachment_51955" class="wp-caption aligncenter" style="width: 568px"><a href="http://radar.oreilly.com/2012/09/udid-data-analysis.html/screen-shot-2012-09-05-at-17-27-25" rel="attachment wp-att-51955"><img class=" wp-image-51955 " src="http://s.radar.oreilly.com/wp-files/2/2012/09/Screen-Shot-2012-09-05-at-17.27.25-620x335.png" alt="" width="558" height="302" /></a><p class="wp-caption-text">Comparing the distribution of UDID by first letter of the &#8220;Device Name&#8221; field, ignoring all names that start with the string &#8220;iP,&#8221; against the relative frequencies of the first letters of a word in the English language</p></div>
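<p>The cleaning step and the comparison against a reference distribution might look something like the sketch below. The helper names are made up for illustration, and the reference shares are placeholders rather than real English-language frequencies. A positive difference marks a letter as over-abundant in the device names, a negative one as under-abundant.</p>

```python
from collections import Counter


def drop_generic_names(names, prefix="iP"):
    """Remove device names that start with the prefix (case-sensitive)."""
    return [name for name in names if not name.startswith(prefix)]


def first_letter_shares(names):
    """Return the share of names beginning with each letter, lowercased."""
    counts = Counter(name[0].lower() for name in names if name and name[0].isalpha())
    total = sum(counts.values())
    return {letter: count / total for letter, count in counts.items()} if total else {}


def abundance(observed, reference):
    """Return observed minus reference share per letter; positive means over-abundant."""
    letters = sorted(set(observed) | set(reference))
    return {ch: observed.get(ch, 0.0) - reference.get(ch, 0.0) for ch in letters}


# Hypothetical device names and placeholder reference shares (not real English data).
names = ["iPhone", "iPad", "Aaron's iPhone", "John's iPad", "Tom's iPad"]
reference = {"a": 0.25, "j": 0.25, "t": 0.5}

cleaned = drop_generic_names(names)
for letter, delta in abundance(first_letter_shares(cleaned), reference).items():
    print(f"{letter}: {delta:+.2f}")
```

<p>Note that a prefix filter this blunt also throws away any name that merely begins with &#8220;iP,&#8221; whatever follows it.</p>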
<p>I had a slight over-abundance of &#8220;j,&#8221; although that might not be statistically significant. However, the stand out was that there was a serious under-abundance of strings starting with the letter &#8220;t,&#8221; which is interesting. Additionally, with my earlier data cleaning I also had a slight under-abundance of &#8220;i,&#8221; which suggested I may have been too enthusiastic about cleaning the data.</p>
<p>Looking at the <a href="http://en.wikipedia.org/wiki/Letter_frequency#Relative_frequencies_of_letters_in_other_languages">relative frequency of letters in languages other than English</a> it&#8217;s notable that amongst them Spanish has a much lower frequency of the use of &#8220;t.&#8221;</p>
<p>As the de facto second language of the United States, Spanish is the obvious next choice to investigate. If the devices are predominantly Spanish in origin then <a href="https://twitter.com/markvillacampa/status/243381051639074816">this could solve the problem</a> introduced by our data cleaning. As Marcos Villacampa noted in a <a href="https://twitter.com/markvillacampa/status/243381051639074816">tweet</a>, in Spanish you would say &#8220;iPhone de Mark&#8221; rather than &#8220;Mark&#8217;s iPhone.&#8221;</p>
<div id="attachment_51990" class="wp-caption aligncenter" style="width: 568px"><a href="http://radar.oreilly.com/2012/09/udid-data-analysis.html/screen-shot-2012-09-05-at-20-34-55" rel="attachment wp-att-51990"><img class=" wp-image-51990 " src="http://s.radar.oreilly.com/wp-files/2/2012/09/Screen-Shot-2012-09-05-at-20.34.55-620x339.png" alt="" width="558" height="305" /></a><p class="wp-caption-text">Comparing the distribution of UDID by first letter of the &#8220;Device Name&#8221; field, ignoring all names that start with the string &#8220;iP,&#8221; against the relative frequencies of the first letters of a word in the Spanish language</p></div>
<p>However, that distribution didn&#8217;t really fit either. While &#8220;t&#8221; was much better, I now had an under-abundance of words with an &#8220;e.&#8221; It should be noted, though, that unlike our English language relative frequencies, the data I was using for Spanish is for letters in the entire word, rather than letters that begin the word. That&#8217;s certainly going to introduce biases, perhaps fatal ones.</p>
<p>Not that I can really make the assumption that there is only one language present in the data, or even that one language predominates, unless that language is English.</p>
<p>At this stage it&#8217;s obvious that the data is, at least more or less, of the right order of magnitude. The data probably shows devices coming from a Western country. However, we&#8217;re a long way from the point where I&#8217;d come out and say something like &#8220;&#8230; the device names were predominantly in English.&#8221; That&#8217;s not a conclusion I can make.</p>
<p>I&#8217;d be interested in tracking down the <a href="http://en.wikipedia.org/wiki/Arabic_Letter_Frequency">relative frequency of letters used in Arabic</a> when the language is transcribed into the Roman alphabet. While I haven&#8217;t been able to find that data, I&#8217;m sure it exists somewhere. (Please drop a note in the comments if you have a lead.)</p>
<p>The next step for the analysis is to look at the names themselves. While I&#8217;m still in the process of mashing up something that will access U.S. census data and try and reverse geo-locate a name to a &#8220;most likely&#8221; geographical origin, <a href="http://publicsector.experian.co.uk/Products/Mosaic%20Origins.aspx">such services do already exist</a>. And I haven&#8217;t really pushed the boundaries here, or even started a serious statistical analysis of the subset of data released by Antisec.</p>
<p>This brings us to <a href="http://petewarden.com">Pete Warden&#8217;s</a> point that you <a href="http://strata.oreilly.com/2011/05/anonymize-data-limits.html">can&#8217;t really anonymize your data</a>. The anonymization process for large datasets such as this is simply an illusion. As Pete <a href="http://strata.oreilly.com/2011/05/anonymize-data-limits.html">wrote</a>:</p>
<blockquote><p>Precisely because there are now so many different public datasets to cross-reference, any set of records with a non-trivial amount of information on someone’s actions has a good chance of matching identifiable public records.</p></blockquote>
<p>While this release in itself is fairly harmless, a number of &#8220;harmless&#8221; releases taken together — or cleverly cross-referenced with other public sources such as Twitter, Google+, Facebook and other social media — might well be more damaging. And that&#8217;s ignoring the possibility that Antisec really might have names, addresses and telephone numbers to go side-by-side with these UDID records.</p>
<p>The question has to be asked then, where did this data originate? While 12 million records might seem a lot, compared to the <a href="http://en.wikipedia.org/wiki/IPhone#History_and_availability">number of devices sold</a> it&#8217;s not actually that big a number. There are any number of iPhone applications with a 12-million-user installation base, and this sort of backend database could easily have been built up by an independent developer with a successful application who downloaded the device owner&#8217;s contact details <a href="http://www.cultofmac.com/173128/new-ios-6-privacy-settings-limit-access-to-photos-contact-calendars-and-more/">before Apple started putting limitations</a> on that.</p>
<p>Ignoring conspiracy theories, this dataset might be the result of a single developer. Although how it got into the FBI&#8217;s possession and the why of that, if it was ever there in the first place, is another matter entirely.</p>
<p>I&#8217;m going to go on hacking away at this data to see if there are any more interesting correlations, and I do wonder whether <a href="http://en.wikipedia.org/wiki/Operation_AntiSec">Antisec</a> would consider a controlled release of the data to some trusted third party?</p>
<p>Much like the reaction to <a href="http://radar.oreilly.com/2011/04/apple-location-tracking.html">#locationgate</a>, where some people were <a href="http://crowdflow.net">happy to volunteer their data</a>, if enough users are willing to self-identify, then perhaps we can get to the bottom of where this data originated and why it was collected in the first place.</p>
<p><em>Thanks to <a href="https://twitter.com/hmason">Hilary Mason</a>, <a href="https://twitter.com/jsteeleeditor">Julie Steele</a>, <a href="https://twitter.com/ireneros">Irene Ros</a>, <a href="https://plus.google.com/112357111574249260299/about">Gemma Hobson</a> and <a href="https://twitter.com/MarkVillacampa">Marcos Villacampa</a> for ideas, pointers to comparative data sources, and advice on visualisation of the data.</em></p>
<h2>Update</h2>
<p><em>9/6/12</em></p>
<p>In <a href="https://plus.google.com/117841261693434574785/posts/Ni1AAFn27ZN">response</a> to a post about this article on Google+, <a href="https://plus.google.com/104522326787934519002/posts">Josh Hendrix</a> made the suggestion that I should look at word as well as letter frequency. It was a good idea, so I went ahead and wrote a quick script to do just that&#8230;</p>
<p>The top two words in the list are &#8220;iPad,&#8221; which occurs 445,111 times, and &#8220;iPhone,&#8221; which occurs 252,106 times. The next most frequent word is &#8220;iPod,&#8221; but that occurs only 36,367 times. This result backs up my earlier result looking at distribution by device type.</p>
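<p>The original script isn&#8217;t shown, but a word count of this kind can be sketched roughly as follows. The tokenising rule (runs of letters, with a trailing possessive stripped) and the sample names are assumptions made for this example, not a description of what was actually run.</p>

```python
import re
from collections import Counter

WORD = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?")


def word_counts(names):
    """Count words across device names, dropping a trailing possessive 's."""
    counts = Counter()
    for name in names:
        # Treat curly apostrophes the same as straight ones.
        normalised = name.replace("\u2019", "'")
        for word in WORD.findall(normalised):
            if word.endswith("'s"):
                word = word[:-2]
            counts[word] += 1
    return counts


# Hypothetical device names, not taken from the leaked data.
sample = ["David's iPad", "iPad", "John's iPhone", "David's iPhone"]
for word, count in word_counts(sample).most_common(3):
    print(word, count)
```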
<p>Then there are various misspellings and mis-capitalisations of &#8220;iPhone,&#8221; &#8220;iPad,&#8221; and &#8220;iPod.&#8221;</p>
<p>The first real word that isn&#8217;t an Apple trademark is &#8220;Administrator,&#8221; which occurs 10,910 times. Next are &#8220;David&#8221; (5,822), &#8220;John&#8221; (5,447), and &#8220;Michael&#8221; (5,034). This is followed by &#8220;Chris&#8221; (3,744), &#8220;Mike&#8221; (3,744), &#8220;Mark&#8221; (3,66) and &#8220;Paul&#8221; (3,096).</p>
<p>Looking down the list of real names, as opposed to partial strings and tokens, the first female name doesn&#8217;t occur until we&#8217;re 30 places down the list — it&#8217;s &#8220;Lisa&#8221; (1,732) with the next most popular female name being &#8220;Sarah&#8221; (1,499), in 38th place.</p>
<div id="attachment_52048" class="wp-caption aligncenter" style="width: 568px"><a href="http://radar.oreilly.com/?attachment_id=52048"><img class="size-large wp-image-52048" src="http://s.radar.oreilly.com/wp-files/2/2012/09/Screen-Shot-2012-09-06-at-18.24.43-620x230.png" alt="" width="558" /></a><p class="wp-caption-text">The top 100 names occurring in the UDID list.</p></div>
<p>The word &#8220;Dad&#8221; occurs 1,074 times, with &#8220;Daddy&#8221; occurring 383 times. For comparison the word &#8220;Mum&#8221; occurs just 58 times, and &#8220;Mummy&#8221; just 33. &#8220;Mom&#8221; came in with 150 occurrences, and &#8220;mommy&#8221; with 30. The number of occurrences for &#8220;mum,&#8221; &#8220;mummy,&#8221; &#8220;mom,&#8221; and &#8220;mommy&#8221; combined is 271, which is still very small compared to the combined total of 1,457 for &#8220;dad&#8221; and &#8220;daddy.&#8221;</p>
<p><em>[<strong>Updated:</strong> <a href="https://twitter.com/gyardley/">Greg Yardly</a> wisely <a href="https://twitter.com/gyardley/status/243784351265984513">pointed out on Twitter</a> that I was being a bit English-centric in only looking for the words "mum" and "mummy," which is why I expanded the scope to include "mom" and "mommy."]</em></p>
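<p>The combined totals quoted above are easy to check. The short Python sketch below uses only the counts given in the text; the variable names are arbitrary.</p>

```python
# Counts as quoted in the text above.
maternal = {"Mum": 58, "Mummy": 33, "Mom": 150, "mommy": 30}
paternal = {"Dad": 1074, "Daddy": 383}

maternal_total = sum(maternal.values())
paternal_total = sum(paternal.values())
ratio = paternal_total / maternal_total

print(maternal_total, paternal_total)  # 271 1457
print(f"paternal terms are about {ratio:.1f} times as common")
```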
<p>There is a definite gender bias here, and I can think of at least a few explanations. The most likely is fairly simplistic: The application where the UDID numbers originated either appeals more to, or is used more by, men.</p>
<p>Alternatively, women may be less likely to include their name in the name of their device, perhaps because amongst other things this name is used to advertise the device on wireless networks?</p>
<p>Either way I think this definitively pins it down as a list of devices originating in an Anglo-centric geographic region.</p>
<p>Sometimes the simplest things work better. Instead of being fancy perhaps I should have done this in the first place. However, this, combined with my previous results, suggests that we&#8217;re looking at an English speaking, mostly male, demographic.</p>
<p>Correlating the top 20 or so names with the list of most popular baby names (by year) all the way from the mid-&#8217;60s up until the mid-&#8217;90s (so looking at the most popular names for people between the ages of say 16 and 50) might give a further clue as to the exact demographic involved.</p>
<p>Both <a href="https://plus.google.com/112357111574249260299/posts">Gemma Hobson</a> and <a href="http://radar.oreilly.com/julies">Julie Steele</a> directed me toward the U.S. Social Security Administration&#8217;s <a href="http://www.ssa.gov/oact/babynames/decades/index.html">Popular Baby Names By Decade</a> list. A quick and dirty analysis suggests that the UDID data is dominated by names that were most popular in the &#8217;70s and &#8217;80s. This maps well to my previous suggestion that the lack of iPod Touch usage might suggest that the demographic was older.</p>
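<p>A quick and dirty comparison of this kind can be as simple as counting how many of the top device names show up in each decade&#8217;s popular list. The sketch below assumes exactly that approach; the <code>decade_overlap</code> helper is hypothetical, and the names are placeholders rather than the real SSA lists or UDID results.</p>

```python
def decade_overlap(top_names, names_by_decade):
    """Return, per decade, how many of the top names appear in that decade's list."""
    top = {n.lower() for n in top_names}
    return {
        decade: len(top.intersection(n.lower() for n in popular))
        for decade, popular in names_by_decade.items()
    }

# Placeholder data for illustration only; not real SSA lists or UDID results.
names_by_decade = {
    "1960s": ["NameA", "NameB"],
    "1970s": ["NameB", "NameC", "NameD"],
    "1980s": ["NameC", "NameD", "NameE"],
    "1990s": ["NameE", "NameF"],
}
top_names = ["NameB", "NameC", "NameD"]

overlap = decade_overlap(top_names, names_by_decade)
# The decade sharing the most names with the device list.
best = max(overlap, key=overlap.get)
print(overlap, best)
```

<p>Weighting each name by how often it occurs in the device list, rather than counting it once, would be the obvious next refinement.</p>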
<p>I&#8217;m going to do a <a href="http://www.wolframalpha.com/input/?i=David">year-by-year breakdown</a> and some proper statistics later on, but we&#8217;re looking at an application that&#8217;s probably used by English-speaking males with an Anglo-American background in their 30s or 40s. It&#8217;s most used on the iPad, and although it also works on the iPhone, it&#8217;s used far less on that platform.</p>
<p><em>Thanks to <a href="https://plus.google.com/104522326787934519002/posts">Josh Hendrix</a>, and again to <a href="https://plus.google.com/112357111574249260299/about">Gemma Hobson</a> and <a href="https://twitter.com/jsteeleeditor">Julie Steele</a>, for ideas and pointers to sources for this part of the analysis.</em></p>
<h2>Update</h2>
<p><em>9/11/12</em></p>
<p>A <a href="http://intrepidusgroup.com/insight/2012/09/tracking-udid-src/">really nice analysis</a> from David Schuetz uses the frequency of UDID duplicates and the names of those devices to track down the source of the leak. I really should have thought of that.</p>
<p>Interestingly, however, it does support my own analysis. Yesterday, a Florida publishing company named <a href="http://www.bluetoad.com/BlueToad/">BlueToad</a> said the <a href="http://redtape.nbcnews.com/_news/2012/09/10/13781440-exclusive-the-real-source-of-apple-device-ids-leaked-by-anonymous-last-week">UDID data was taken from their systems</a>. BlueToad makes apps for magazine publishers, hence the predominance of the iPad over the iPhone in my results, as those apps are more often used on the iPad.</p>
<p>Also they seem to mostly market into the U.S., which supports my ethnicity findings, and looking at <a href="http://www.coverstand.com">the list of titles</a> they curate, it does look like my demographics are more-or-less spot on as well. Those look like magazines marketed to men in their 30s and 40s to me.</p>
<p>I&#8217;d actually been really confused about what type of app could possibly have that narrow a demographic, and this sort of clears up my confusion. Nice!</p>
<p><strong>Related:</strong></p>
<ul>
<li><a href="http://radar.oreilly.com/2011/04/apple-location-tracking.html">Got an iPhone or 3G iPad? Apple is recording your moves</a></li>
<li><a href="http://radar.oreilly.com/2011/03/japan-radiation-visualizations.html">Radiation visualizations paint a different picture of Japan</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://radar.oreilly.com/2012/09/udid-data-analysis.html/feed</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
		<item>
		<title>Mining the astronomical literature</title>
		<link>http://radar.oreilly.com/2012/08/data-mining-the-literature.html</link>
		<comments>http://radar.oreilly.com/2012/08/data-mining-the-literature.html#comments</comments>
		<pubDate>Wed, 15 Aug 2012 13:00:12 +0000</pubDate>
		<dc:creator>Alasdair Allan</dc:creator>
				<category><![CDATA[Edu 2.0]]></category>
		<category><![CDATA[@editpick]]></category>
		<category><![CDATA[@home]]></category>
		<category><![CDATA[academia]]></category>
		<category><![CDATA[academic publishing]]></category>
		<category><![CDATA[astronomy]]></category>
		<category><![CDATA[astrophysics]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[data mining]]></category>
		<category><![CDATA[open access]]></category>
		<category><![CDATA[open data]]></category>
		<category><![CDATA[peer review]]></category>
		<category><![CDATA[publishing]]></category>

		<guid isPermaLink="false">http://radar.oreilly.com/?p=50681</guid>
		<description><![CDATA[There is a huge debate right now about making academic literature freely accessible and moving toward open access. But what would be possible if people stopped talking about it and just dug in and got on with it? NASA&#8217;s Astrophysics &#8230; ]]></description>
				<content:encoded><![CDATA[<p>There is a <a href="http://www.guardian.co.uk/science/2012/jun/19/open-access-academic-publishing-finch-report?intcmp=239">huge debate</a> right now about making academic literature freely accessible and moving toward open access. But what would be possible if people stopped talking about it and just dug in and got on with it?</p>
<p><a title="NASA" href="http://nasa.gov">NASA&#8217;s</a> <a title="Astrophysics Data System" href="http://adswww.harvard.edu">Astrophysics Data System</a> (ADS), hosted by the <a title="Smithsonian Astrophysical Observatory" href="http://www.cfa.harvard.edu/sao/">Smithsonian Astrophysical Observatory</a> (SAO), has quietly been working away since the mid-&#8217;90s. Without much, if any, fanfare amongst the other disciplines, it has moved astronomers into a world where access to the literature is just a given. It&#8217;s something they don&#8217;t have to think about all that much.</p>
<p>The <a href="http://adsabs.harvard.edu/abstract_service.html">ADS service</a> provides access to abstracts for virtually all of the astronomical literature. But it also provides access to the full text of more than half a million papers, going right back to the start of peer-reviewed journals in the 1800s. The service has links to online data archives, along with reference and citation information for each of the papers, and it&#8217;s all <a title="searchable" href="http://adsabs.harvard.edu/abstract_service.html">searchable</a> and downloadable.</p>
<div align="center">
<p style="width: 500px; height: auto; padding: 10px; margin: 15px 0 15px 0; border: 1px solid #ddd; font-style: italic; text-align: left;">
<a href="https://dl.dropbox.com/u/2068615/Orbiting%20Frog/publishing-rates.html"><img src="http://s.radar.oreilly.com/wp-files/2/2012/08/0812-1-number-of-papers.png" alt="Number of papers published in the three main astronomy journals each year" width="500" style="margin-bottom:15px;" /></a><br />
<em>Number of papers published in the three main astronomy journals each year. CREDIT: <a href="http://orbitingfrog.com">Robert Simpson</a></em></p>
</div>
<p>The existence of the <a href="http://adsabs.harvard.edu/abstract_service.html">ADS</a>, along with the <a href="http://arxiv.org/">arXiv</a> pre-print server, has meant that most astronomers haven&#8217;t seen the inside of a brick-built library since the late 1990s.</p>
<p>It also makes astronomy almost uniquely well placed for interesting data mining experiments, experiments that hint at what the rest of academia could do if they followed astronomy&#8217;s lead. The fact that the discipline&#8217;s literature has been scanned, archived, indexed and catalogued, and placed behind a RESTful API makes it a treasure trove, both for hypothesis generation and sociological research.</p>
<p><span id="more-50681"></span>For example, the <a title=".Astronomy" href="http://dotastronomy.com">.Astronomy</a> series of conferences is a small workshop that brings together the best and the brightest of the technical community: researchers, developers, educators and communicators. Billed as <em>&#8220;20% time for astronomers,&#8221;</em> it gives these people space to think about how new technologies affect both how research is done and how it is communicated to their peers and to the public.</p>
<p><em>[Disclosure: I'm a member of the advisory board to the .Astronomy conference, and I previously served as a member of the programme organising committee for the conference series.]</em></p>
<p>It should perhaps come as little surprise that one of the more interesting projects to come out of a hack day held as part of this year&#8217;s .Astronomy meeting <a href="http://www.haus-der-astronomie.de/en/">in Heidelberg</a> was work by <a href="http://orbitingfrog.com">Robert Simpson</a>, <a href="http://icg.port.ac.uk/~mastersk/">Karen Masters</a> and <a href="http://sarahaskew.net">Sarah Kendrew</a> that focused on data mining the astronomical literature.</p>
<p>The team <a href="http://orbitingfrog.com/post/27983055767/mining-the-astronomical-literature">grabbed and processed</a>  the titles and abstracts of all the papers from the <a href="http://iopscience.iop.org/0004-637X/">Astrophysical Journal</a> (ApJ), <a href="http://www.aanda.org">Astronomy &amp; Astrophysics</a> (A&amp;A), and the <a href="http://eu.wiley.com/WileyCDA/WileyTitle/productCd-MNR.html">Monthly Notices of the Royal Astronomical Society</a> (MNRAS) since each of those journals started publication &mdash; and that&#8217;s 1827 in the case of MNRAS.</p>
<p>By the end of the day, they&#8217;d found some interesting results showing how various terms have trended over time. The results were  similar to what&#8217;s found in Google Books&#8217; <a href="http://books.google.com/ngrams/graph?content=Astronomy%2C+Astrophysics&amp;year_start=1800&amp;year_end=2000&amp;corpus=0&amp;smoothing=3">Ngram Viewer</a>.</p>
<div align="center">
<p style="width: 500px; height: auto; padding: 10px; margin: 15px 0 15px 0; border: 1px solid #ddd; font-style: italic; text-align: left;"><a href="https://dl.dropbox.com/u/2068615/Orbiting%20Frog/sample-trends.html"><img src="http://s.radar.oreilly.com/wp-files/2/2012/08/0812-3-telescopes.png" alt="The relative popularity of the names of telescopes in the literature" width="500" style="margin-bottom: 15px;" /></a><br />
<em>The relative popularity of the names of telescopes in the literature. Hubble, Chandra and Spitzer seem to have taken turns in hogging the limelight, much as COBE, WMAP and Planck have each contributed to our knowledge of the cosmic microwave background in successive decades. References to Planck are still on the rise. CREDIT: <a href="http://orbitingfrog.com">Robert Simpson</a>.</em></p>
</div>
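<p>A trend line like the ones plotted above can be approximated by working out, for each year, the fraction of papers whose title or abstract mentions a term. The sketch below assumes each paper is a dictionary with <code>year</code>, <code>title</code> and <code>abstract</code> keys; that record layout, the <code>term_trend</code> helper and the sample papers are all hypothetical, and this is not necessarily how the team normalised their results.</p>

```python
from collections import defaultdict

def term_trend(papers, term):
    """Fraction of papers per year whose title or abstract mentions `term`."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    needle = term.lower()
    for paper in papers:
        year = paper["year"]
        totals[year] += 1
        text = (paper["title"] + " " + paper["abstract"]).lower()
        if needle in text:
            hits[year] += 1
    # Dividing by the yearly total normalises for the growth in publishing.
    return {year: hits[year] / totals[year] for year in sorted(totals)}

# Hypothetical records for illustration only.
papers = [
    {"year": 1995, "title": "Hubble imaging of a cluster", "abstract": "Deep imaging."},
    {"year": 1995, "title": "Radio survey", "abstract": "A survey of radio sources."},
    {"year": 2005, "title": "Infrared sources", "abstract": "Follow-up with Hubble."},
]
trend = term_trend(papers, "hubble")
print(trend)
```

<p>Normalising by the number of papers published each year matters here, because the raw volume of literature has grown so much that almost any term rises in absolute counts.</p>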
<p>Since the meeting, Robert has taken his <a href="http://orbitingfrog.com/post/27983055767/mining-the-astronomical-literature">initial results</a> further and explored his new corpus of data on the astronomical literature. He&#8217;s tried various visualisations of the data, including <a href="http://orbitingfrog.com/post/28143196783/more-astronomy-data-mining-its-word-matrix-time">word matrices</a> for related terms and for various <a href="http://orbitingfrog.com/post/28434621487/astrochemistry-word-matrix">astrochemistry</a> terms.</p>
<div align="center">
<p style="width: 500px; height: auto; padding: 10px; margin: 15px 0 15px 0; border: 1px solid #ddd; font-style: italic; text-align: left;"><a href="http://dl.dropbox.com/u/2068615/Orbiting%20Frog/MAtrix/matrix.html"><img src="http://s.radar.oreilly.com/wp-files/2/2012/08/0812-4-agn.png" alt="Correlation between terms related to Active Galactic Nuclei" width="500" style="margin-bottom: 15px;" /></a><br /><em>Correlation between terms related to Active Galactic Nuclei (AGN). The opacity of each square represents the strength of the correlation between the terms. CREDIT: <a href="http://orbitingfrog.com">Robert Simpson</a>.</em></p>
</div>
<p>He&#8217;s also taken a look at <a href="http://orbitingfrog.com/post/28714839175/authorship-in-astronomy">authorship in astronomy</a> and is starting to find some interesting trends.</p>
<div align="center">
<p style="width: 500px; height: auto; padding: 10px; margin: 15px 0 15px 0; border: 1px solid #ddd; font-style: italic; text-align: left;"><img src="http://s.radar.oreilly.com/wp-files/2/2012/08/0812-5-authors.png" alt="Fraction of astronomical papers published with one, two, three, four or more authors" width="500" style="margin-bottom: 15px;" /><br /><em>Fraction of astronomical papers published with one, two, three, four or more authors. CREDIT: <a href="http://orbitingfrog.com">Robert Simpson</a></em></p>
</div>
<p>You can <a href="http://orbitingfrog.com/post/28714839175/authorship-in-astronomy">see</a> that single-author papers dominated for most of the 20th century. Around 1960, we see the decline begin, as two- and three-author papers begin to become a significant chunk of the whole. In 1978, multi-author papers become more prevalent than single-author papers.</p>
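<p>The breakdown behind a plot like this is straightforward to compute once you have an author count for every paper. The sketch below assumes the input is a list of (year, number of authors) pairs and lumps everything with four or more authors into one bucket; the <code>author_count_fractions</code> helper and the sample values are hypothetical.</p>

```python
from collections import Counter, defaultdict

def author_count_fractions(papers, cap=4):
    """Per year, the fraction of papers with 1, 2, 3 ... or `cap`-or-more authors."""
    by_year = defaultdict(Counter)
    for year, n_authors in papers:
        # Anything at or above the cap lands in the final bucket.
        by_year[year][min(n_authors, cap)] += 1
    return {
        year: {k: counts[k] / sum(counts.values()) for k in range(1, cap + 1)}
        for year, counts in sorted(by_year.items())
    }

# Hypothetical (year, author count) pairs for illustration only.
papers = [
    (1950, 1), (1950, 1), (1950, 2), (1950, 1),
    (2012, 1), (2012, 5), (2012, 3), (2012, 8),
]
fractions = author_count_fractions(papers)
print(fractions)
```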
<div align="center">
<p style="width: 500px; height: auto; padding: 10px; margin: 15px 0 15px 0; border: 1px solid #ddd; font-style: italic; text-align: left;"><img src="http://s.radar.oreilly.com/wp-files/2/2012/08/0812-6-active-researchers.png" alt="Compare the number of active research astronomers to the number of papers published each year" width="500" style="margin-bottom: 15px;" /><br /><em>Compare the number of &#8220;active&#8221; research astronomers to the number of papers published each year (across all the major journals). CREDIT: <a href="http://orbitingfrog.com">Robert Simpson</a>.</em></p>
</div>
<p>Here we see that people begin to outpace papers in the 1960s. This may reflect the fact that as we get more technical as a field, and more specialised, it takes more people to write the same number of papers, which is a sort of interesting result all by itself.</p>
<h2>Interview with Robert Simpson: Behind the project and what lies ahead</h2>
<p>I recently talked with <a href="http://orbitingfrog.com">Rob</a> about the work he, <a href="http://icg.port.ac.uk/~mastersk/">Karen Masters</a>, and <a href="http://sarahaskew.net">Sarah Kendrew</a> did at the meeting, and the work he&#8217;s been doing since with the newly gathered data.</p>
<p><strong>What made you think about data mining the <a href="http://adsabs.harvard.edu/abstract_service.html">ADS</a>?</strong></p>
<p><strong>Robert Simpson:</strong> At the <a href="http://dotastronomy.com/">.Astronomy</a> 4 Hack Day in July, <a href="http://sarahaskew.net/">Sarah Kendrew</a> had the idea to try to do an astronomy version of <a href="http://www.brainscanr.com/">BrainSCANr</a>, a project that generates new hypotheses in the neuroscience literature. I&#8217;ve had a go at mining <a href="http://adsabs.harvard.edu/abstract_service.html">ADS</a> and <a href="http://arxiv.org/">arXiv</a> before, so it seemed like a great excuse to dive back in.</p>
<p><strong>Do you think there might be actual science that could be done here?</strong></p>
<p><strong>Robert Simpson:</strong> Yes, in the form of finding questions that were unexpected. With such large volumes of peer-reviewed papers being produced daily in astronomy, there is a lot being said. Most researchers can only try to keep up with it all &mdash; my daily RSS feed from <a href="http://arxiv.org/">arXiv</a> is next to useless, it&#8217;s so bloated. In amongst all that text, there must be connections and relationships that are being missed by the community at large, hidden in the chatter. Maybe we can develop simple techniques to highlight potential missed links, i.e. generate new hypotheses from the mass of words and data.</p>
<p><strong>Are the results coming out of the work useful for auditing academics?</strong></p>
<p><strong>Robert Simpson:</strong> Well, perhaps, but that would be tricky territory in my opinion. I&#8217;ve only just begun to explore the data around authorship in astronomy. One thing that is clear is that we can see a big trend toward collaborative work. In 2012, only 6% of papers were single-author efforts, compared with 70+% in the 1950s.</p>
<div align="center">
<p style="width: 500px; height: auto; padding: 10px; margin: 15px 0 15px 0; border: 1px solid #ddd; font-style: italic; text-align: left;"><img src="http://s.radar.oreilly.com/wp-files/2/2012/08/0812-7-avg-number-authors.png" alt="The average number of authors per paper since 1827" style="margin-bottom: 15px;"/><br />
<em>The above plot shows the average number of authors per paper since 1827. CREDIT: <a href="http://orbitingfrog.com">Robert Simpson</a>.</em></p>
</div>
<p>We can measure how large groups are becoming, and who is part of which groups. In that sense, we can audit research groups, and maybe individual people. The big issue is keeping track of people through variations in their names and affiliations. Identifying authors is probably a solved problem if we look at <a href="http://about.orcid.org/">ORCID</a>.</p>
<p><strong>What about citations? Can you draw any comparisons with h-index data?</strong></p>
<p><strong>Robert Simpson:</strong> I haven&#8217;t looked at h-index stuff specifically, at least not yet, but citations are fun. I looked at the trends surrounding the term &#8220;dark matter&#8221; and saw something interesting. Mentions of dark matter rise steadily after it first appears in the late &#8217;70s.</p>
<div align="center">
<p style="width: 500px; height: auto; padding: 10px; margin: 15px 0 15px 0; border: 1px solid #ddd; font-style: italic; text-align: left;"><a href="https://dl.dropbox.com/u/2068615/Orbiting%20Frog/sample-trends.html"><img src="http://s.radar.oreilly.com/wp-files/2/2012/08/0812-8-dark-matter.png" alt="Compare the term dark matter with related terms" width="500" style="margin-bottom: 15px;" /></a><br />
<em>Compare the term &#8220;dark matter&#8221; with a few other related terms: &#8220;cosmology,&#8221; &#8220;big bang,&#8221; &#8220;dark energy,&#8221; and &#8220;wmap.&#8221; You can see cosmology has been getting more popular since the 1990s, and dark energy is a recent addition. CREDIT: <a href="http://orbitingfrog.com">Robert Simpson</a>.</em></p>
</div>
<p>In the data, astronomy becomes more and more obsessed with dark matter &mdash; the term appears in 1% of all papers by the end of the &#8217;80s and 6% today. </p>
<p>Looking at citations changes the picture. The community is writing papers about dark matter more and more each year, but they are getting fewer citations than they used to (the peak for this was in the late &#8217;90s). These trends are normalised, so the only recency effect I can think of is that dark matter papers take more than 10 years to become citable. Either that or dark matter studies are currently in a trough for impact.</p>
<p><strong>Can you see where work is dropped by parts of the community and picked up again?</strong></p>
<p><strong>Robert Simpson:</strong> Not yet, but I see what you mean. I need to build a better picture of the community and its components.</p>
<p><strong>Can you build a social graph of astronomers out of this data? What about (academic) family trees?</strong></p>
<p><strong>Robert Simpson:</strong> Identifying unique authors is my next step, followed by creating fingerprints of individuals at a given point in time. When do people create their first-author papers, when do they have the most impact in their careers, stuff like that.</p>
<p><strong>What tools did you use? In hindsight, would you do it differently?</strong></p>
<p><strong>Robert Simpson:</strong> I&#8217;m using <a title="Ruby" href="http://www.ruby-lang.org/en/">Ruby</a> and <a title="Perl" href="http://www.perl.org">Perl</a> to grab the data, <a title="MySQL" href="http://www.mysql.com">MySQL</a> to store and query it, and JavaScript to display it (<a title="Google Charts" href="https://developers.google.com/chart/">Google Charts</a> and <a title="D3.js" href="http://d3js.org">D3.js</a>). I may still move the database part to <a title="MongoDB" href="http://www.mongodb.org">MongoDB</a> because it was designed to store documents. Similarly, I may switch from <a href="http://adsabs.harvard.edu/abstract_service.html">ADS</a> to <a href="http://arxiv.org/">arXiv</a> as the data source. Using <a href="http://arxiv.org/">arXiv</a> would allow me to grab the full text in many cases, even if it does introduce a peer-review issue.</p>
<p><strong>What&#8217;s next?</strong></p>
<p><strong>Robert Simpson:</strong> My aim is still to attempt real hypothesis generation. I&#8217;ve begun the process by investigating correlations between terms in the literature, but I think the power will be in being able to compare all terms with all terms and looking for the unexpected. Terms may correlate indirectly (via a third term, for example), so the entire corpus needs to be processed and optimised to make it work comprehensively.</p>
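<p>To make the idea of an indirect link concrete, here is a minimal sketch. It assumes each term has been mapped to the set of papers that mention it, and it uses Jaccard similarity as a stand-in for whatever correlation measure Robert actually uses. The helper names, the thresholds and the sample terms are all hypothetical.</p>

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity of two sets of paper ids."""
    union = a | b
    return len(a.intersection(b)) / len(union) if union else 0.0

def indirect_links(term_docs, strong=0.3, weak=0.1):
    """Find term pairs that rarely co-occur but share a strongly linked third term."""
    terms = sorted(term_docs)
    sim = {
        frozenset(pair): jaccard(term_docs[pair[0]], term_docs[pair[1]])
        for pair in combinations(terms, 2)
    }
    links = []
    for a, c in combinations(terms, 2):
        if sim[frozenset((a, c))] > weak:
            continue  # already directly linked, so nothing unexpected here
        for b in terms:
            if b in (a, c):
                continue
            if sim[frozenset((a, b))] >= strong and sim[frozenset((b, c))] >= strong:
                links.append((a, b, c))
    return links

# Hypothetical term-to-paper-id sets for illustration only.
term_docs = {
    "alpha": {1, 2, 3},
    "beta": {2, 3, 4, 5},
    "gamma": {4, 5, 6},
}
links = indirect_links(term_docs)
print(links)
```

<p>Comparing all terms with all terms this way grows quadratically with the vocabulary, and the search for a bridging term adds another factor on top, which is one reason the corpus would need to be processed and optimised before this could work comprehensively.</p>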
<h2>Science between the cracks</h2>
<p>I&#8217;m really looking forward to seeing more results coming out of Robert&#8217;s work. This sort of analysis hasn&#8217;t really been possible before. It&#8217;s showing a lot of promise, both from a sociological angle, with the ability to do research into how science is done and how that has changed, and ultimately as a hypothesis engine: something that can generate new science in and of itself. This is just a hack day experiment. Imagine what could be done if the literature were more open and this sort of analysis could be done across fields.</p>
<p>Right now, a lot of the most interesting science is being done in the cracks between disciplines, but the hardest part of that sort of work is often trying to understand the literature of the discipline that isn&#8217;t your own. Robert&#8217;s project offers a lot of hope that this may soon become easier.</p>
]]></content:encoded>
			<wfw:commentRss>http://radar.oreilly.com/2012/08/data-mining-the-literature.html/feed</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
		<item>
		<title>They promised us flying cars</title>
		<link>http://radar.oreilly.com/2012/08/flying-cars-build-future-make-diy.html</link>
		<comments>http://radar.oreilly.com/2012/08/flying-cars-build-future-make-diy.html#comments</comments>
		<pubDate>Fri, 03 Aug 2012 15:00:26 +0000</pubDate>
		<dc:creator>Alasdair Allan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[@editpick]]></category>
		<category><![CDATA[@home]]></category>
		<category><![CDATA[Commercial Space]]></category>
		<category><![CDATA[DIY]]></category>
		<category><![CDATA[Dragon]]></category>
		<category><![CDATA[ISS]]></category>
		<category><![CDATA[Make]]></category>
		<category><![CDATA[maker]]></category>
		<category><![CDATA[nasa]]></category>
		<category><![CDATA[open hardware]]></category>
		<category><![CDATA[Orbit]]></category>
		<category><![CDATA[space]]></category>
		<category><![CDATA[Space Shuttle]]></category>
		<category><![CDATA[Space Station]]></category>
		<category><![CDATA[SpaceX]]></category>

		<guid isPermaLink="false">http://blogs.oreilly.com/radar/2012/05/those-crazy-guys-and-their-fly.html</guid>
		<description><![CDATA[We may be living in the future, but it hasn&#8217;t entirely worked out how we were promised. I remember the predictions clearly: the 21st century was supposed to be full of self-driving cars, personal communicators, replicators and private space ships. &#8230; ]]></description>
				<content:encoded><![CDATA[<p>We may be living in the future, but it hasn&#8217;t entirely worked out how we were promised. I remember the predictions clearly: the 21st century was supposed to be full of self-driving cars, personal communicators, replicators and private space ships.</p>
<p>Except, of course, all that has come true. <a href="http://google.com/">Google</a> just got the <a href="http://arstechnica.com/tech-policy/2012/05/google-gets-license-to-test-drive-autonomous-cars-on-nevada-roads/">first license to drive their cars entirely autonomously on public highways</a>. <a href="http://apple.com/">Apple</a> came along with the <a href="http://apple.com/iphone/">iPhone</a> and changed everything. Three-dimensional printers have come out of the laboratories and <a href="http://www.makerbot.com/">into the home</a>. And in a few short years, and from a standing start, <a href="http://en.wikipedia.org/wiki/Elon_Musk">Elon Musk</a> and <a href="http://www.spacex.com/">SpaceX</a> have achieved what might otherwise have been thought impossible: in late 2010, SpaceX launched a spacecraft and returned it to Earth safely. Then they launched another, successfully docked it with the <a href="http://en.wikipedia.org/wiki/International_Space_Station">International Space Station</a>, and then again returned it to Earth.</p>
<p><iframe width="620" height="349" src="http://www.youtube.com/embed/Lg5vd_Gs0G4?feature=oembed" frameborder="0" allowfullscreen></iframe></p>
<p><em>The SpaceX Dragon capsule is grappled and berthed to the Earth-facing port of the International Space Station&#8217;s Harmony module at 12:02 p.m. EDT, May 25, 2012. Credit: NASA/SpaceX</em></p>
<hr />
<p>Right now there is a generation of high-tech tinkerers breaking the seals on proprietary technology and prototyping new ideas, which is leading to a rapid growth in innovation. The members of this generation, who are building open hardware <a href="http://radar.oreilly.com/2012/07/open-source-won.html">instead of writing open software</a>, seem to have come out of nowhere. Except, of course, they haven&#8217;t. Promised a future they couldn&#8217;t have, they&#8217;ve started to build it. The only difference between them and Elon Musk, Jeff Bezos, Sergey Brin, Larry Page and Steve Jobs is that those guys got to build bigger toys than the rest of us.</p>
<p>The dotcom billionaires are regular geeks just like us. They might be the best of us, or sometimes just the luckiest, but they grew up with the same dreams, and they&#8217;ve finally given up waiting for governments to build the future they were promised when they were kids. They&#8217;re going to build it for themselves.</p>
<p><span id="more-48290"></span>The thing that&#8217;s driving the Maker movement is the same thing that&#8217;s driving bigger shifts, like the next space race. Unlike the old space race, pushed by national pride and the hope that we could run fast enough in place so that we didn&#8217;t have to start a nuclear war, this new space race is being driven by personal pride, ambition and childhood dreams.</p>
<p>But there are some who don&#8217;t see what&#8217;s happening, and they&#8217;re about to miss out. Case in point: a lot of big businesses are confused by the open hardware movement. They don&#8217;t understand it, don&#8217;t think it&#8217;s worth their while to make exceptions and cater to it. Even the so-called &#8220;smart money&#8221; doesn&#8217;t seem to get it. I&#8217;ve heard moderately successful venture capitalists from the Valley say that they <em>&#8220;&#8230; don&#8217;t do hardware.&#8221;</em> Those guys are about to lose their shirts.</p>
<p>Makers are geeks like you and me who have decided to go ahead and build the future themselves because the big corporations and the major governments have so singularly failed to do it for us. Is it any surprise that dotcom billionaires are doing the same? Is it any surprise that the future we build is going to look a lot like the future we were promised and not so much like the future we were heading toward?</p>
<p><strong>Related:</strong></p>
<ul>
<li><a href="http://radar.oreilly.com/2011/05/next-big-thing-web-mobile-data-ubiquitious-computing.html">The next, next big thing</a></li>
<li><a href="http://radar.oreilly.com/2011/05/arduino-open-hardware-movement.html">The secret is to bang the rocks together</a></li>
<li><a href="http://radar.oreilly.com/2012/05/asteroid-mining-ambition.html">Utopia on a budget: A completely practical plan for regaining paradise</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://radar.oreilly.com/2012/08/flying-cars-build-future-make-diy.html/feed</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Tertiary data: Big data&apos;s hidden layer</title>
		<link>http://radar.oreilly.com/2012/03/hidden-data-exhaust-leakage-location.html</link>
		<comments>http://radar.oreilly.com/2012/03/hidden-data-exhaust-leakage-location.html#comments</comments>
		<pubDate>Mon, 19 Mar 2012 13:00:00 +0000</pubDate>
		<dc:creator>Alasdair Allan</dc:creator>
				<category><![CDATA[Data]]></category>
		<category><![CDATA[@editpick]]></category>
		<category><![CDATA[@home]]></category>
		<category><![CDATA[@radaronly]]></category>
		<category><![CDATA[@top]]></category>
		<category><![CDATA[android]]></category>
		<category><![CDATA[Big Data]]></category>
		<category><![CDATA[cache]]></category>
		<category><![CDATA[hidden data]]></category>
		<category><![CDATA[ios]]></category>
		<category><![CDATA[migratory data]]></category>
		<category><![CDATA[mobile]]></category>
		<category><![CDATA[mobile data]]></category>
		<category><![CDATA[tracking]]></category>

		<guid isPermaLink="false">http://blogs.oreilly.com/radar/2012/03/hidden-data-exhaust-leakage-location.html</guid>
		<description><![CDATA[Big data isn&apos;t limited to multi-terabyte datasets or data markets. It also includes the hidden data you carry with you all the time and the growing data on your movements, contacts and social interactions.  ]]></description>
				<content:encoded><![CDATA[<div style="float: right; margin: 3px 0 10px 10px; padding: 2px 4px 0 15px; border-left: 1px solid #ddd;">
<p style="background: #990000; width: 250px; color: #fff; font-size: .9em; font-weight: bold; padding: 2px 0 2px 4px; margin: 0 0 3px 0;">Sections</p>
<ul style="margin-top: 10px; padding-right: 4px;">
<li> <a href="#data-carry">The data you carry with you</a></li>
<li> <a href="#data-leakage">Data leakage</a></li>
<li> <a href="#location">Location, location, location</a></li>
<li> <a href="#data-exhaust">Data exhaust</a></li>
<li> <a href="#enemies">Keep your friends close, and<br /> your enemies closer</a></li>
<li> <a href="#data-sharing">Data sharing</a></li>
<li> <a href="#building-platforms">Building platforms</a></li>
<li> <a href="#a-thought">A thought to ponder</a></li>
</ul>
</div>
<p>Big data isn&#8217;t just about multi-terabyte datasets hidden inside eventually consistent distributed databases in the cloud, or enterprise-scale data warehousing, or even the emerging market in data. It&#8217;s also about the hidden data you carry with you all the time; the slowly growing datasets on your movements, contacts and social interactions.</p>
<p>Until recently, most people&#8217;s understanding of what can actually be done with the data collected about us by our own cell phones was theoretical. There were few real-world examples. But over the last couple of years, this has changed dramatically. Courting hubris perhaps, but I must admit it&#8217;s <a href="http://www.guardian.co.uk/technology/2011/apr/20/iphone-tracking-prompts-privacy-fears">possible some of that was my fault</a>, though I haven&#8217;t been alone.</p>
<h2 id="data-carry">The data you carry with you</h2>
<p>You probably think you know how much data you carry around with you on your cell phone. You&#8217;ll certainly be aware of it if you&#8217;ve ever lost your phone, or had it stolen, or it&#8217;s just plain stopped working. But there is a large amount of data in the background that isn&#8217;t surfaced in the user interface.</p>
<p>We know about what I generally call <strong>primary data</strong>:  calendars, address books, photographs, SMS messages and browser bookmarks. These are usually user generated, and we&#8217;d be pretty unhappy if we lost them. There is also the <strong>secondary data</strong> that the phone generates about us:  call history, voice mail, usage information and records of our current and past locations. Most of what I&#8217;d call secondary data is surfaced to us in our phone&#8217;s user interface. We generally can&#8217;t change this sort of information without resetting the phone to a factory fresh condition; it&#8217;s generated by the device for us, it&#8217;s not something we generate ourselves.</p>
<p>But there is also what I refer to as <strong>tertiary data</strong>. This is data that, similar to the examples I mentioned above, is generated about us, rather than by us. Mostly, this data consists of cache files: data that the device needs in order to work, or that significantly improves your user experience, but that you don&#8217;t necessarily know is there, at least until some hole is found in the operating system that exposes that data layer to you. That&#8217;s <a href="http://radar.oreilly.com/2011/04/apple-location-tracking.html">happened before</a>, after all.</p>
<p>An obvious example is tucked in your photographs. Every picture you take is geotagged and date stamped, and if you publish your pictures to a photo-sharing site without stripping that information, you&#8217;re leaking data. Back in 2007, when <a href="http://www.army.mil/article/75165/Geotagging_poses_security_risks/">geotagged photographs of newly arrived helicopters at a U.S. Army base in Iraq</a> were published to the Internet, they allowed insurgents to determine the exact location of the helicopters inside the compound and conduct a mortar attack. Four of the AH-64 Apaches on the flight line were destroyed in the attack.</p>
<h2 id="data-leakage">Data leakage</h2>
<p>Recently, there have been a number of high-profile cases of data leakage, and the one that has <a href="http://mclov.in/2012/02/08/path-uploads-your-entire-address-book-to-their-servers.html">raised the most controversy</a>, at least until the next time, is the social network <a href="https://path.com/">Path</a>.</p>
<p>When you opened the Path application on your phone, it automatically uploaded your address book to Path&#8217;s servers so it could find <em>&#8220;friends&#8221;</em> that you might want to connect to. It did this without asking for explicit permission, or even implicit permission for that matter. Path has since apologized and updated its application so that it now asks permission before pushing your address book to its servers. </p>
<p>This was not data theft, but data leakage. You asked the application to accomplish something and didn&#8217;t really ask yourself how it was doing it. While there are technical solutions that <a href="http://mattgemmell.com/2012/02/11/hashing-for-privacy-in-social-apps/">don&#8217;t involve uploading your address book</a>, the laziest solution is probably what you should have expected. I can almost hear the developers, <em>&#8220;&#8230; we&#8217;ll just upload the address book for now and switch to hashing later on when we have time.&#8221;</em></p>
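<p>As a rough illustration of the hashing approach, here is a minimal Python sketch. It is not Path&#8217;s code, and not necessarily the scheme described in the linked post; the function names and the shared application key are assumptions made for the example. Each address is normalized and hashed on the device, and only the digests are sent for matching. Hashing alone is not a complete answer, since the space of e-mail addresses and phone numbers is small enough to guess through, but it does keep the raw address book off the server.</p>

```python
import hashlib
import hmac


def normalize_email(email):
    """Lower-case and trim an address so equal addresses hash equally."""
    return email.strip().lower()


def hash_contact(email, app_key):
    """Return a keyed SHA-256 digest (hex) of a normalized e-mail address."""
    message = normalize_email(email).encode("utf-8")
    return hmac.new(app_key, message, hashlib.sha256).hexdigest()


def hash_address_book(address_book, app_key):
    """Map each digest back to its local contact; only the digests get uploaded."""
    return {hash_contact(email, app_key): email for email in address_book}


def match_friends(uploaded_digests, member_digests):
    """Server side: return the uploaded digests that belong to existing members."""
    return set(uploaded_digests).intersection(member_digests)
```

<p>The phone keeps the digest-to-contact mapping, so it can turn any digests the server reports back into names without the server ever seeing the names itself.</p>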
<p>There has been a lot of comment that somehow the whole Path thing was unexpected. Realistically, that&#8217;s not the case. It&#8217;s not an isolated circumstance, either. To the best of my knowledge <a href="http://techcrunch.com/2012/02/08/hipster-ceo-also-apologizes-for-address-book-gate-calls-for-application-privacy-summit-guest-post/">Hipster</a> and <a href="http://www.theverge.com/2012/2/14/2798008/ios-apps-and-the-address-book-what-you-need-to-know">other apps</a> also tapped your address book behind the scenes without asking permission. Interestingly, there are other, less obvious, culprits. Applications that make use of <a href="http://www.chillingo.com/">Chillingo&#8217;s</a> &#8220;<a href="http://www.chillingo.com/crystal/">Crystal</a>&#8221; game service, like <a href="http://www.angrybirds.com/">Angry Birds</a>, will in some circumstances also <a href="http://www.thedaily.com/page/2012/02/16/021612-tech-apps-security-1-2/">upload your address book</a>. While there is a button to push, it is, at least for me, misleadingly labelled and doesn&#8217;t suggest what&#8217;s going to happen next.</p>
<p>Data leakage like this is not really a solvable problem at the user level, at least not in real-time. Having multiple permission boxes pop up at regular intervals is a bad design choice; users stop reading them, they lose importance and become ineffective. Just try using <a href="http://windows.microsoft.com/">Microsoft Windows</a> and you&#8217;ll understand exactly what I mean. <a href="http://en.wikipedia.org/wiki/Modal_window">Modal interrupts</a> should be reserved for vital time-critical issues. They&#8217;re already used far too prolifically in <a href="http://www.apple.com/ios/">iOS</a>. Run the Mail application with multiple mail accounts configured when you&#8217;re not connected to the network and that&#8217;ll become instantly obvious. You&#8217;ll be bombarded by error messages.</p>
<p>I did have a thought that you might be able to deploy a <a href="http://mitmproxy.org/">customized web proxy</a> directly onto your mobile device and have all web requests directed through it. The proxy would sift through the outgoing network connections in a (semi-)Bayesian manner looking for data that you don&#8217;t want transmitted and stop the application cold before it  sends it to the remote server. Basically, it&#8217;s acting as a reverse spam filter, or a smart firewall, depending on how you want to think about it. </p>
<p>I think that something like this could well be far more effective at stopping data leakage than the current solution, which <a href="http://google.com">Google</a> has used on <a href="http://www.android.com/">Android</a>. Permissions pages when you initially install an application are all very well, but most people don&#8217;t read them, and when you&#8217;re installing an application you&#8217;re not really thinking about why it might need certain permissions. Like modal dialogs on the iPhone, you subconsciously start to ignore them, to your peril. However, you can be very clear about what data you don&#8217;t want leaving your phone. The proxy would need just one configuration page, rather than a separate permissions page every time you install an application.</p>
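<p>To make the proxy idea a little more concrete, here is a minimal Python sketch of the check such a filter might run on each outgoing request. It uses plain substring matching rather than anything Bayesian, and the function names and the list of protected values are hypothetical; a real filter would also have to cope with encrypted connections and other encodings.</p>

```python
from urllib.parse import quote, quote_plus


def leaked_values(body, protected):
    """Return the protected values that appear in an outgoing request body."""
    lowered = body.lower()
    found = []
    for value in protected:
        # Check the raw value and its common URL-encoded forms.
        variants = {value, quote(value, safe=""), quote_plus(value)}
        if any(variant.lower() in lowered for variant in variants):
            found.append(value)
    return found


def should_block(body, protected):
    """Block the request if any protected value would leave the device."""
    return bool(leaked_values(body, protected))
```

<p>The single configuration page would then amount to maintaining the <code>protected</code> list: the phone numbers, addresses and names you never want to see leave the device.</p>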
<h2 id="location">Location, location, location</h2>
<p>Of course, I can&#8217;t really talk about data leakage without mentioning the <a href="http://radar.oreilly.com/2011/04/apple-location-tracking.html">kerfuffle surrounding location and data privacy</a> that happened just about this time last year. Unsurprisingly, the file in question still exists, despite some of the press stories; the existence of the file was never the problem. A cache of that nature is fairly necessary if you want to have reliable and timely location services on your phone. However, the file is now actually just that, a cache, and it is regularly swept clean by the operating system. It&#8217;s also no longer included in your usually unencrypted backups to your laptop; its presence there was perhaps more of a problem than the fact it wasn&#8217;t being cleared out in the first place.</p>
<p class="image-box-580"><a href="http://radar.oreilly.com/assets_c/2011/04/DC%20and%20NY.html"><img src="http://s.radar.oreilly.com/2011/04/20/042011-iphonetracker.png" border="0" alt="iPhoneTracker screen" width="580" style="margin-bottom: 15px;" /></a><br />
A visualization of iPhone location data. <a href="http://radar.oreilly.com/assets_c/2011/04/DC%20and%20NY.html">Click to enlarge</a>.</p>
<p>What Apple was doing was taking a piece of tertiary data, generated about you by the device, and then exposing it on a platform (laptop or desktop) where accessing that data was easier. There are a lot of people who know how to navigate a file system on a computer, but a lot fewer who would know how to get the same data directly from the phone itself. It was a classic case of data leakage: data moved from a secure(ish) environment on the phone to a less secure one on the computer.</p>
<h2 id="data-exhaust">Data exhaust</h2>
<p>Back in the days of floppy disks, the lines of ownership were pretty clear. If you had the disk, the data was yours. If someone else had it, it was theirs. Things these days are much blurrier. That tertiary data &mdash; data that&#8217;s generated about us but not by us  &mdash; doesn&#8217;t just build up on your mobile devices of course. Other people are building datasets about our patterns of movement, buying decisions, credit worthiness and other things. The ability to compile these sorts of datasets left the realm of major governments with the invention of the computer. </p>
<p>We&#8217;re all aware of this, and there&#8217;s even a provocative buzzword to describe it: data exhaust. It&#8217;s the data we leave behind us, rather than carry with us.</p>
<p>In the U.S., data from <a href="http://www.villagevoice.com/2002-07-23/news/buying-trouble/">grocery store loyalty schemes</a> has been used by security services to search for terrorist suspects. Turns out the number of toilet rolls you buy can be quite telling.</p>
<p>Which does make me think, instead of being afraid of the data exhaust, perhaps we should embrace it. In the U.K., the biggest retailer is the supermarket <a href="http://www.tesco.com/">Tesco</a>. Like many, I spend a good fraction of my income there, and like almost everyone I know, I have a <a href="http://www.tesco.com/clubcard/clubcard/">Tesco Clubcard</a>. This is a loyalty card that has a record of (almost) every purchase I make, from toilet rolls to roast chicken.</p>
<p>I&#8217;d actually pay good money for a copy of my own Clubcard data, so long as it was actually in a machine-readable format, not on paper. Although for Tesco, the data is only really interesting in aggregate; it&#8217;s the fact that they have millions of Clubcard records that makes the dataset useful to the company. To me, a history of my purchases would be useful data.</p>
<p>Of course, people have already started selling our  data exhaust back to us. Think about your <a href="http://www.creditreport.com/">credit report</a>, for instance.</p>
<h2 id="enemies">Keep your friends close, and your enemies closer</h2>
<p>It&#8217;s not just your own data exhaust that you have to worry about. There was an <a href="http://www.cs.rochester.edu/~sadilek/publications/click.php?id=http://www.cs.rochester.edu/~sadilek/publications/Sadilek-Kautz-Bigham_Finding-Your-Friends-and-Following-Them-to-Where-You-Are_WSDM-12.pdf">interesting paper</a> recently by <a href="http://www.cs.rochester.edu/~sadilek/">Adam Sadilek</a> of the Department of Computer Science at the University of Rochester. It talked about <a href="http://www.newscientist.com/article/mg21328495.500-what-your-online-friends-reveal-about-where-you-are.html">how geotagged tweets could be used to locate individuals</a>, even if they themselves didn&#8217;t geotag their tweets &mdash; it was enough that their friends did so.</p>
<p class="image-box-580"><iframe width="580" height="325" src="http://www.youtube.com/embed/eepKvmknGr8" frameborder="0" allowfullscreen></iframe><br/><br />
<em>Geotagged messages on Twitter during a typical weekday afternoon in New York City.</em></p>
<p>The paper found that only a couple of weeks&#8217; worth of location data on an individual, combined with location data from their two most-sharing friends, was enough to place that person within a 100-meter radius with 77% accuracy. That rises to nearly 85% when you combine information from nine friends. </p>
<p>Even someone who has never shared their location at all can be pinpointed with 47% accuracy from information available from two friends. That goes up to 57% when you include nine friends.</p>
<h2 id="data-sharing">Data sharing</h2>
<p>There is a great debate going on right now, which is really only starting to surface into the mainstream press, about how we share data. Despite social networks becoming mainstream, the recent privacy debacles in the mobile space say a lot about how users perceive information privacy. I think Sadilek&#8217;s <a href="http://www.cs.rochester.edu/~sadilek/publications/click.php?id=http://www.cs.rochester.edu/~sadilek/publications/Sadilek-Kautz-Bigham_Finding-Your-Friends-and-Following-Them-to-Where-You-Are_WSDM-12.pdf">paper</a> presents even more compelling evidence.</p>
<p>For instance, I&#8217;m finding <a href="http://google.com/">Google&#8217;s</a> new <a href="http://support.google.com/mobile/bin/answer.py?hl=en&#038;answer=1304818">Instant Upload</a> feature, where photos taken on my phone are automatically uploaded to <a href="http://plus.google.com/">Google+</a> behind the scenes, a lot spookier and more worrying than I expected. It&#8217;s especially interesting that I&#8217;m feeling that way, as I&#8217;m using Apple&#8217;s <a href="http://www.apple.com/icloud/features/photo-stream.html">Photo Stream</a> without thinking or worrying about it that much.</p>
<p>I&#8217;m trying to figure out whether it&#8217;s because the privacy trade-off &mdash; in Apple&#8217;s case sharing my photos between all my devices, and in Google&#8217;s case making my photos more-or-less instantly available for sharing in Google+ &mdash; is more obviously in my favor with Photo Stream, or it&#8217;s for other reasons.</p>
<p>The interesting thing here is that Photo Stream and Instant Upload are, at least behind the scenes, effectively identical. Both are cloud services and your photos are stored in a data center somewhere. The master copies of your photos have essentially been moved to the cloud, rather than residing on your device.</p>
<p>However, because of the context these two services operate in, I have no problems with one, and I&#8217;m finding the other an uncomfortable fit. I think there&#8217;s a big lesson there for people dealing with personal information. When you&#8217;re sharing someone&#8217;s information, even with their informed consent, the context shapes how they think about the implications of that sharing.</p>
<h2 id="building-platforms">Building platforms</h2>
<p>So, all of this got me thinking. There are large personal datasets about me, and you, and everyone, being built up by large companies. But we&#8217;re also building up datasets about ourselves, in our own control. What happens if we mash them together? Can we actually do something productive?</p>
<p>I&#8217;m currently running an interesting experiment with my credit card and my iPhone. I&#8217;m scraping my bank&#8217;s website to grab transaction data in near real-time onto one of my servers. Each transaction comes with a postcode. This is like a U.S. zip code, but it normally specifies a much smaller neighborhood, perhaps down to a single street or smaller in a major urban area. </p>
<div align="center">
<p class="image-box-300"><img src="http://cdn.oreilly.com/radar/images/posts/0312-credit-card-app-iphone.jpg" width="300" alt="Watching credit card transactions in real-time on an iPhone" border="0" style="margin-bottom: 15px;" /><br />
<em>Watching my credit card transactions in real-time.</em></p>
</div>
<p>On my iPhone, I&#8217;m running an application that continually monitors my location using the <a href="https://developer.apple.com/library/ios/#documentation/CoreLocation/Reference/CLLocationManager_Class/CLLocationManager/CLLocationManager.html">Significant Location Change</a> service, so my phone knows my location to better than 1km (perhaps much better in a crowded city) more or less all of the time.</p>
<p>Every time a new transaction occurs, I forward it via push notification from the back-end server to my iPhone. Now, my iPhone knows both the location where the transaction took place and where I actually was at the time. If those locations don&#8217;t match, then this indicates there might have been a fraudulent transaction and it flags it for me with a notification.</p>
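<p>The comparison itself is simple. As a minimal Python sketch, assuming the transaction postcode has already been turned into a latitude and longitude (for example the centroid of the postcode area), and with a hypothetical 5km threshold, it could look like this:</p>

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, using the haversine formula."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def looks_fraudulent(transaction_pos, phone_pos, threshold_km=5.0):
    """Flag a transaction whose postcode is far from the phone's last known fix."""
    return distance_km(*transaction_pos, *phone_pos) > threshold_km
```

<p>Since the phone&#8217;s fix is only good to a kilometer or so, the threshold has to sit comfortably above that, or every purchase would raise a flag.</p>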
<p>The interesting thing here is that I&#8217;m using data that my credit card company doesn&#8217;t have, and hopefully will never have: my actual physical location when the transaction took place. They couldn&#8217;t possibly provide this service to me because they simply don&#8217;t have the data I have.</p>
<p>Of course, there are false positives. Online transactions in particular stand out. Most of these are tagged with a postcode of the headquarters of the company I&#8217;m dealing with. However, my next development step will be to give my back-end server code access to my inbox and allow it to scrape for online transaction receipts. This should reduce the false-positive rate down to something vanishingly small, and I should be able to deal with those left over with some sort of machine learning. After all, there&#8217;s a human-readable string attached to each transaction that details the retailer and sometimes other useful information.</p>
<h2 id="a-thought">A thought to ponder</h2>
<p>A thought to ponder in the dead of night: In the near future, the absence of data is going to be increasingly unusual. If you think the data exhaust you leave behind yourself is wide and varied, then just you wait, because we&#8217;re at the <a href="http://radar.oreilly.com/2011/05/arduino-open-hardware-movement.html">banging-the-rocks-together</a> stage right now.</p>
<p>If your data exhaust becomes assumed, what happens if you turn your phone off for an hour or two one night? What if you&#8217;re accused of a murder during that time period, and you can&#8217;t prove where you were? Perhaps in the future that&#8217;s going to be sufficiently unusual that it&#8217;s automatically suspicious. Innocent until proven guilty may underlie our current legal system, but that&#8217;s because our current legal system was codified in a very different era, one that was data poor rather than data rich. Perhaps in the future, the absence of data will imply guilt.</p>
<p class="image-box-580"><iframe width="580" height="325" src="http://www.youtube.com/embed/Xw5XFajQZZI" frameborder="0" allowfullscreen></iframe></p>
<p><em>I discussed hidden data in this interview at <a href="http://strataconf.com">Strata CA 2012</a>.</em></p>
<p><strong>Related:</strong></p>
<ul>
<li> <a href="http://radar.oreilly.com/2011/04/apple-location-tracking.html">Got an iPhone or 3G iPad? Apple is recording your moves</a> (April 2011)</li>
<li> <a href="http://radar.oreilly.com/2011/08/meat-to-math-ratio.html">The Meat to Math ratio</a></li>
<li> <a href="http://radar.oreilly.com/2010/12/strata-gems-what-your-inbox-knows.html">What your inbox knows</a></li>
<li> <a href="http://radar.oreilly.com/2011/02/big-data-metaphor.html">Big Data: An opportunity in search of a metaphor</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://radar.oreilly.com/2012/03/hidden-data-exhaust-leakage-location.html/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Fighting the next mobile war</title>
		<link>http://radar.oreilly.com/2011/09/next-mobile-war-external-accessory.html</link>
		<comments>http://radar.oreilly.com/2011/09/next-mobile-war-external-accessory.html#comments</comments>
		<pubDate>Wed, 28 Sep 2011 14:00:00 +0000</pubDate>
		<dc:creator>Alasdair Allan</dc:creator>
				<category><![CDATA[Mobile]]></category>
		<category><![CDATA[@editpick]]></category>
		<category><![CDATA[@home]]></category>
		<category><![CDATA[@top]]></category>
		<category><![CDATA[android]]></category>
		<category><![CDATA[apple]]></category>
		<category><![CDATA[arduino]]></category>
		<category><![CDATA[external accessory]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[hacking]]></category>
		<category><![CDATA[ios]]></category>
		<category><![CDATA[iPad]]></category>
		<category><![CDATA[iPhone]]></category>
		<category><![CDATA[makers]]></category>
		<category><![CDATA[sensors]]></category>

		<guid isPermaLink="false">http://blogs.oreilly.com/radar/2011/09/next-mobile-war-external-accessory.html</guid>
		<description><![CDATA[While you&apos;ll likely interact with your smartphone tomorrow in much the same way you interacted with it today, it&apos;s quite possible that your smartphone will interact with the world in a very different way. The next mobile war has already begun. ]]></description>
				<content:encoded><![CDATA[<p>It&#8217;s arguable that with the arrival of touch displays, the current form factor for the smartphone is going to be with us for some time to come. You can&#8217;t get much simpler than a solid block of glass and aluminum with a button. Unless you remove the button. Thinking about it, that&#8217;s probably a solid suggestion &mdash; I&#8217;d look for that next.</p>
<p>If things aren&#8217;t going to change very much on the surface, underneath the glass things might not be much different either. Oh, the devices will be faster, and they&#8217;ll have more cores, better displays, faster network connections, and the batteries will last longer. But fundamentally, they&#8217;ll still be the same. The device won&#8217;t provide you with any new levers on the world. With the exception of <a href="http://developer.android.com/guide/topics/nfc/index.html">NFC</a>, which admittedly is a big exception, there are no new sensory modalities on the horizon that are likely to be integrated into handsets. You&#8217;ll interact with your smartphone tomorrow in much the same way you interact with it today, at least in the near term.</p>
<p>That said, it&#8217;s quite possible that your smartphone will interact with the <em>world</em> in a very different way. That&#8217;s because the next mobile war has already begun, and you&#8217;ve seen nothing yet.</p>
<h2>The phoney war</h2>
<p>It began quietly, with little noise or fanfare, just over two years ago with Apple&#8217;s announcement of iOS 3, the <a href="http://developer.apple.com/library/ios/#featuredarticles/ExternalAccessoryPT/Introduction/Introduction.html">External Accessory Framework</a>, and the opportunity for partners in the <a href="http://developer.apple.com/programs/mfi/">MFi program</a> to build external hardware that connected directly to the iPhone. </p>
<p>For the first time, it was easy, at least for certain values of easy, to build sensor hardware that connected to a mass-market mobile device. And for the first time, the mobile device had enough computing power and screen real estate to do something interesting with the sensor data.</p>
<p>Except of course, it wasn&#8217;t easy. While initially the External Accessory Framework was seen as having the potential to open up Apple&#8217;s platform to a host of external hardware and sensors, little of the innovation people were expecting actually occurred. Much of the blame was laid squarely at the feet of Apple&#8217;s own MFi program. </p>
<p>There was some headway made using the devices as sensor gateways, mainly <a href="http://www.airstriptech.com/">in the medical community</a>, which Apple had initially pushed heavily during the launch. But in the end, the framework was used to support a fairly predictable range of audio and video accessories from big-name manufacturers &mdash; although more recently there have been a few <a href="http://www.oscium.com/products/imso-104">notable exceptions</a>.</p>
<h2>Opening a second front</h2>
<p>Things stayed quiet until earlier this year when Google announced the Android <a href="http://developer.android.com/guide/topics/usb/adk.html">Accessory Development Kit</a> (ADK) at <a href="http://www.google.com/events/io/2011/">Google I/O</a> in May. </p>
<p>While there was a lot of <a href="http://romfont.com/2011/05/11/a-closer-look-at-googles-open-accessory-development-kit/">criticism of Google&#8217;s approach</a>, it was <a href="http://blog.makezine.com/archive/2011/05/why-google-choosing-arduino-matters-and-the-end-of-made-for-ipod-tm.html">justifiably hailed</a> as a disruptive move by Google in what had become a fairly stagnant accessories market. Philip Torrone hit the right note when he speculated that this might mean the end of Apple&#8217;s restrictive MFi program.</p>
<p>I&#8217;ve talked <a href="http://radar.oreilly.com/2011/05/arduino-open-hardware-movement.html">about the Arduino here before</a>. It allows rapid, cheap prototyping for embedded systems. Making Android the default platform for development of novel hardware was a brilliant move by Google. Maybe just a little too brilliant.</p>
<h2>The counterattack by Apple</h2>
<p>Around the middle of the year, right in the middle of Apple&#8217;s <a href="http://developer.apple.com/wwdc/">WWDC conference</a>, I was approached by <a href="http://redpark.com/">Redpark</a> and sworn to secrecy. Apple was on the brink of approving a <a href="http://redpark.com/c2db9.html">serial cable for iOS</a> that they would let Redpark sell into the hobbyist market.</p>
<p>I&#8217;d known about the existence of the cable since the preceding November with the release of the <a href="http://www.southernstars.com/products/skywire/index.html">SkyWire</a> telescope control kit. I&#8217;d begged Redpark for developer access to their cable, and after signing a thick stack of NDAs, I got my hands on one around mid-December. At the time there seemed little chance of Apple ever approving the cable except for specific use cases where the cable and an accompanying iOS application were approved together as part of the MFi program &mdash; exactly as Apple had for Skywire <a href="http://www.southernstars.com/products/skywire/index.html">for telescopes</a> and Cisco had <a href="http://www.redpark.com/c2rj45.html">for networking gear</a>.</p>
<p>The news that the cable might soon be generally available to hobbyists was surprising. Despite Apple&#8217;s beginnings &mdash; and the large community of indie developers surrounding its products &mdash; the hobbyist market isn&#8217;t something Apple is known for caring about these days. Quite the opposite: Apple is notorious for keeping its products as closed as possible.</p>
<p class="image-box-580"><iframe src="http://player.vimeo.com/video/26792608?title=0&amp;byline=0&amp;portrait=0" width="580" height="326" frameborder="0" webkitAllowFullScreen allowFullScreen></iframe></p>
<p>Controlling an Arduino with an iPhone.</p>
<p>Close on the heels of Google&#8217;s ADK announcement, Apple&#8217;s sudden enthusiasm was suspiciously timed. Someone high up at Apple had obviously realized the disruptive nature of the ADK and this was their response, their counter-attack. Despite the Android ADK actually being an Arduino, it was now easier to talk to an Arduino from iOS using Redpark&#8217;s cable than it was to talk to an Arduino from Android.</p>
<h2>The long war</h2>
<p>The Android ADK board is <a href="http://store.arduino.cc/eu/index.php?main_page=product_info&#038;cPath=11_12&#038;products_id=144">only now appearing</a> in large numbers as the open hardware community gears up to produce compatible boards cheaper than Google&#8217;s ruinously expensive initial batch of &#8220;official&#8221; developer boards.  The Redpark cable also faced supply issues, with the initial production run selling out <a href="http://www.makershed.com/ProductDetails.asp?ProductCode=MSRP01">on the Maker Shed</a> within a few days. We&#8217;re only now seeing it in larger volumes. So, despite appearances, it&#8217;s still the early days. </p>
<p class="image-box-580"><iframe width="580" height="325" src="http://www.youtube.com/embed/Lz33JpLUdjQ" frameborder="0" allowfullscreen></iframe></p>
<p>Discussing the Redpark cable at OSCON 2011.</p>
<p>I think the availability of both these products is going to prove to be amazingly disruptive in the longer term. After spending two days at the recent <a href="http://makerfaire.com/newyork/2011/">World Maker Faire</a> in New York, I know there&#8217;s a lot of enthusiasm inside the Maker community for that disruption &mdash; and Apple may have the edge. </p>
<p>Because of Apple&#8217;s policy restrictions, you can only develop applications that work with Redpark&#8217;s cable for your own personal use or for distribution inside an enterprise environment without going through the MFi program. The ease of use and popularity of the iOS platform with developers means there will still be a big uptake, and after a few people struggle through the process, I think that, with time, the cable will spell the end of the MFi program.</p>
<p>Over the next couple of years, we&#8217;ll be seeing some real innovation in the external accessory product space. Rapid prototyping combined with ease of access to increasingly powerful mobile platforms means that the next mobile war, and the <a href="http://radar.oreilly.com/2011/05/next-big-thing-web-mobile-data-ubiquitious-computing.html">next big thing</a> of a real ubiquitous computing environment, is just around the corner.</p>
<p><strong>Related:</strong></p>
<ul>
<li> <a href="http://radar.oreilly.com/2011/05/arduino-open-hardware-movement.html">Arduino is a building block for the world to come</a></li>
<li> <a href="http://radar.oreilly.com/2011/05/next-big-thing-web-mobile-data-ubiquitious-computing.html">The next, next big thing</a></li>
<li> <a href="http://radar.oreilly.com/2011/05/google-io-2011-5-things-to-watch.html#adk">Google I/O 2011: The Accessory Development Kit is a big deal</a></li>
<li> <a href="http://blog.makezine.com/archive/2011/05/why-google-choosing-arduino-matters-and-the-end-of-made-for-ipod-tm.html">Why Google Choosing Arduino Matters</a></li>
<li> <a href="http://answers.oreilly.com/topic/1624-parallel-programming-arduino-and-the-good-kind-of-trouble/">Parallel programming, Arduino and the good kind of trouble</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://radar.oreilly.com/2011/09/next-mobile-war-external-accessory.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Apple and a web-free cloud</title>
		<link>http://radar.oreilly.com/2011/06/apple-icloud-control-google-amazon.html</link>
		<comments>http://radar.oreilly.com/2011/06/apple-icloud-control-google-amazon.html#comments</comments>
		<pubDate>Thu, 16 Jun 2011 15:00:00 +0000</pubDate>
		<dc:creator>Alasdair Allan</dc:creator>
				<category><![CDATA[Mobile]]></category>
		<category><![CDATA[Web 2.0]]></category>
		<category><![CDATA[@home]]></category>
		<category><![CDATA[amazon]]></category>
		<category><![CDATA[apple]]></category>
		<category><![CDATA[cloud]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[platform]]></category>
		<category><![CDATA[services]]></category>

		<guid isPermaLink="false">http://blogs.oreilly.com/radar/2011/06/apple-icloud-control-google-amazon.html</guid>
		<description><![CDATA[From custom chips, to the data centers backing its new iCloud effort, Apple is committed to controlling the end-user experience. The web has no place in their vision. ]]></description>
				<content:encoded><![CDATA[<p><img src="http://s.radar.oreilly.com/2011/06/08/0611-icloud.png" border="0" alt="iCloud" width="200" style="float: right;margin: 3px 0 10px 10px" />The nature of <a href="http://www.apple.com/">Apple&#8217;s</a> new <a href="http://www.apple.com/icloud/">iCloud</a> service, announced at WWDC, is perhaps more interesting than it seems. It hints very firmly at the company&#8217;s longer-term strategy; a strategy that doesn&#8217;t involve the web.</p>
<p>Apple will join <a href="http://www.google.com/">Google</a> and <a href="http://aws.amazon.com/">Amazon</a> as a major player in cloud computing. The 200 million iTunes users Apple brings with it put the company on the same level as those other platforms. Despite that, the three companies obviously see the cloud in very different ways, and as a result have very different strategies.</p>
<p>Amazon is the odd man out. Their cloud offering is bare metal, contrasting sharply with Google&#8217;s, and now Apple&#8217;s, document-based model. To be fair, Amazon&#8217;s target market is very different, with their focus on service providers. If you&#8217;re a Valley start-up looking for storage and servers, you need look no further than Amazon&#8217;s Web Services platform.</p>
<p>Google and Apple&#8217;s document model contrasts sharply with Amazon&#8217;s service-stack approach. Both Google and Apple have attempted to abstract away things, like the file system, which stand between the end user and their data. The difference is perhaps unsurprising: Google and Apple are consumer-facing companies that are marketing to the final end user rather than the people and companies who aim to provide services <em>for</em> those users.</p>
<p>But that&#8217;s where the similarity between Google and Apple breaks down. Google sees the cloud as a way to deprecate general purpose computers in the hands of their users. In the same way that their new <a href="http://www.chromium.org/chromium-os">Chromium OS</a> is built for the web, their cloud strategy is an attempt to move Google&#8217;s users away from native applications so that their applications and data live in Google&#8217;s cloud of services.  Perhaps coincidentally, this also gives Google the chance to display and target their advertising even more cleverly.</p>
<p>Apple&#8217;s approach is almost entirely the opposite. They see the cloud as a way to keep the general purpose computer on life support for a few more years until touch-based hardware is really ready to take over. Apple&#8217;s new cloud platform is <a href="http://diogenex.tumblr.com/">built for native applications</a>, in an attempt to pull users into native apps designed for their platforms. This method also gives Apple the chance to sell hardware, applications, and content that will lock users into their platform even more firmly. This is the basis of the often-remarked &#8220;<a href="http://en.wikipedia.org/wiki/Halo_effect">halo effect</a>.&#8221;</p>
<p>At least on the surface things seem to be simple &mdash; the &#8220;why&#8221; of the thing is not in question. However, it&#8217;s what hasn&#8217;t been said, at least openly, that raises the most interesting questions.</p>
<p>Apple is fundamentally platform orientated. It&#8217;s deep in their company genetics. The ill-fated <a href="http://en.wikipedia.org/wiki/Macintosh_clone">official cloning program</a> from the mid-&#8217;90s, which was brought to a screeching halt by the <a href="http://en.wikipedia.org/wiki/Steve_Jobs#Return_to_Apple">return of Steve Jobs</a>, seems to have set a deep fear inside the company about letting someone else control anything that might stand between the company and direct access to their customers.</p>
<p>At least to me, nothing confirms that mindset more than Apple&#8217;s return to designing their own processors in-house in Cupertino. Apple has <a href="http://www.roughlydrafted.com/2008/04/28/how-apples-pa-semi-acquisition-fits-into-its-chip-history/">a long history</a> of using its own custom silicon, but it&#8217;s been more than five years since Apple has done so. With the move to Intel, the hope was to delegate nearly all of Apple&#8217;s custom chip development. Unfortunately, that proved to be a stumbling block when Apple built the first generation iPhone. The <a href="http://www.samsung.com/">Samsung</a> H1 processor in the <a href="http://en.wikipedia.org/wiki/IPhone_(original)">original model</a> wasn&#8217;t quite what Apple wanted, even though it was what had been asked for, and I think the return to custom silicon probably brought a sigh of relief in some corners of the company.</p>
<p>The link between custom chips and the cloud may seem tenuous at first glance, but I think Apple&#8217;s return to designing their own silicon is telling. Almost as telling as spending half a billion dollars on a custom <a href="http://maps.google.com/maps/place?cid=4094693220179467430&amp;q=Apple,+Maiden,+North+Carolina&amp;hl=en&amp;sll=35.580404,-81.272662&amp;sspn=0.817301,0.929825&amp;ie=UTF8&amp;ll=37.905199,-122.76123&amp;spn=0,0&amp;z=10">data center</a> to support their new iCloud service. Both moves show the company is now committed more than ever to controlling the verticals. From the chips inside the devices to the data centers their customers&#8217; data ultimately resides on, Apple is committed to controlling the user experience, and the web has no place in that.</p>
<p>You might argue that this is because the web is &#8220;too open&#8221; and that threatens Apple&#8217;s platform. However, the continuing argument over openness, or lack thereof, isn&#8217;t really relevant. Despite Google&#8217;s protestations to the contrary, neither of these two companies is particularly open. The very document-based model they&#8217;re both advocating in their cloud architectures precludes a truly open system. It&#8217;s such an obvious straw man argument that it&#8217;s not actually that interesting.</p>
<p>What is interesting is that there was little or no mention of the web, or <a href="http://en.wikipedia.org/wiki/HTML5">HTML5</a>, during Apple&#8217;s WWDC keynote. I think you&#8217;ll see far less emphasis on HTML5 from Apple in the future, unless someone asks to do something with Apple&#8217;s platform the company disapproves of, and then the traditional answer of &#8220;Well, you can always do that in HTML5&#8221; will be rolled out again.</p>
<p>Apple has finally put their cards on the table. They have not yet bet the company on iCloud, but it&#8217;s telling how deep the integration into both iOS and OS X appears to be. They have far too much invested in iCloud for it to fail, if only in reputation. Whether the first incarnation lives up to its promises out of the box is still to be seen, but success isn&#8217;t out of the question. Despite <a href="http://www.me.com/">MobileMe</a>, Apple does know how to build large-scale reliable backend services. You only have to look at the App Store itself for an example.</p>
<p>So in the future don&#8217;t be too surprised to see Apple integrate iCloud even more tightly with both iOS and OS X. For the same strategic reasons, don&#8217;t be shocked to see more custom chips appear &mdash; I expect to see the arrival of ARM-based MacBooks and the transition away from Intel for Apple&#8217;s laptops. That&#8217;s because for Apple, it&#8217;s all about the platform.</p>
<p></p>
<p><strong>Related:</strong></p>
<ul>
<li> <a href="http://radar.oreilly.com/2011/06/apple-wwdc-2011-postpc.html">Four core takeaways from Apple&#8217;s WWDC keynote</a></li>
<li> <a href="http://radar.oreilly.com/2011/06/devwir-apple-wwdc-ios.html#icloud">iClouds on the horizon</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://radar.oreilly.com/2011/06/apple-icloud-control-google-amazon.html/feed</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
		<item>
		<title>The next, next big thing</title>
		<link>http://radar.oreilly.com/2011/05/next-big-thing-web-mobile-data-ubiquitious-computing.html</link>
		<comments>http://radar.oreilly.com/2011/05/next-big-thing-web-mobile-data-ubiquitious-computing.html#comments</comments>
		<pubDate>Thu, 19 May 2011 14:00:00 +0000</pubDate>
		<dc:creator>Alasdair Allan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[@home]]></category>
		<category><![CDATA[@top]]></category>
		<category><![CDATA[apps]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[development]]></category>
		<category><![CDATA[embedded devices]]></category>
		<category><![CDATA[mobile]]></category>
		<category><![CDATA[web]]></category>

		<guid isPermaLink="false">http://blogs.oreilly.com/radar/2011/05/next-big-thing-web-mobile-data-ubiquitious-computing.html</guid>
		<description><![CDATA[Those evangelizing the revolutionary qualities of &#34;the next big thing&#34; (whatever it may be) would do well to revisit past &#34;big things.&#34; Truth is, computing goes in cycles.  ]]></description>
				<content:encoded><![CDATA[<p>In my old age, at least for the computing industry, I&#8217;m getting more irritated by smart young things that preach today&#8217;s big thing, or tomorrow&#8217;s next big thing, as the best and only solution to my computing problems.</p>
<p>Those who fail to learn from history are doomed to repeat it, and the smart young things need to pay more attention, because the trends underlying today&#8217;s computing should be evident to anyone with a sufficiently good grasp of computing history.</p>
<p>Depending on the state of technology, the computer industry oscillates between thin- and thick-client architectures. Either the bulk of our compute power and storage is hidden away in racks of (sometimes distant) servers, or it is spread across a mass of distributed systems closer to home. This year&#8217;s reinvention of the mainframe is called cloud computing. While I&#8217;m a big supporter of cloud architectures, at least at the moment, I&#8217;ll be interested to see those preaching it as a last and final solution to all our problems proved wrong, yet again, when computing power catches up to demand once more and you can fit today&#8217;s data center inside a box not much bigger than a cell phone.</p>
<p>Thinking that just couldn&#8217;t happen? You should think again, because it already has. The iPad 2 <a href="http://www.tuaw.com/2011/05/09/ipad-2-would-have-bested-1990s-era-supercomputers/">beats most supercomputers from the early &#8217;90s</a> in raw compute power, and it would have been on the worldwide top 500 list of supercomputers well into 1994. There isn&#8217;t any reason to suspect that, at least for now, that sort of trend isn&#8217;t going to continue.</p>
</p>
<h2>Yesterday&#8217;s next big thing</h2>
</p>
<p>Yesterday&#8217;s &#8220;next big thing&#8221; was the World Wide Web. I still vividly remember standing in a draughty computing lab, almost 20 years ago now, looking over the shoulder of someone who had just downloaded the first public build of <a href="http://www.ncsa.illinois.edu/Projects/mosaic.html">NCSA Mosaic</a> via some tortuous method. I shook my head and said &#8220;It&#8217;ll never catch on, why would you want images?&#8221; That shows what I know. Although to be fair, I was a lot younger back then. I was failing to grasp history because I was neither well read enough, nor old enough, to have seen it all before. And since I still don&#8217;t claim to be either well read or old enough this time around, perhaps you should take everything I&#8217;m saying with a pinch of salt. That&#8217;s the thing with the next big thing: it&#8217;s always open to interpretation.</p>
</p>
<h2>The next big thing?</h2>
</p>
<p>The machines we grew up with are yesterday&#8217;s news. They&#8217;re quickly being replaced by consumption devices, with most of the rest of day-to-day computing moving into the environment and becoming embedded into people&#8217;s lives. This will almost certainly happen without people noticing.</p>
<p>While it&#8217;s pretty obvious that mobile is the current &#8220;next&#8221; big thing, it&#8217;s arguable whether mobile itself has already peaked. The sleek lines of the iPhone in your pocket are already almost as dated as the beige tower that used to sit next to the CRT on your desk.</p>
<p>Technology has not quite caught up to the overall vision and neither have we &mdash; we&#8217;ve been trying to reinvent the desktop computer in a smaller form factor. That&#8217;s why the mobile platforms we see today are just stepping stones.</p>
<p>Most people just want gadgets that work, and that do the things they want them to do. People never really wanted computers. They wanted what computers could do for them. The general purpose machines we think of today as &#8220;computers&#8221; will naturally dissipate out into the environment as our technology gets better. </p>
</p>
<h2>The next, next big thing</h2>
</p>
<p>To those preaching cloud computing and web applications as the next big thing: they&#8217;ve already had their day and the web as we know it is a dead man walking. Looking at the job board at <a href="http://strataconf.com">O&#8217;Reilly&#8217;s Strata conference</a> earlier in the year, the next big thing is obvious. It&#8217;s data. Heck, it&#8217;s not even the next big thing anymore. It&#8217;s pulling into the station, and to <a href="http://radar.oreilly.com/2011/05/data-science-terminology.html">data scientists</a>, the web and its architecture are just a commodity, bought and sold in bulk.</p>
<p class="image-box-580"><img src="http://s.radar.oreilly.com/wp-files/2/2011/05/5411925800_741ae7857e_b.jpg" width="580" border="0" alt="Strata job board" style="margin-bottom: 15px" /><br />The overflowing job board at February&#8217;s Strata conference.</p>
<p>As for the next, next big thing? Ubiquitous computing is the thing after the next big thing, and almost inevitably the thirst for more data will drive it. But then eventually, inevitably, the data will become secondary &mdash; a commodity. Yesterday&#8217;s hot job was developer; today, with the arrival of <a href="http://radar.oreilly.com/2010/06/what-is-data-science.html#data-scale">Big Data</a>, it has become mathematician. Tomorrow it could well be hardware hacker.</p>
<p>Count on it. History goes in cycles and only the names change.</p>
<p></p>
<p><strong>Related:</strong></p>
<ul>
<li> <a href="http://radar.oreilly.com/2011/05/arduino-open-hardware-movement.html">Arduino is a building block for the world to come.</a></li>
<li> <a href="http://radar.oreilly.com/2011/03/abandonment-of-technology.html">The abandonment of technology</a></li>
<li> <a href="http://radar.oreilly.com/2011/03/personal-area-network.html">The return of the Personal Area Network</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://radar.oreilly.com/2011/05/next-big-thing-web-mobile-data-ubiquitious-computing.html/feed</wfw:commentRss>
		<slash:comments>11</slash:comments>
		</item>
		<item>
		<title>The secret is to bang the rocks together</title>
		<link>http://radar.oreilly.com/2011/05/arduino-open-hardware-movement.html</link>
		<comments>http://radar.oreilly.com/2011/05/arduino-open-hardware-movement.html#comments</comments>
		<pubDate>Fri, 13 May 2011 13:00:00 +0000</pubDate>
		<dc:creator>Alasdair Allan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[@home]]></category>
		<category><![CDATA[@top]]></category>
		<category><![CDATA[arduino]]></category>
		<category><![CDATA[embedded systems]]></category>
		<category><![CDATA[makers]]></category>
		<category><![CDATA[ubiquitous computing]]></category>

		<guid isPermaLink="false">http://blogs.oreilly.com/radar/2011/05/arduino-open-hardware-movement.html</guid>
		<description><![CDATA[Every so often a piece of technology can become a lever that lets people move the world, just a little bit. The Arduino is one of those levers.  ]]></description>
				<content:encoded><![CDATA[<p><em>&#8220;We&#8217;ll be saying a big hello to all intelligent lifeforms everywhere and to everyone else out there, the secret is to bang the rocks together, guys.&#8221;</em> &mdash; &#8220;<a href="http://www.douglasadams.com/creations/0345391802.html">The Hitchhiker&#8217;s Guide to the Galaxy</a>,&#8221; Douglas Adams</p>
<hr />
<p>Every so often a piece of technology can become a lever that lets people move the world, just a little bit. The <a href="http://www.arduino.cc/">Arduino</a> is one of those levers. </p>
<p>It started off as a project to give artists access to embedded micro-processors for interaction design projects, but I think it&#8217;s going to end up in a museum as one of the building blocks of the modern world. It allows rapid, cheap, prototyping for embedded systems. It turns what used to be fairly tough hardware problems into simpler software problems.</p>
<div align="center">
<p class="image-box-480"><img src="http://s.radar.oreilly.com/assets_c/2011/03/ArduinoUnoFront-thumb-486x362.jpg" width="480" style="margin-bottom: 15px" alt="The Arduino UNO" /><br />The Arduino UNO.</p>
</div>
<p>The Arduino, and the open hardware movement that has grown up with it, and at least to a certain extent around it, is enabling a <a href="http://spectrum.ieee.org/consumer-electronics/gadgets/the-hobbyist-renaissance">generation of high-tech tinkerers</a> both to break the seals on proprietary technology, and to prototype new ideas with fairly minimal hardware knowledge. This maker renaissance has led to an interesting growth in innovation. People aren&#8217;t just having ideas, they&#8217;re doing something with them.</p>
</p>
<h2>Goodbye desktop</h2>
</p>
<p>The underlying trend is clear. The general purpose computer is a dead end. Most people just want gadgets that work, and that do the things they want them to do. They never really wanted computers. They wanted what computers could do for them.</p>
<p>While general purpose computers will live on, like the horse after the arrival of the automobile, these systems will be relegated to two small niches. Those of us who build the embedded systems people are using elsewhere will still have a need for general purpose computers, as will those who can&#8217;t resist tinkering. But that&#8217;s the extent of it. Nobody else will need them. Quite frankly, nobody else will want them.</p>
<p>The humble Arduino is the start of that. The board has multiple form factors, but a single programming interface. Sizes range from the &#8220;standard&#8221; palm of your hand for prototyping, down to the size of your thumb for the almost-professional almost-products now starting to come out of the maker renaissance. Arduino, and its relatives, will be part of everything from wearable versions like the <a href="http://www.arduino.cc/en/Main/ArduinoBoardLilyPad">Lilypad</a>, sized and customized to be stitched into clothing, to <a href="http://blog.makezine.com/archive/2011/05/why-google-choosing-arduino-matters-and-the-end-of-made-for-ipod-tm.html">mobile phone hardware accessories</a>, to <a href="http://www.arduino.cc/cgi-bin/yabb2/YaBB.pl?num=1236996885">specially built boards launched into space</a> on the new generation of nano-satellites built on a shoestring budget by hobbyists.</p>
<p>Every interesting hardware prototype to come along seems to boast that it is Arduino-compatible, or just plain built on top of an Arduino. It&#8217;s everywhere.</p>
</p>
<h2>Things are still open. They&#8217;re just different things.</h2>
</p>
<p>There has been a great deal of fear-mongering about the demise of the general purpose computer and the emergence of a new generation of consumption devices as more-or-less closed platforms. When the iPad made its debut, Cory Doctorow <a href="http://www.boingboing.net/2010/04/02/why-i-wont-buy-an-ipad-and-think-you-shouldnt-either.html">argued</a> that closed platforms send the wrong signal:</p>
<blockquote><p>Buying an iPad for your kids isn&#8217;t a means of jump-starting the realization that the world is yours to take apart and reassemble; it&#8217;s a way of telling your offspring that even changing the batteries is something you have to leave to the professionals.</p>
</blockquote>
<p>I&#8217;m philosophical about the passing of the computer. What we&#8217;re seeing here is a transition from one model of computing to another. We&#8217;ve seen that before, and there were similar outcries over the death of the mainframe as there have been over the death of the desktop. There is plenty of room for closed platforms, but the underlying trend is toward more openness, not less. It&#8217;s just that the things that are open and the things that are closed are changing. The skills needed to work with the technology are changing as well.</p>
<p>What the Arduino and the open hardware movement have done is make hard things easy, and impossible things merely hard. Before now, getting to the prototype stage for a hardware project was hard, at least for most people, and going beyond a crude prototype was impossible for many. Now it&#8217;s the next big thing.</p>
<p></p>
<p><strong>Related:</strong></p>
<ul>
<li> <a href="http://blog.makezine.com/archive/2011/05/why-google-choosing-arduino-matters-and-the-end-of-made-for-ipod-tm.html">Why Google Choosing Arduino Matters</a></li>
<li> <a href="http://answers.oreilly.com/topic/1624-parallel-programming-arduino-and-the-good-kind-of-trouble/">Parallel programming, Arduino and the good kind of trouble</a></li>
<li> <a href="http://answers.oreilly.com/topic/1866-what-you-can-do-with-processing-and-arduino/">What you can do with Processing and Arduino</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://radar.oreilly.com/2011/05/arduino-open-hardware-movement.html/feed</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
	</channel>
</rss>
