<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>O&#039;Reilly Radar &#187; Julie Steele</title>
	<atom:link href="http://radar.oreilly.com/julies/feed" rel="self" type="application/rss+xml" />
	<link>http://radar.oreilly.com</link>
	<description>Insight, analysis, and research about emerging technologies</description>
	<lastBuildDate>Mon, 20 May 2013 11:00:26 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
		<item>
		<title>Eyebeam Update: Two months after Sandy</title>
		<link>http://radar.oreilly.com/2013/01/eyebeam-update-two-months-after-sandy.html</link>
		<comments>http://radar.oreilly.com/2013/01/eyebeam-update-two-months-after-sandy.html#comments</comments>
		<pubDate>Fri, 18 Jan 2013 15:00:33 +0000</pubDate>
		<dc:creator>Julie Steele</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[@home]]></category>
		<category><![CDATA[art and technology]]></category>
		<category><![CDATA[donation]]></category>
		<category><![CDATA[Eyebeam]]></category>
		<category><![CDATA[Hurricane Sandy]]></category>
		<category><![CDATA[new media]]></category>

		<guid isPermaLink="false">http://radar.oreilly.com/?p=55232</guid>
		<description><![CDATA[A couple of months ago, I wrote about the new media and design incubator in NYC, Eyebeam, and the damage they&#8217;d suffered in Hurricane Sandy. This week I caught up with Eyebeam executive director Pat Jones to find out what &#8230; ]]></description>
				<content:encoded><![CDATA[<p>A couple of months ago, <a title="Eyebeam Radar piece" href="http://radar.oreilly.com/2012/11/eyebeam-recovery-hurricane-sandy.html">I wrote about</a> the new media and design incubator in NYC, <a title="About Eyebeam" href="http://eyebeam.org/about">Eyebeam</a>, and the damage they&#8217;d suffered in Hurricane Sandy. This week I caught up with Eyebeam executive director Pat Jones to find out what kind of progress has been made in the cleanup.</p>
<p>The three feet of polluted water that flooded Eyebeam&#8217;s work and exhibit space on the West Side of Manhattan during the storm had damaged lots of equipment and soaked much of their archive material. Cleaning whatever could be saved and making a priority list for replacing what was lost were the two main challenges of recovery.</p>
<p>Thanks to generous contributions from philanthropic foundations and private companies — such as <a title="Jerome Foundation homepage" href="http://www.jeromefdn.org/">the Jerome Foundation</a>, the <a title="Warhol Foundation homepage" href="http://www.warholfoundation.org/">Andy Warhol Foundation for the Visual Arts</a>, the <a title="Mellon Foundation homepage" href="http://www.mellon.org/">Andrew W. Mellon Foundation</a>, the <a title="ADAA homepage" href="http://www.artdealers.org/">Art Dealers Association of America</a>, <a title="Time-Warner homepage" href="http://www.timewarner.com/">Time-Warner</a>, and <a title="O'Reilly Media homepage" href="http://oreilly.com/">O’Reilly Media</a> — as well as donations from individuals, about half the equipment losses have been covered.</p>
<p>There are still some other funding proposals under consideration, but essentially the equipment recovery process has been one of triage: Eyebeam has tried to replace equipment needed immediately by their artists in residence, as well as some practical pieces like the two scissor lifts required to access lights and other equipment at the top of their two-story exhibit space.<span id="more-55232"></span></p>
<p>After replacing some of the equipment, such as computers and monitors, that artists needed right away to do their work, the next priority was to replace the tools needed to put on exhibits for the public — for instance, projectors. Of their former fleet of 10-12 projectors, they have so far replaced four, though they’ll need to replace a few more soon. Bigger pieces will have to wait, though <a title="Makerbot homepage" href="http://www.makerbot.com/">MakerBot</a> has loaned the non-profit a Replicator.</p>
<p>Although the equipment restoration is happening bit by bit, working with half of what they once had has slowed down the rebuilding of the space; it will probably take until the end of the year or so before the organization is fully back up to speed. <a title="Eyebeam donation page" href="https://salsa.democracyinaction.org/o/528/donate_page/donate-2012">Donations</a> are certainly still needed and very much welcomed.</p>
<p>The other challenge Eyebeam is facing is the restoration of their archives. Thanks to the volunteers who came in right after the hurricane, cleaning began immediately, which helped enormously. Specialists washed salt, oil, and other potentially damaging residue off tapes, drives, and discs, and whatever was salvageable has been stabilized. What remains is to carefully transfer the information from these older formats to a longer-lasting digital format.</p>
<p>Some of the submerged work was on media that was fragile even in the best of circumstances, so those pieces are the highest priority for transfer. With some of the material, like VHS and mini-DV tapes, they won’t know how much damage there has been until transfer is attempted. So the transfer needs to be done professionally and delicately, and that’s a very expensive process.</p>
<p>To help with that process, Eyebeam has applied to the Institute of Museum and Library Services (a government agency) for a grant that would provide monetary support for a strategic plan to move forward. They&#8217;ve put together an advisory committee that includes several high-profile experts from institutions such as <a title="MoMA Conservation" href="http://www.moma.org/explore/collection/conservation/index#about">MoMA</a>, <a title="Rhizome About" href="http://rhizome.org/about/#mission">Rhizome</a>, and <a title="Cornell Media Library" href="http://lrc.cornell.edu/medialib">Cornell’s Media Library</a>. If the funding is granted, the project will start in October.</p>
<p>Archival media conservation is, of course, a challenge faced by many organizations all over the world. Says Jones, &#8220;Hopefully, not only will this plan help us, but it will also be something we can pass on to other institutions.&#8221;</p>
<p>While no one would say they were glad for Hurricane Sandy, it has clearly forced some important questions about priorities and practices. Eyebeam already has a legacy as a place where advances have been made over the last 15 years. Ensuring that it will be a place where people can access what has been done in digital media during this important period, and into the future, will benefit all of us.</p>
<p><strong>Related:</strong></p>
<ul>
<li><a href="http://radar.oreilly.com/2012/11/eyebeam-recovery-hurricane-sandy.html">After the storm: Putting Eyebeam back together</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://radar.oreilly.com/2013/01/eyebeam-update-two-months-after-sandy.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>After the storm: Putting Eyebeam back together</title>
		<link>http://radar.oreilly.com/2012/11/eyebeam-recovery-hurricane-sandy.html</link>
		<comments>http://radar.oreilly.com/2012/11/eyebeam-recovery-hurricane-sandy.html#comments</comments>
		<pubDate>Wed, 28 Nov 2012 17:14:50 +0000</pubDate>
		<dc:creator>Julie Steele</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[@home]]></category>
		<category><![CDATA[art and technology]]></category>
		<category><![CDATA[donation]]></category>
		<category><![CDATA[Eyebeam]]></category>
		<category><![CDATA[Hurricane Sandy]]></category>
		<category><![CDATA[new media]]></category>

		<guid isPermaLink="false">http://radar.oreilly.com/?p=54325</guid>
		<description><![CDATA[Thanksgiving has come and gone and many of us are busy preparing for the winter holidays. For most of us, Hurricane Sandy is about to become a footnote to a crazy series of news cycles around the 2012 presidential election. &#8230; ]]></description>
				<content:encoded><![CDATA[<p>Thanksgiving has come and gone and many of us are busy preparing for the winter holidays. For most of us, <a title="Hurricane Sandy" href="http://en.wikipedia.org/wiki/Hurricane_Sandy">Hurricane Sandy</a> is about to become a footnote to a crazy series of news cycles around the 2012 presidential election. But for many individuals and institutions, the cleanup has barely begun.</p>
<p>One of these institutions is the <a title="About Eyebeam" href="http://eyebeam.org/about">Eyebeam Art + Technology Center</a> in New York, a not-for-profit incubator of new media and design. Each year, Eyebeam hosts two groups of residents for five months each, in addition to several fellows for the full year. Almost 250 artists, designers, and technologists have spent time there since Eyebeam first opened in 1997, many of whom we at O’Reilly have known and admired.</p>
<p>Hurricane Sandy brought more than three feet of water, chemicals, and outside debris sweeping into the streets and buildings on the west side of Manhattan. At Eyebeam, this spelled disaster for much of the equipment and archives on their ground floor. The disaster was compounded by the fact that none of this material was covered by flood insurance.</p>
<p>The main space is currently filled with dried-out computers, projectors, mixers and other audio equipment, and two scissor-lifts used for accessing the upper reaches of their 18-foot-high space. Volunteers are looking over and hand-inspecting it all to figure out what can be salvaged; ultimately, they’ll have to find ways to repair, replace, or do without each item. </p>
<p>As for the archives, volunteers have sorted and washed each piece, and begun the task of cataloging what can be preserved. Alumni will be contacted to see if they can provide copies of their work, but most of the material is now very fragile and will need to be digitized and transferred to more stable formats, a time-consuming and expensive process that is expected to take a year or more.</p>
<p>But with all of this comes the opportunity for Eyebeam to reconsider their goals and how they can make their archives available to a wider audience than before.<span id="more-54325"></span></p>
<p><iframe src="http://player.vimeo.com/video/53849333" width="620" height="349" frameborder="0" webkitAllowFullScreen mozallowfullscreen allowFullScreen></iframe></p>
<p>Events at Eyebeam will resume this week with an <a title="ArtsTech Meetup" href="http://www.meetup.com/Arts-Culture-and-Technology/events/91525302/">#ArtsTech Meetup on 3D printing</a>; larger public events will resume in December. They’re also planning a special series of events to present material from the archives during the week of January 7. Stay tuned to their <a title="Eyebeam events" href="http://eyebeam.org/events">events page</a> for forthcoming details on that series.</p>
<p>In the meantime, please consider <a title="Eyebeam donation page" href="https://salsa.democracyinaction.org/o/528/donate_page/donate-2012">making a donation</a>. Some of you may have enjoyed Eyebeam’s space when they hosted our <a title="Data After Dark" href="http://eyebeam.org/events/explore-data-after-dark">Data After Dark</a> event celebrating data visualization as part of the <a title="Strata NY 2011" href="http://strataconf.com/stratany2011">2011 Strata conference in New York</a>. We at O’Reilly appreciate and support all they add to the tech community, and hope you will too.</p>
]]></content:encoded>
			<wfw:commentRss>http://radar.oreilly.com/2012/11/eyebeam-recovery-hurricane-sandy.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Printing ourselves</title>
		<link>http://radar.oreilly.com/2012/11/3d-printing-healthcare-magic.html</link>
		<comments>http://radar.oreilly.com/2012/11/3d-printing-healthcare-magic.html#comments</comments>
		<pubDate>Tue, 27 Nov 2012 15:15:35 +0000</pubDate>
		<dc:creator>Julie Steele</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[3D]]></category>
		<category><![CDATA[3d printers]]></category>
		<category><![CDATA[3d printing]]></category>
		<category><![CDATA[@home]]></category>
		<category><![CDATA[health care]]></category>
		<category><![CDATA[magic]]></category>
		<category><![CDATA[Strata Rx]]></category>
		<category><![CDATA[technology]]></category>

		<guid isPermaLink="false">http://radar.oreilly.com/?p=54304</guid>
		<description><![CDATA[Tim O&#8217;Reilly recently asked me and some other colleagues which technology seems most like magic to us. There was a thoughtful pause as we each considered the amazing innovations we read about and interact with every day. I didn&#8217;t have &#8230; ]]></description>
				<content:encoded><![CDATA[<p><a title="Tim O'Reilly" href="http://radar.oreilly.com/tim">Tim O&#8217;Reilly</a> recently asked me and some other colleagues which technology seems most like magic to us. There was a thoughtful pause as we each considered the amazing innovations we read about and interact with every day.</p>
<p>I didn&#8217;t have to think for long. To me, the thing that seems most like magic isn&#8217;t <a title="Siri" href="http://www.apple.com/ios/siri/">Siri</a> or <a title="Google driverless car" href="http://en.wikipedia.org/wiki/Google_driverless_car">self-driving cars</a> or <a title="Google Project Glass" href="https://plus.google.com/+projectglass/posts">augmented reality displays</a>. It&#8217;s 3D printing.</p>
<p>My reasons are different than you might think. Yes, it&#8217;s amazing that, with very little skill, we can manufacture complex objects in our homes and workshops that are made from things like <a title="Thermoplastics" href="http://www.stratasys.com/Resources/White-Papers/Thermoplastics-the-Best-Choice-for-3D-Printing.aspx">plastic</a> or <a title="Laywoo-D3" href="http://www.youtube.com/watch?feature=player_embedded&amp;v=8pZyrb_FA8U">wood</a> or <a title="Chocolate" href="http://www.youtube.com/watch?v=BIFi8but3Vw">chocolate</a> or even <a title="Titanium" href="http://www.youtube.com/watch?v=E7--ZWPVVdQ">titanium</a>. This seems an amazing act of conjuring that, just a short time ago, would have been difficult to imagine outside of the &#8220;Star Trek&#8221; set.</p>
<p>But the thing that makes 3D printing really special is the magic it allows us to perform: the technology is capable of making us more human.<span id="more-54304"></span></p>
<p>I recently had the opportunity to lay out this idea in an Ignite talk at <a title="Strata Rx" href="http://strataconf.com/rx2012">Strata Rx</a>, a new conference on data science and health care that I chaired with Colin Hill. Here&#8217;s the talk I gave there (don&#8217;t worry: like all <a title="Ignite" href="http://en.wikipedia.org/wiki/Ignite_%28event%29">Ignite</a> talks, it&#8217;s only five minutes long).</p>
<p><iframe width="620" height="349" src="http://www.youtube.com/embed/hLTFSTNY5lU?feature=oembed" frameborder="0" allowfullscreen></iframe></p>
<p>In addition to the applications mentioned in my talk, there are even more amazing accomplishments just over the horizon. <a title="Anthony Atala" href="http://www.wakehealth.edu/Faculty/Atala-Anthony-J.htm?LangType=1033">Doctor Anthony Atala</a>, of the Wake Forest University School of Medicine, recently <a title="Anthony Atala TED" href="http://www.ted.com/talks/anthony_atala_printing_a_human_kidney.html">printed a human kidney onstage at TED</a>.</p>
<p>This was not actually a working kidney &mdash; one of the challenges to creating working organs is building blood vessels that can provide cells on the inside of the organ structure with nutrients; right now, the cells inside larger structures tend to die rapidly. But researchers at MIT and the University of Pennsylvania are experimenting with <a title="Sugar vessels" href="http://www.upenn.edu/pennnews/news/penn-researchers-improve-living-tissues-3d-printed-vascular-networks-made-sugar">printing these vessel networks in sugar</a>. Cells can be grown around the networks, and then the sugar can be dissolved, leaving a void through which blood could flow. As printer resolution improves, these networks can become finer.</p>
<p>And 3D printing becomes even more powerful when combined with other technologies. For example, researchers at the Wake Forest Institute of Regenerative Medicine are using a <a title="3D printed cartilage" href="http://www.wired.com/design/2012/11/3-d-printed-cartilage/">hybrid 3D printing/electrospinning technique</a> to print replacement cartilage.</p>
<p>As practiced by <a title="Bespoke Innovations" href="http://www.bespokeinnovations.com/">Bespoke Innovations</a>, the <a title="WREX" href="http://www.nemours.org/content/nemours/www/research/labcenter/orthopedics/perl/wrex.html">WREX</a> team, and <a title="Titanium jaw" href="http://www.wired.co.uk/news/archive/2012-02/06/3d-printed-jaw">others</a>, 3D printing requires a very advanced and carefully honed skillset; it is not yet within reach of the average DIYer. But what is so amazing &mdash; what makes it magic &mdash; is that when used in these ways at such a level, the technology disappears. You don&#8217;t really see it, not unless you&#8217;re looking. What you see is the person it benefits.</p>
<p>Technology that augments us, that makes us more than we are even at our best (such as self-driving cars or sophisticated digital assistants) is a neat party trick, and an homage to our superheroes. But those that are superhuman are not like us; they are Other. Every story, from Superman to the X-Men to the Watchmen, includes an element of struggle with what it means to be more than human. In short, it means outsider status.</p>
<p>We are never more acutely aware of our own humanity, and all the frailty that entails, than when we are sick or injured. When we can use technology such as 3D printing to make us more whole, then it makes us more human, not Other. It restores our insider status.</p>
<p>Ask anyone who has lost something truly precious and then found it again. I&#8217;m talking on the level of an arm, a leg, a kidney, a jaw. If that doesn&#8217;t seem like magic, then I don&#8217;t know what does.</p>
]]></content:encoded>
			<wfw:commentRss>http://radar.oreilly.com/2012/11/3d-printing-healthcare-magic.html/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>The miracle of a thumbnail image from Mars</title>
		<link>http://radar.oreilly.com/2012/08/the-miracle-of-a-thumbnail-image-from-mars.html</link>
		<comments>http://radar.oreilly.com/2012/08/the-miracle-of-a-thumbnail-image-from-mars.html#comments</comments>
		<pubDate>Mon, 06 Aug 2012 19:03:30 +0000</pubDate>
		<dc:creator>Julie Steele</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[@home]]></category>
		<category><![CDATA[Curiosity]]></category>
		<category><![CDATA[Curiosity rover]]></category>
		<category><![CDATA[digital]]></category>
		<category><![CDATA[image]]></category>
		<category><![CDATA[Mars]]></category>
		<category><![CDATA[Mars landing]]></category>
		<category><![CDATA[nasa]]></category>

		<guid isPermaLink="false">http://radar.oreilly.com/?p=50406</guid>
		<description><![CDATA[Last night, I stayed up late to watch the NASA livestream of the Curiosity rover landing. It seems to have been an unmitigated success: each step of the entry and landing process, even that crazy sky-crane maneuver, was performed flawlessly. &#8230; ]]></description>
				<content:encoded><![CDATA[<p>Last night, I stayed up late to watch the <a href="http://www.nasa.gov/mission_pages/mars/main/index.html">NASA livestream of the Curiosity rover landing</a>. It seems to have been an unmitigated success: each step of the entry and landing process, even that crazy sky-crane maneuver, was performed flawlessly.</p>
<p>As <a href="http://twitter.com/travisbeacham">Travis Beacham</a> put it on Twitter:</p>
<blockquote class="twitter-tweet tw-align-center"><p>A jet-fired hover crane just lowered a nuclear robot bigger than my car onto Mars. Then it emailed us pics, from the other side of the sun.</p>
<p>&mdash; Travis Beacham (@travisbeacham) <a href="https://twitter.com/travisbeacham/status/232355547071016960" data-datetime="2012-08-06T06:00:54+00:00">August 6, 2012</a></p></blockquote>
<p><script src="//platform.twitter.com/widgets.js" charset="utf-8"></script></p>
<p>Although there were <a href="http://www.nasa.gov/images/content/673507main_cheering-full_full.jpg">tearful hugs and high-fives</a> and all manner of cheering when &#8220;Touchdown!&#8221; was called, the wonderment built to a real climax when the <a href="http://www.nasa.gov/images/content/673528main_PIA15971_full.jpg">first thumbnail image came through</a>. It was small, in black and white, and showed the Martian horizon in the background, with the wheel of the rover in the foreground.</p>
<p>Shortly thereafter, a slightly larger version was displayed: still black and white, but with enough resolution to show dust on the glass. A second one followed a few minutes later, showing the <a href="http://www.nasa.gov/images/content/673517main_PIA15970-43_full.jpg">rover&#8217;s shadow</a> on the ground. Cue the &#8220;pics or it didn&#8217;t happen&#8221; jokes, as well as the rapid proliferation of Photoshopped <a href="https://twitter.com/Oatmeal/status/232352317603713024">spoofs</a>.</p>
<p align="center"><a href="http://www.nasa.gov/mission_pages/msl/multimedia/gallery-indexEvents.html"><img src="http://s.radar.oreilly.com/wp-files/2/2012/08/0812-curiosity-first-image.jpg" border="0" alt="Image from the Curiosity rover on Mars" width="265" style="margin-bottom: 15px;" /></a><br /><em>One of the <a href="http://www.nasa.gov/mission_pages/msl/multimedia/gallery-indexEvents.html">first images</a> from the Curiosity rover.</em></p>
<hr />
<p>In our micro-culture of the moment, obsessed with photo sharing and images, this tiny thumbnail still seemed like a miracle (albeit a required one). A picture really is worth a whole lot of words. But have you ever stopped to think about what it takes to plan for that from Mars?</p>
<p><span id="more-50406"></span></p>
<p>We take for granted being able to snap a great-looking picture and send it wirelessly to almost anywhere we want with the tap of a few icons, but transmitting images back from another planet is a complicated process.</p>
<p>I couldn&#8217;t help but think about the images that came back from the <a href="http://en.wikipedia.org/wiki/Phoenix_%28spacecraft%29">Phoenix lander</a> in 2008, and the excellent chapter <a href="http://www.oreillynet.com/pub/au/4547">J.M. Hughes</a>, principal software engineer for the imaging software on Phoenix, wrote in <a href="http://shop.oreilly.com/product/9780596157128.do?intcmp=il-strata-books-beautiful-data-curiosity-rover"><em>Beautiful Data</em></a>:</p>
<blockquote><p>The challenge was to devise a way to download the image data from each of the cameras, store the data in a pre-allocated memory location, process the data to remove known pixel defects, crop and/or scale the images, perform any commanded compression, and then slice-and-dice it all up into packets for hand-off to the main computer&#8217;s downlink manager task for transmission back to Earth.</p></blockquote>
<p>And all of this must be done carefully, sparingly, in order to conserve resources. As Hughes put it, &#8220;A spacecraft is an exercise in applied minimalism: just enough to do the job and no more.&#8221;</p>
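<p>The sequence Hughes describes (remove known pixel defects, crop, compress, and slice into packets) can be sketched in a few lines of Python. This is a minimal illustration only, not the Phoenix flight software: every function name here is hypothetical, the image is assumed to be a list of rows of 8-bit grayscale values, neighbor averaging is an assumed defect fix, and <code>zlib</code> stands in for whatever compression was actually commanded.</p>

```python
import zlib


def remove_defects(image, bad_pixels):
    """Return a copy of image with each known-bad pixel replaced by the
    average of its left and right neighbors (an assumed, simple repair)."""
    fixed = [row[:] for row in image]
    for r, c in bad_pixels:
        width = len(image[r])
        # Keep only neighbors that fall inside the row.
        neighbors = [image[r][k] for k in (c - 1, c + 1) if k in range(width)]
        if neighbors:
            fixed[r][c] = sum(neighbors) // len(neighbors)
    return fixed


def crop(image, top, left, height, width):
    """Keep only the requested window of the image."""
    return [row[left:left + width] for row in image[top:top + height]]


def packetize(payload, size):
    """Slice a byte string into packets of at most `size` bytes."""
    return [payload[i:i + size] for i in range(0, len(payload), size)]


def prepare_for_downlink(image, bad_pixels, window, packet_size=64):
    """Run the whole chain: clean, crop, compress, then packetize."""
    top, left, height, width = window
    cropped = crop(remove_defects(image, bad_pixels), top, left, height, width)
    raw = bytes(value for row in cropped for value in row)
    return packetize(zlib.compress(raw), packet_size)
```

<p>Calling <code>prepare_for_downlink(image, bad_pixels, (top, left, height, width))</code> returns a list of byte packets; joining them and decompressing recovers the cleaned, cropped pixels. A real spacecraft would do this within fixed, pre-allocated memory, which this sketch does not attempt to model.</p>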
<p>In honor of the Curiosity&#8217;s inspiring success, we&#8217;re making Hughes&#8217; chapter available <a href="http://cdn.oreilly.com/radar/2012/08/Beautiful_Data_Chapter3.pdf">here</a>. Reading about some of the design trade-offs required in building and successfully deploying the imaging software on a Mars spacecraft makes Curiosity&#8217;s achievement all the more amazing.</p>
<p>Images from the Curiosity rover can be found <a href="http://mars.jpl.nasa.gov/msl/multimedia/raw/">here</a>. </p>
<p><em><a href="http://www.nasa.gov/images/content/673528main_PIA15971_full.jpg">Curiosity image: NASA/JPL-Caltech</a></em></p>
<div style="float: left;border-top: thin gray solid;border-bottom: thin gray solid;padding: 20px;margin: 20px 2px;clear: both"><p><a href="http://shop.oreilly.com/product/9780596157128.do?intcmp=il-strata-books-beautiful-data-curiosity-rover"><img style="float: left;border: none;padding-right: 10px" src="http://cdn.oreilly.com/radar/images/posts/0812-beautiful-data-cover.png" width="148" /></a><a href="http://shop.oreilly.com/product/9780596157128.do?intcmp=il-strata-books-beautiful-data-curiosity-rover"><strong>Beautiful Data</strong></a> &mdash;  Learn from the best data practitioners in the field just how wide-ranging &mdash; and beautiful &mdash; working with data can be. Join 39 contributors as they explain how they developed simple and elegant solutions on projects ranging from a Mars lander to a Radiohead video.</p>
<p> <a href="http://shop.oreilly.com/product/9780596157128.do?intcmp=il-strata-books-beautiful-data-curiosity-rover"><strong>Save 50% on the ebook edition with the code PBDMARS</strong></a>.</p>
</div>
]]></content:encoded>
			<wfw:commentRss>http://radar.oreilly.com/2012/08/the-miracle-of-a-thumbnail-image-from-mars.html/feed</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Why data visualization matters</title>
		<link>http://radar.oreilly.com/2012/02/why-data-visualization-matters.html</link>
		<comments>http://radar.oreilly.com/2012/02/why-data-visualization-matters.html#comments</comments>
		<pubDate>Wed, 15 Feb 2012 17:00:00 +0000</pubDate>
		<dc:creator>Julie Steele</dc:creator>
				<category><![CDATA[Data]]></category>
		<category><![CDATA[@editpick]]></category>
		<category><![CDATA[@home]]></category>
		<category><![CDATA[@top]]></category>
		<category><![CDATA[business intelligence]]></category>
		<category><![CDATA[data decisions]]></category>
		<category><![CDATA[data predictions]]></category>
		<category><![CDATA[data product]]></category>
		<category><![CDATA[data tools]]></category>
		<category><![CDATA[Planning for Big Data]]></category>
		<category><![CDATA[visualization]]></category>

		<guid isPermaLink="false">http://blogs.oreilly.com/radar/2012/02/why-data-visualization-matters.html</guid>
		<description><![CDATA[Effective data visualizations go beyond aesthetics; they also allow organizations to make quick and correct decisions from massive amounts of information.  ]]></description>
				<content:encoded><![CDATA[<p><img src="http://s.radar.oreilly.com/wp-files/2/2012/02/0212-viz.png" border="0" alt="Visualization example" style="float: right;margin: 3px 0 10px 10px" width="300" />Let&#8217;s say you need to understand thousands or even millions of rows of data, and you have a short time to do it in.  The data may come from your team, in which case perhaps you&#8217;re already familiar with what it&#8217;s measuring and what the results are likely to be.  Or it may come from another team, or maybe several teams at once, and be completely unfamiliar.  Either way, the reason you&#8217;re looking at it is that you have a decision to make, and you want to be informed by the data before making it.  Something probably hangs in the balance: a customer, a product, or a profit.</p>
<p>How are you going to make sense of all that information efficiently so you can make a good decision? Data visualization is an important answer to that question.</p>
<p>However, not all visualizations are actually that helpful. You may be all too familiar with lifeless bar graphs, or line graphs made with software defaults and couched in a slideshow presentation or lengthy document.  They can be at best confusing, and at worst misleading.  But the good ones are an absolute revelation.</p>
<p>The best data visualizations are ones that <em>expose something new</em> about the underlying patterns and relationships contained within the data.  Understanding those relationships &mdash; and being able to observe them &mdash; is key to good decision making.  The <a href="http://en.wikipedia.org/wiki/File:Periodic_table.svg">Periodic Table</a> is a classic testament to the potential of visualization to reveal hidden relationships in even small datasets. One look at the table, and chemists and middle school students alike grasp the way atoms arrange themselves in groups: alkali metals, noble gases, halogens.</p>
<p>If visualization done right can reveal so much in even a small dataset like this, imagine what it can reveal within terabytes or petabytes of information.</p>
<h2 id="types">Types of visualization</h2>
<p>It&#8217;s important to point out that not all data visualization is created equal.  Just as we have paints and pencils and chalk and film to help us capture the world in different ways, with different emphases and for different purposes, there are multiple ways in which to depict the same dataset.</p>
<p>Or, to put it another way, think of visualization as a new set of languages you can use to communicate. Just as French and Russian and Japanese are all ways of encoding ideas so that those ideas can be transported from one person&#8217;s mind to another, and decoded again &mdash; and just as certain languages are more conducive to certain ideas &mdash; so the various kinds of data visualization are a kind of <em>bidirectional encoding</em> that lets ideas and information be transported from the database into your brain.</p>
<h2 id="explaining-exploring">Explaining and exploring</h2>
<p>An important distinction lies between visualization for <em>exploring</em> and visualization for <em>explaining</em>. A third category, <em>visual art</em>, comprises images that encode data but cannot easily be decoded back to the original meaning by a viewer. This kind of visualization can be beautiful, but it is not helpful in making decisions.</p>
<p>Visualization for exploring can be imprecise. It&#8217;s useful when you&#8217;re not exactly sure what the data has to tell you and you&#8217;re trying to get a sense of the relationships and patterns contained within it for the first time.  It may take a while to figure out how to approach or clean the data, and which dimensions to include. Therefore, visualization for exploring is best done in such a way that it can be iterated quickly and experimented upon, so that you can find the signal within the noise.  Software and automation are your friends here.</p>
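<p>As one rough illustration of that kind of quick iteration, here is a small Python sketch that redraws a text histogram at several bin widths so you can see which one makes the pattern visible. The function name and the sample numbers are made up for this example; a plotting library would serve the same purpose with real data.</p>

```python
from collections import Counter


def text_histogram(values, bin_width, bar_char="#"):
    """Return one line per non-empty bin: the bin's lower edge, then a bar
    whose length is the number of values that fell into that bin."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return [f"{edge:>6} | {bar_char * counts[edge]}" for edge in sorted(counts)]


# Made-up sample data, purely for illustration.
response_times = [12, 15, 17, 21, 22, 22, 23, 38, 41, 95]

# Iterate quickly: try coarse bins first, then progressively finer ones.
for width in (50, 10, 5):
    print(f"bin width {width}")
    print("\n".join(text_histogram(response_times, width)))
```

<p>The point is the loop, not the chart: because each redraw is cheap, you can try several views of the same data before deciding which one is worth cleaning up.</p>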
<p>Visualization for explaining is best when it is cleanest. Here, the ability to pare down the information to its simplest form &mdash; to strip away the noise entirely &mdash; will increase the efficiency with which a decision maker can understand it.  This is the approach to take once you understand what the data is telling you, and you want to communicate that to someone else. This is the kind of visualization you <em>should</em> be finding in those presentations and sales reports.</p>
<p>Visualization for explaining also includes infographics and other categories of hand-drawn or custom-made images. Automated tools can be used, but one size does not fit all.</p>
<div style="float: left;border-top: thin gray solid;border-bottom: thin gray solid;padding: 20px;margin: 20px 2px;clear: both"><a href="http://www.microsoft.com/sql"><img style="float: left;border: none;padding-right: 10px" src="http://s.radar.oreilly.com/wp-files/2/2011/12/sponsor-ms-sql-server.png" /></a><a href="http://www.microsoft.com/sql"><strong>Microsoft SQL Server</strong></a> is a comprehensive information platform offering enterprise-ready technologies and tools that help businesses derive maximum value from information at the lowest TCO. SQL Server 2012 launches next year, offering a cloud-ready information platform delivering mission-critical confidence, breakthrough insight, and cloud on your terms; find out more at <a href="http://www.microsoft.com/sql">www.microsoft.com/sql</a>.</div>
<h2 id="customers">Your customers make decisions, too</h2>
<p>While data visualization is a powerful tool for helping you and others within your organization make better decisions, it&#8217;s important to remember that, in the meantime, your customers are trying to decide between you and your competitors. Many kinds of data visualization, from complex interactive or animated graphs to brightly colored infographics, can help your customers explore and your customer service folks explain.</p>
<p>That&#8217;s why all kinds of companies and organizations, from <a href="http://visualization.geblogs.com/">GE</a> to <a href="http://insights.truliablog.com/">Trulia</a> to <a href="http://svs.gsfc.nasa.gov/">NASA</a>, are beginning to invest significant resources in providing interactive visualizations to their customers and the public. This allows viewers to better understand the company&#8217;s business, and interact in a self-directed manner with the company&#8217;s expertise.</p>
<p>As <a href="http://radar.oreilly.com/2012/01/what-is-big-data.html">big data</a> becomes bigger, and more companies deal with complex datasets with dozens of variables, data visualization will become even more important. So far, the tide of popularity has risen more quickly than the tide of visual literacy, and mediocre efforts abound, in presentations and on the web.</p>
<p>But as visual literacy rises, thanks in no small part to impressive efforts in major media such as <a href="http://www.nytimes.com/pages/multimedia/index.html">The New York Times</a> and <a href="http://www.guardian.co.uk/data">The Guardian</a>, data visualization will increasingly become a language your customers and collaborators expect you to speak &mdash; and speak well.</p>
<h2 id="designer">Do yourself a favor and hire a designer</h2>
<p>It&#8217;s well worth investing in a talented in-house designer, or a team of designers. Visualization for explaining works best when someone who understands not only the data itself, but also the principles of design and visual communication, tailors the graph or chart to the message.</p>
<p class="image-box-580"><img src="http://s.radar.oreilly.com/2012/02/08/0212-translation-example.png" border="0" alt="Translation example" width="580" style="margin-bottom: 15px" /><br />Whether it&#8217;s text or visuals, important translations require more than basic tools.</p>
<p>To go back to the language analogy: <a href="http://translate.google.com/">Google Translate</a> is a powerful and useful tool for giving you the general idea of what a foreign text says. But it&#8217;s not perfect, and it often lacks nuance.  For getting the overall gist of things, it&#8217;s great.  But I wouldn&#8217;t use it to send a letter to a foreign ambassador.  For something so sensitive, and where precision counts, it&#8217;s worth hiring an experienced human translator.</p>
<p>Data visualization is like a foreign language in the same way: hire an experienced designer for important jobs where precision matters. If you&#8217;re making the kinds of decisions in which your customer, product, or profit hangs in the balance, you can&#8217;t afford to base those decisions on incomplete or misleading representations of the knowledge your company holds.</p>
<p>Your designer is your translator, and one of the most important links you and your customers have to your data.</p>
<p><strong>Related:</strong></p>
<ul>
<li> <a href="http://shop.oreilly.com/product/0636920000617.do?cmp=il-radar-books-why-visualization-matters">Beautiful Visualization: Looking at Data through the Eyes of Experts (book)</a></li>
<li> <a href="http://radar.oreilly.com/2011/05/visualization-intent.html">When judging visualizations, intent matters</a></li>
<li><a href="http://radar.oreilly.com/2010/07/redesigning-the-new-york-city.html">Redesigning the New York City subway map</a></li>
<li> <a href="http://radar.oreilly.com/tag/visualization-deconstructed">Visualization Deconstructed series</a></li>
<li> <a href="http://radar.oreilly.com/tag/visualization-of-the-week">Visualization of the Week series</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://radar.oreilly.com/2012/02/why-data-visualization-matters.html/feed</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
		<item>
		<title>Data and the human-machine connection</title>
		<link>http://radar.oreilly.com/2011/08/data-human-machine-analysis.html</link>
		<comments>http://radar.oreilly.com/2011/08/data-human-machine-analysis.html#comments</comments>
		<pubDate>Tue, 02 Aug 2011 13:00:00 +0000</pubDate>
		<dc:creator>Julie Steele</dc:creator>
				<category><![CDATA[Data]]></category>
		<category><![CDATA[@home]]></category>
		<category><![CDATA[@top]]></category>
		<category><![CDATA[analytics]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[data products]]></category>
		<category><![CDATA[data science]]></category>
		<category><![CDATA[visualization]]></category>

		<guid isPermaLink="false">http://blogs.oreilly.com/radar/2011/08/data-human-machine-analysis.html</guid>
		<description><![CDATA[Managing data and extracting meaning require new approaches, new education, and even a new language. Opera Solutions CEO Arnab Gupta discusses each of these areas in the following interview. ]]></description>
				<content:encoded><![CDATA[<p><a href="http://www.operasolutions.com/profile_arnab_gupta.html">Arnab Gupta</a> is the CEO of <a href="http://www.operasolutions.com/index.html">Opera Solutions</a>, an international company offering big data analytics services. I had the chance to chat with him recently about the massive task of managing big data and how humans and machines intersect. Our interview follows.</p>
<hr />
<h2>Tell me a bit about your approach to big data analytics.</h2>
<p><img src="http://s.radar.oreilly.com/2011/08/01/0811-arnab-gupta.jpg" border="0" alt="Arnab Gupta" style="float: right;margin: 3px 0 10px 10px" width="75" /><strong>Arnab Gupta:</strong> Our company is a science-oriented company, and the core belief is that behavior &#8212; human or otherwise &#8212; can be mathematically expressed. Yes, people make irrational value judgments, but they are driven by common motivation factors, and the math expresses that.</p>
<p>I look at the so-called &#8220;big data phenomenon&#8221; as the instantiation of human experience. Previously, we could not quantitatively measure human experience, because the data wasn&#8217;t being captured. But Twitter recently announced that they now serve <a href="http://www.pcworld.com/article/235846/as_twitter_turns_5_it_delivers_350_billion_tweets_each_day.html">350 billion tweets a day</a>. What we say and what we do has a physical manifestation now. Once there is a physical manifestation of a phenomenon, then it can be mathematically expressed. And if you can express it, then you can shape business ideas around it, whether that&#8217;s in government or health care or business.</p>
<h2>How do you handle rapidly increasing amounts of data?</h2>
<p><strong>Arnab Gupta:</strong> It&#8217;s an impossible battle when you think about it. The amount of data is going to grow exponentially every day, every week, every year, so capturing it all can&#8217;t be done. In the economic ecosystem there is extraordinary waste. Companies spend vast amounts of money, and the ratio of investment to insight is growing, with much more investment for similar levels of insight. This method just mathematically cannot work.</p>
<p>So, we don&#8217;t look for data, we look for signal. What we&#8217;ve said is that the shortcut is a priori identifying the signals to know where the fish are swimming, instead of trying to dam the water to find out which fish are in it. We focus on the flow, not a static data capture.</p>
<h2>What role does visualization play in the search for signal?</h2>
<p><strong>Arnab Gupta:</strong> Visualization is essential. People dumb it down sometimes by calling it &#8220;UI&#8221; and &#8220;dashboards,&#8221; and they don&#8217;t apply science to the question of how people perceive. We need understanding that feeds into the left brain through the right brain via visual metaphor. At Opera Solutions, we are increasingly trying to figure out the ways in which the mind understands and transforms the visualization of algorithms and data into insights.</p>
<h2>If understanding is a priority, then which do you prefer: a black-box model with better predictability, or a transparent model that may be less accurate?</h2>
<p><strong>Arnab Gupta:</strong> People bifurcate, and think in terms of black-box machines vs. the human mind. But the question is whether you can use machine learning to feed human insight. The power lies in expressing the black box and making it transparent. You do this by stress testing it. For example, if you were looking at a model for mortgage defaults, you would say, &#8220;What happens if home prices go down by X percent, or interest rates go up by X percent?&#8221; You make your own heuristics, so that when you make a bet you understand exactly how the machine is informing your bet.</p>
<p>Humans can do analysis very well, but the machine does it <em>consistently</em> well; it doesn&#8217;t make mistakes. What the machine lacks is the ability to consider orthogonal factors, and the creativity to consider what <em>could</em> be. The human mind fills in those gaps and enhances the power of the machine&#8217;s solution.</p>
<h2>So you advocate a partnership between the model and the data scientist?</h2>
<p><strong>Arnab Gupta:</strong> We often create false dichotomies for ourselves, but the truth is it&#8217;s never been man vs. machine; it has always been man <em>plus</em> machine. Increasingly, I think it&#8217;s an article of faith that the machine beats the human in most large-scale problems, even chess. But though the predictive power of machines may be better on a large-scale basis, if the human mind is trained to use it powerfully, the possibilities are limitless. In the recent Jeopardy showdown with <a href="http://www-03.ibm.com/innovation/us/watson/index.html">IBM&#8217;s Watson</a>, I would have had a three-way competition with Watson, a Jeopardy champion, and a <em>combination</em> of the two. Then you would have seen where the future lies.</p>
<h2>Does this mean we need to change our approach to education, and train people to use machines differently?</h2>
<p><strong>Arnab Gupta:</strong> Absolutely. If you look back in time between now and the 1850s, everything in the world has changed except the classroom. But I think we are dealing with a phase-shift occurring. Like most things, the inertia of power is very hard to shift. Change can take a long time and there will be a lot of debris in the process.</p>
<p>One major hurdle is that the language of machine-plus-human interaction has not yet begun to be developed. It&#8217;s partly a silent language, with data visualization as a significant key. The trouble is that language is so powerful that the left brain easily starts dominating, but really almost all of our critical inputs come from non-verbal signals. We have no way of creating a new form of language to describe these things yet. We are at the beginning of trying to develop this.</p>
<p>Another open question is: What&#8217;s the skill set and the capabilities necessary for this? At Opera we have focused on the ability to teach machines how to learn. We have 150-160 people working in that area, which is probably the largest private concentration in that area outside IBM and Google. One of the reasons we are hiring all these scientists is to try to innovate at the level of core competencies and the science of comprehension.</p>
<p>The business outcome of that is simply practical. At the end of the day, much of what we do is prosaic; it makes money or it doesn&#8217;t make money.  It&#8217;s a business. But the philosophical fountain from which we drink needs to be a deep one.</p>
<p><em>Associated photo on home and category pages: <a href="http://www.flickr.com/photos/pdenker/74684051/" title="prd brain scan by Patrick Denker, on Flickr">prd brain scan by Patrick Denker, on Flickr</a></em></p>
<p><strong>Related:</strong></p>
<ul>
<li> <a href="http://radar.oreilly.com/2010/06/what-is-data-science.html">What is data science?</a></li>
<li> <a href="http://radar.oreilly.com/2011/03/dashboard-social-business.html">Dashboards evolve to meet social and business needs</a></li>
<li> <a href="http://www.youtube.com/oreillymedia#p/c/EF277D84FE2A28D5/34/eWdmrLaZ2TM">Video: Roger Ehrenberg on data analysis and user experience</a></li>
<li> <a href="http://radar.oreilly.com/2010/08/thousands-of-workers-are-stand.html">Thousands of workers are standing by</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://radar.oreilly.com/2011/08/data-human-machine-analysis.html/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Dusting for device fingerprints</title>
		<link>http://radar.oreilly.com/2011/03/device-identification-bluecava.html</link>
		<comments>http://radar.oreilly.com/2011/03/device-identification-bluecava.html#comments</comments>
		<pubDate>Tue, 01 Mar 2011 14:00:00 +0000</pubDate>
		<dc:creator>Julie Steele</dc:creator>
				<category><![CDATA[Data]]></category>
		<category><![CDATA[@home]]></category>
		<category><![CDATA[advertising]]></category>
		<category><![CDATA[cookies]]></category>
		<category><![CDATA[device identification]]></category>
		<category><![CDATA[fraud]]></category>
		<category><![CDATA[privacy]]></category>
		<category><![CDATA[reputation]]></category>

		<guid isPermaLink="false">http://blogs.oreilly.com/radar/2011/03/device-identification-bluecava.html</guid>
		<description><![CDATA[BlueCava lets businesses identify devices that are coming to their websites. In this interview, BlueCava CEO David Norris discusses fraud prevention, privacy, and the state of reputation technology. ]]></description>
				<content:encoded><![CDATA[<p>In a <a href="http://radar.oreilly.com/2010/12/strata-week-replaced-by-robots.html">previous Strata Week post</a>, I wrote about <a href="http://www.bluecava.com/">BlueCava</a>, an Orange County, Calif.-based company that has patented a way of identifying the unique fingerprint of any electronic device connected to the Internet. Last October, they closed a <a href="http://www.bluecava.com/tag/series-a/">$5 million round</a> of series-A funding led by <a href="http://en.wikipedia.org/wiki/Mark_Cuban">Mark Cuban</a>.</p>
<p>Recently, BlueCava announced the formation of an <a href="http://www.bluecava.com/about/advisory-board/">advisory board</a>, which includes executives from Facebook, MasterCard, HP, FirstData, Bill to Mobile, and Merchant Warehouse. I caught up with CEO David L. Norris to discuss device identification, reputation technology, online fraud, and consumer privacy.</p>
<p>Our interview follows.</p>
<hr />
<h2>Tell me a bit more about what BlueCava does and how it works.</h2>
<p><img alt="David Norris" src="http://s.radar.oreilly.com/assets_c/2011/02/David Norris - Headshot-thumb-200x168.jpg" width="200" height="168" style="float: right;margin: 3px 0 10px 10px" /><strong>David Norris:</strong> BlueCava provides a platform that enables businesses to identify devices that are coming to their website. First, we identify the device and then we provide additional information about the device that would be useful to our customers in making decisions about how to interact with that device. One application is finding fraud. Another interesting area is social networking sites: a site may choose not to allow certain users to participate if they have a history of trollish behavior.</p>
<p>As we identify devices, we build information about each device. One of the things we can tell about a device is if it&#8217;s a shared computer being used by multiple users. We can also determine the specific level of use &#8212; whether it&#8217;s a household computer in the kitchen with a handful of users or an Internet cafe computer with hundreds of users.</p>
<h2>It sounds like BlueCava is largely used to identify negative behavior. Can the technology also be used to identify devices or users with a positive history?</h2>
<p><strong>David Norris:</strong> In some ways it&#8217;s better to identify a good device rather than the bad ones &#8212; it&#8217;s much harder to mimic or fake a &#8220;clean&#8221; machine that has no history. So we&#8217;ve taken on the task of identifying a broad set of devices. This year, we&#8217;ll identify more than 1 billion devices.</p>
<p>From there, among the partnerships we&#8217;ve signed up, we&#8217;re going to assign direct financial benefits to those with a positive history, such as discounts and rewards. We&#8217;ll be announcing further details soon. For site managers, if you have a historical reputation that&#8217;s good, there&#8217;s an opportunity to reduce some of the costs associated with interacting with you, like performing extensive background checks. So they can afford to pass some of those savings on to users who merit it.</p>
<h2>What about the privacy issues associated with device identification?</h2>
<p><strong>David Norris:</strong> We do not collect any personal information. We don&#8217;t collect Social Security numbers or email addresses. We identify devices and we characterize a device&#8217;s behavior. For devices with GPS receivers built in, we collect information at a ZIP-code level, not a granular level. Anything more granular would be a violation of privacy.</p>
<p>We&#8217;ve also implemented what the <a href="http://www.ftc.gov/opa/2010/12/dnttestimony.shtm">FTC</a> is calling &#8220;<a href="http://33bits.org/2010/09/20/do-not-track-explained/">do not track</a>,&#8221; so users can either opt out or set their preferences when it comes to online marketing.</p>
<p>There&#8217;s a difference between being identified and being tracked: if you turn tracking off, we can still identify a device but we don&#8217;t keep track of which websites it&#8217;s been to.</p>
<h2>Since no system is perfect, what are the remedies available to users whose devices or histories are misidentified?</h2>
<p><strong>David Norris:</strong> If a question comes up about a particular device, the user can go to the merchant or site owner, who can then escalate the issue in a review queue. It becomes a human process at that point.</p>
<h2>So BlueCava is not making direct recommendations about user accounts?</h2>
<p><strong>David Norris:</strong> We&#8217;re very careful not to position ourselves as a fraud solution. We are a tech company that can be part of an existing fraud solution, but device identification is only part of the story. We&#8217;re gathering information that&#8217;s already available and has been used for years by other companies. What we&#8217;re doing differently is using it in a unique way.</p>
<p>Imagine that you&#8217;re a store owner, and one day someone walks into your store and then walks out. The next day, they walk in again, and you recognize them. You&#8217;d do that naturally based on hair color, eye color, the shape of their face, etc. And you could recognize them even if they were wearing a different shirt, because you know that their shirt can change but their face won&#8217;t. We do the same thing with devices. Our technology is adaptive, and allows for change to occur. But it&#8217;s up to each individual client how to use that information.</p>
<h2>Some users may find this kind of device identification intimidating because it seems like &#8220;magical spying.&#8221; What would you say to them?</h2>
<p><strong>David Norris:</strong> Cookies used to seem magical too. But then people got used to the idea of them.</p>
<p>Our technology, I believe, will replace cookies eventually. It just observes your machine instead of reaching into it and dropping something there. </p>
<p>Device identification is an improvement over cookies in part because if you choose to opt out, you&#8217;re opted out and that&#8217;s it. If you opt out using cookies, the system actually drops an opt-out cookie on your machine &#8212; if you <a href="http://googlepublicpolicy.blogspot.com/2011/01/keep-your-opt-outs.html">clear your cookies</a>, then you&#8217;re opted back in! Also, you have to opt out on multiple browsers. From a device identification standpoint, it&#8217;s much cleaner: you opt out once and it&#8217;s done.</p>
<p><em>This interview was edited and condensed.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://radar.oreilly.com/2011/03/device-identification-bluecava.html/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Data markets aren&apos;t coming. They&apos;re already here</title>
		<link>http://radar.oreilly.com/2011/01/data-markets-resellers-gnip.html</link>
		<comments>http://radar.oreilly.com/2011/01/data-markets-resellers-gnip.html#comments</comments>
		<pubDate>Wed, 26 Jan 2011 14:00:00 +0000</pubDate>
		<dc:creator>Julie Steele</dc:creator>
				<category><![CDATA[Data]]></category>
		<category><![CDATA[@home]]></category>
		<category><![CDATA[black market]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[data markets]]></category>
		<category><![CDATA[gnip]]></category>
		<category><![CDATA[Twitter]]></category>

		<guid isPermaLink="false">http://blogs.oreilly.com/radar/2011/01/data-markets-resellers-gnip.html</guid>
		<description><![CDATA[Gnip cofounder and CEO Jud Valeski discusses data markets (and black markets), social media, and real-time data&apos;s impact on customer relations. ]]></description>
				<content:encoded><![CDATA[<p>Jud Valeski (<a href="http://twitter.com/#!/jvaleski">@jvaleski</a>) is cofounder and CEO of <a href="http://gnip.com/">Gnip</a>, a social media data provider that aggregates feeds from sites like <a href="http://twitter.com/">Twitter</a>, <a href="http://www.facebook.com/">Facebook</a>, <a href="http://www.flickr.com/">Flickr</a>, <a href="http://www.delicious.com/">delicious</a>, and others into one API.</p>
<p>Jud will be speaking at <a href="http://strataconf.com/strata2011/?cmp=il-radar-st11-valeski">Strata</a> next week on a panel titled &#8220;<a href="http://strataconf.com/strata2011/public/schedule/detail/17602?cmp=il-radar-st11-valeski">What&#8217;s Mine is Yours: the Ethics of Big Data Ownership</a>.&#8221;</p>
<p>If you&#8217;re attending Strata, you can also find out more about the growing business of data marketplaces at a <a href="http://strataconf.com/strata2011/public/schedule/detail/17604">&#8220;Data Marketplaces&#8221; panel</a> with <a href="http://strataconf.com/strata2011/public/schedule/speaker/26?cmp=il-radar-st11-valeski">Ian White</a> of <a href="http://urbanmapping.com/">Urban Mapping</a>, <a href="http://strataconf.com/strata2011/public/schedule/speaker/104234?cmp=il-radar-st11-valeski">Peter Marney</a> of <a href="http://thomsonreuters.com/">Thomson Reuters</a>, <a href="http://strataconf.com/strata2011/public/schedule/speaker/50595?cmp=il-radar-st11-valeski">Moe Khosravy</a> of <a href="https://datamarket.azure.com/">Microsoft</a>, and <a href="http://strataconf.com/strata2011/public/schedule/speaker/107129?cmp=il-radar-st11-valeski">Dennis Yang</a> of <a href="http://infochimps.com/">Infochimps</a>.</p>
<p>My interview with Jud follows.</p>
<hr />
<h2>Why is social media data important? What can we do with it or learn from it?</h2>
<p><img src="http://cdn.oreilly.com/radar/images/people/photo_judv_m.jpg" border="0" alt="Jud Valeski" style="float: right;margin: 3px 0 10px 10px" /><strong>Jud Valeski:</strong> Social media today is the first time a reasonably large population has communicated digitally in relative public. The ability to programmatically analyze collective conversation has never really existed. Being able to analyze the collective human consciousness has been the dream of researchers and analysts since day one.</p>
<p>The data itself is important because it can be analyzed to assist in disaster detection and relief. It can be analyzed for profit in an industry that has always struggled to pinpoint how and where to spend money. It can be analyzed to determine financial market viability (stock trading, for example). It can be analyzed to understand community sentiment, which has political ramifications; we all want our voices heard in order to shape public policy.</p>
<h2>What are some of the most common or surprising queries run through Gnip?</h2>
<p><strong>Jud Valeski:</strong> We don&#8217;t look at the queries our customers use. One pattern we have seen, however, is that there are some people who try to use the software to siphon as much data as possible out of a given publisher. &#8220;More data, more data, more data.&#8221; We hear that all the time. But how our customers configure the Gnip software is up to them.</p>
<h2>With Gnip, customers can choose the data sources they want not just by site but also by category within the site. Can you tell me more about the options for Twitter, which include <a href="http://gnip.com/twitter/decahose">Decahose</a>, <a href="http://gnip.com/twitter/halfhose">Halfhose</a>, and <a href="http://gnip.com/twitter/spritzer">Spritzer</a>?</h2>
<p><strong>Jud Valeski:</strong> We tend to categorize social media sources into three buckets: Volume, Coverage, or Both. Volume streams provide a consumer with a sampled rate of volume (Decahose is 10%, for example, while a full firehose is 100% of some service&#8217;s activities). Statisticians and analysts like the Volume stuff.</p>
<p>Coverage streams exist to provide full coverage of a certain set of things (e.g., keywords, or the User Mention Stream for Twitter). Advertisers like Coverage streams because their interests are very targeted. There are some products that fall into both categories, but Volume and Coverage tend to describe the overall view.</p>
<p>For Twitter in particular, we use their algorithm <a href="http://dev.twitter.com/pages/streaming_api_concepts#sampling">as described on their dev pages</a>, adjusted for each particular volume rate desired.</p>
<h2>Gnip is currently the only licensed reseller of the full Twitter firehose. Are there other partnerships coming up?</h2>
<p><strong>Jud Valeski:</strong> &#8220;Currently&#8221; is the operative word here. While we&#8217;re enjoying the implied exclusivity of the current conditions, we fully expect Twitter to grow its <a href="http://en.wikipedia.org/wiki/Value-added_reseller">VAR</a> tier to ensure a more competitive marketplace.</p>
<p>From my perspective, Twitter enabling VARs allows them to focus on what is near and dear to their hearts &#8212; developer use cases, promoted Tweets, end users, and the display ecosystem &#8212; while enabling firms focused on the data-delivery business to distribute underlying data for non-display use. Gnip provides stream enrichments for all of the data that flows through our software. Those enrichments include format and protocol normalization, as well as stream augmentation features such as global URL unwinding. Those value-adds make social media API integration and data leverage much easier than doing a bunch of one-off integrations yourself.</p>
<p>We&#8217;re certainly working on other partnerships of this level of significance, but we have nothing to announce at this time.</p>
<h2>What do you wish more people understood about data markets and/or the way large datasets can be used?</h2>
<p><strong>Jud Valeski:</strong> First, data is not free, and there&#8217;s always someone out there that wants to buy it. As an end-user, educate yourself about how the content you create using someone else&#8217;s service could ultimately be used by the service-provider.</p>
<p>Second, <a href="http://radar.oreilly.com/2010/10/the-black-market-for-data.html">black markets</a> are a real problem, and just because &#8220;everyone else is doing it&#8221; doesn&#8217;t mean it&#8217;s okay. As an example, <a href="http://en.wikipedia.org/wiki/Botnet">botnet</a>-like distributed IP address polling infrastructure is commonly used to extract more data from a publisher&#8217;s service than their API usage terms allow. While perhaps fun to build and run (sometimes), these approaches clearly result in aggregated pools of publisher data that the publisher never intended to promote. Once collected, the aggregated pools of data are sold to data-hungry analytics firms. This results in end-user frustration, in that the content they produced was used in a manner that flagrantly violated the terms under which they signed up. These databases are frequently called out as infringing on privacy.</p>
<p>Everyone loves a good Robin Hood story, and that&#8217;s how I&#8217;d characterize the overall state of data collection today.</p>
<h2>How has real-time data changed the field of customer relationship management (CRM)?</h2>
<p><strong>Jud Valeski:</strong> <a href="http://en.wikipedia.org/wiki/Customer_relationship_management">CRM</a> firms have a new level of awareness. They no longer rely exclusively on dated user studies. A customer service rep may know about your social life through their dashboard the moment you are connected to them over the phone.</p>
<p>I ultimately see the power of understanding collective consciousness in responding to customer service issues. We haven&#8217;t even scratched the surface here. Imagine if Company X reached out to you directly every time you had a problem with their product or service. Proactivity can pay huge dividends. Companies haven&#8217;t tapped even 10% of the potential here, and part of that is because they&#8217;re not spending enough money in the area yet.</p>
<p>Today, &#8220;social&#8221; is a checkbox that CRM tools attempt to check off just to keep the boss happy. Tomorrow, social data and metaphors will define the tools outright.</p>
<h2>Have you learned anything as a social media user yourself from working on Gnip? Is there anything social media users should be more aware of?</h2>
</p>
<p><strong>Jud Valeski:</strong> Read the terms of service for social media services you&#8217;re using before you complain about privacy policies or how and where your data is being used. Unless you are on a private network, your data is treated as public for all to use, see, sell, or buy. Don&#8217;t kid yourself. Of course, this brings us all the way back around to black markets. Black markets &#8212; and publishers&#8217; generally lackadaisical response to them &#8212; cloud these waters.</p>
<hr />
<p><em>If you can&#8217;t make it to Strata, you can learn more about the architectural challenges of distributing social and location data across the web in real time, and how Gnip has evolved to address those challenges, in Jud&#8217;s contribution to &#8220;<a href="http://oreilly.com/catalog/9780596157128">Beautiful Data</a>.&#8221;</em></p>
<p></p>
]]></content:encoded>
			<wfw:commentRss>http://radar.oreilly.com/2011/01/data-markets-resellers-gnip.html/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Strata Week: Data centers</title>
		<link>http://radar.oreilly.com/2011/01/strata-week-data-centers.html</link>
		<comments>http://radar.oreilly.com/2011/01/strata-week-data-centers.html#comments</comments>
		<pubDate>Thu, 13 Jan 2011 14:00:00 +0000</pubDate>
		<dc:creator>Julie Steele</dc:creator>
				<category><![CDATA[Data]]></category>
		<category><![CDATA[@home]]></category>
		<category><![CDATA[air economization]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[data center]]></category>
		<category><![CDATA[Gov 2.0]]></category>
		<category><![CDATA[modular]]></category>
		<category><![CDATA[strataconf]]></category>
		<category><![CDATA[strataweek]]></category>

		<guid isPermaLink="false">http://blogs.oreilly.com/radar/2011/01/strata-week-data-centers.html</guid>
		<description><![CDATA[This week, we look at the problem of too much government data, and companies beginning to build air-economized data centers (some in barns!). Plus: a few suggestions for pre-Strata reading on big data. ]]></description>
				<content:encoded><![CDATA[<p>Here&#8217;s what caught my attention in the data world this week.</p>
</p>
<h2>Infopocalypse?</h2>
</p>
<p>As a former Bostonian, I well remember the <a href="http://en.wikipedia.org/wiki/Big_Dig">Big Dig</a>: a project to sink the city&#8217;s Central Artery underground and add tunnels and a bridge to relieve traffic congestion. The effort consumed (not unpredictably) several years and billions of dollars more than originally projected. And by the time the dust cleared, traffic had increased so much that congestion was just as bad as, if not worse than, when the project began.</p>
<p>As with Massachusetts and vehicles, so with government and data. Fittingly, the Boston Phoenix this week <a href="http://thephoenix.com/boston/news/113481-infopocalypse-the-cost-of-too-much-data/#ixzz1ABqe8kAJ">published a look</a> at the challenge of government data.</p>
<blockquote><p>Digital storage is not a natural resource. The amount of information that government agencies may be required to keep &mdash; from tweets and emails to tax histories &mdash; is growing faster than the capacity for storage.</p>
</blockquote>
<p>While the Obama administration has made strides to address this situation by forming the Office of Government Information Services (<a href="http://www.archives.gov/ogis/">OGIS</a>) in 2009 and <a href="http://www.whitehouse.gov/the_press_office/President-Obama-Names-Vivek-Kundra-Chief-Information-Officer/">appointing Vivek Kundra</a> as Chief Information Officer, the need is so large as to remain overwhelming at the moment. From the same <a href="http://thephoenix.com/boston/news/113481-infopocalypse-the-cost-of-too-much-data/#ixzz1ABqe8kAJ">Phoenix article</a>:</p>
<blockquote><p>The United States Census Bureau alone maintains about 2560 terabytes of information &mdash; more data than is contained in all the academic libraries in America, and the equivalent of about 50 million four-door filing cabinets of text documents. In addition to the federal deluge, tens of thousands of municipal and state facilities maintain data ranging from driver&#8217;s-license pics to administrative e-mails &mdash; or at least they&#8217;re required to.</p>
</blockquote>
<p>More and more, huge storage requirements <a href="http://radar.oreilly.com/2010/07/ca-cio-teri-takai-on-governmen.html">meet staffing cuts and tight budgets</a> in a complicated showdown. Many municipal governments have IT staffs of one or two people, if they have IT staffs at all.</p>
<p>This, of course, leads to a lot of outsourcing and privatization, which come with personnel and expertise benefits, but also security drawbacks. Will government digitization and data transport us into the future, or become another big dig against a rising tide?</p>
<div style="border-top: thin gray solid;border-bottom: thin gray solid;padding: 20px;margin: 20px 2px"><a href="https://en.oreilly.com/strata2011/public/register?cmp=il-radar-st11-strataweek-011311"><img style="float: left;border: none;padding-right: 10px" src="http://s.radar.oreilly.com/strata11-promo-radar.png" /></a><a href="http://strataconf.com/?cmp=il-radar-st11-strataweek-011311"><strong>Strata: Making Data Work</strong></a>, being held Feb. 1-3, 2011 in Santa Clara, Calif.,  will focus on the business and practice of data. The conference will provide three days of training, breakout sessions, and plenary discussions &mdash; along with an Executive Summit, a Sponsor Pavilion, and other events showcasing the new data ecosystem.</p>
<p><a href="https://en.oreilly.com/strata2011/public/register?cmp=il-radar-st11-strataweek-011311"><strong>Save 30% off registration with the code STR11RAD</strong></a></div>
</p>
<h2>This data was raised in a barn</h2>
</p>
<p>Of course, privatization may save the day after all, if it can significantly lower the cost of data centers by saving power. That&#8217;s one goal of <a href="http://www.itnews.com.au/News/243260,microsoft-puts-data-centre-in-a-barn.aspx">Microsoft&#8217;s new data center</a> in Quincy, Wash.: it will use outside air for cooling (known as &#8220;air-side economizing&#8221;), and house the racks in a barn-like building that protects the servers from wind and rain but is otherwise &#8220;virtually transparent to ambient outdoor conditions.&#8221;</p>
<p>These new server farms will also make use of Microsoft&#8217;s IT Pre-Assembled Components (<a href="http://www.microsoft.com/showcase/en/us/details/84f44749-1343-4467-8012-9c70ef77981c">ITPACs</a>), which allow for flexibility and scaling, and will help keep costs down even further.</p>
<p>Microsoft, like <a href="http://www.datacenterknowledge.com/archives/2008/09/18/intel-servers-do-fine-with-outside-air/">Intel</a> in New Mexico and <a href="http://ae.redrocksdatacenter.com/">Red Rocks Data Center</a> and <a href="http://www.datacenterknowledge.com/archives/2009/01/26/suns-colorado-consolidation-saves-millions/">Sun Microsystems</a> in Colorado, began experimenting with air-side economizing in late 2008 and 2009.</p>
<div align="center">
<p class="image-box-480">
<img src="http://s.radar.oreilly.com/assets_c/2011/01/aireconomizer-thumb-486x415.gif" width="480" alt="aireconomizer.gif" style="margin-bottom: 15px" /><br />Air-side economizer of the kind Red Rocks Data Center was using in Colorado.</p>
</div>
<p>Intel went on to publish a <a href="http://www.intel.com/it/pdf/Reducing_Data_Center_Cost_with_an_Air_Economizer.pdf">white paper</a> on air economization as well as a <a href="http://video.intel.com/?fr_story=2d6e0fbbef76b72c6119cc7fe7889bba20cb5192&amp;rf=sitemap">Proof of Concept video</a> in which they report lowering power costs by nearly 74 percent. Red Rocks Data Center has since closed, but first <a href="http://ae.redrocksdatacenter.com/2010/02/update.html">reported</a> that during the coolest months of the year, &#8220;our savings are averaging about $1,600 a month on a $5,000 total bill.&#8221;</p>
<p>With or without a tractor shed, many more of these air-cooled data centers with a modular approach are likely to be built in the coming years. Microsoft expects to open others in Virginia and Iowa in 2011, and they likely will not be alone.</p>
<p>Maybe the White House should build itself a barn.</p>
</p>
<h2>Resource room</h2>
</p>
<p>If you&#8217;re headed to <a href="http://strataconf.com/strata2011?cmp=il-radar-st11-strataweek-011311">Strata</a> in a couple of weeks and find yourself in need of some anticipatory reading for the flight, download the recent Big Data reports from <a href="http://www.pwc.com/us/en/technology-forecast/2010/issue3/index.jhtml">PricewaterhouseCoopers</a> and <a href="http://www.nesta.org.uk/events/hot_topics/assets/features/big_data_report">NESTA</a> (the National Endowment for Science, Technology and the Arts in the UK). </p>
<p>The NESTA report lays out some of the key concepts and threads from its November 2010 event, &#8220;The Power and Possibilities of Big Data.&#8221; You can also watch <a href="http://www.nesta.org.uk/events/hot_topics/assets/events/silicon_valley_comes_to_nesta_the_power_and_possibilities_of_big_data">video</a> from the event, which brought together folks like <a href="http://www.brondmo.com/">Hans Peter Brøndmo</a> from Nokia, <a href="http://dawncapital.co.uk/welcome/our-team/the-dawn-team/haakon-overli.aspx">Haakon Overli</a> from Dawn Capital, Max Jolly from <a href="http://www.dunnhumby.com/">dunnhumby</a>, and <a href="http://www.execdigital.com/Growing-Google--Megan-Smith-on-the-work-culture-at-the-internet-giant_4194">Megan Smith</a> from Google.</p>
<p>The PwC issue is aimed at <a href="http://radar.oreilly.com/tag/cio">CIOs</a> and covers &#8220;the techniques behind low-cost distributed computing that have led companies to explore more of their data in new ways.&#8221; Several of these articles will be great background before heading off to Strata &mdash; <a href="http://strataconf.com/strata2011?cmp=il-radar-st11-strataweek-011311">hope to see you there</a>!</p>
<hr />
<p><em>Keep up with new developments in the data world with the <a href="http://feeds.feedburner.com/oreilly/radar/strataweek"><strong>Strata Week RSS feed</strong></a>.</em></p>
<p></p>
]]></content:encoded>
			<wfw:commentRss>http://radar.oreilly.com/2011/01/strata-week-data-centers.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Strata Week: Shop &apos;til you drop</title>
		<link>http://radar.oreilly.com/2010/12/strata-week-shop-til-you-drop.html</link>
		<comments>http://radar.oreilly.com/2010/12/strata-week-shop-til-you-drop.html#comments</comments>
		<pubDate>Thu, 16 Dec 2010 14:00:00 +0000</pubDate>
		<dc:creator>Julie Steele</dc:creator>
				<category><![CDATA[Data]]></category>
		<category><![CDATA[@home]]></category>
		<category><![CDATA[Azure]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[data markets]]></category>
		<category><![CDATA[netflix]]></category>
		<category><![CDATA[new york times]]></category>
		<category><![CDATA[open source]]></category>
		<category><![CDATA[stack overflow]]></category>
		<category><![CDATA[strataconf]]></category>
		<category><![CDATA[strataweek]]></category>
		<category><![CDATA[visualizations]]></category>

		<guid isPermaLink="false">http://blogs.oreilly.com/radar/2010/12/strata-week-shop-til-you-drop.html</guid>
		<description><![CDATA[In this edition of Strata Week: Stack Exchange takes their hardware and software in-house; Netflix explains their adoption of AWS and open source; the New York Times maps out survey and census data; and Infochimps acquires Data Marketplace. ]]></description>
				<content:encoded><![CDATA[<p>Need a break from the holiday madness? You&#8217;re not alone. Check out these items of interest from the land of data and see why even the big consumers face tough choices.</p>
</p>
<h2>Does this place accept returns?</h2>
</p>
<p>On Monday, <a href="http://stackoverflow.com/about">Stack Overflow</a> announced that they have moved the <a href="http://blog.stackoverflow.com/2010/06/introducing-stack-exchange-data-explorer/">Stack Exchange Data Explorer</a> (SEDE) off of the <a href="http://www.microsoft.com/windowsazure/windowsazure/">Windows Azure</a> platform and onto in-house hardware.</p>
<div align="center">
<p class="image-box-450">
<img alt="data-explorer-screenshot.png" src="http://s.radar.oreilly.com/assets_c/2010/12/data-explorer-screenshot-thumb-486x296.png" width="450" />
</p>
</div>
<p>SEDE is an open source, web-based tool for querying the monthly data dump of Creative Commons data from its four main Q&amp;A sites (<a href="http://stackoverflow.com/">Stack Overflow</a>, <a href="http://serverfault.com/">Server Fault</a>, <a href="http://superuser.com/">Super User</a>, and <a href="http://meta.stackoverflow.com/">Meta</a>) as well as other sites in the Stack Exchange family. The primary reason given (in <a href="http://blog.stackoverflow.com/2010/12/re-launching-stack-exchange-data-explorer/">a polite write-up</a> by <a href="http://www.codinghorror.com/blog/2004/02/about-me.html">Jeff Atwood</a> and SEDE lead <a href="http://stackoverflow.com/users/17174/sam-saffron">Sam Saffron</a>) was the desire to have fine-tuned control over the platform.</p>
<blockquote><p>When you are using a [Platform-as-a-Service] you are giving up a lot of control to the service provider. The service provider chooses which applications you can run and imposes a series of restrictions. &#8230; It was disorienting moving to a platform where we had no idea what kind of hardware was running our app. Giving up control of basic tools and processes we use to tune our environment was extremely painful.</p>
</blockquote>
<p>While the support that comes with Platform-as-a-Service was acknowledged, it seems that the ability to better automate, adjust, and perpetuate processes and systems with more fine-grained control won out as a bigger convenience.</p>
</p>
<h2>Where did you get that lovely platform?</h2>
</p>
<p><a href="https://en.oreilly.com/strata2011/public/register?cmp=il-radar-st11-strata-week-121610"><img src="http://s.radar.oreilly.com/strata11-promo-radar.png" border="0" alt="Strata 2011" style="float: right;margin: 3px 0 10px 10px"></a>Of course, one company&#8217;s headache is another&#8217;s dream. <a href="http://movies.netflix.com/">Netflix</a>, a company known for playing with big data and <a href="http://en.wikipedia.org/wiki/Netflix_Prize">crowdsourcing solutions</a> &#8220;before it was cool,&#8221; posted on Tuesday the <a href="http://techblog.netflix.com/2010/12/four-reasons-we-choose-amazons-cloud-as.html">four reasons</a> they&#8217;ve chosen to use <a href="http://aws.amazon.com/">Amazon Web Services</a> (AWS) as their platform and have moved onto it over the last year.</p>
<p>Laudably, the company states that it viewed its tremendous recent growth (in terms of both members and streaming devices) as a license to <em>question everything</em> in the necessary process of re-architecting. Instead of building out their own data centers, etc., they decided to answer that set of questions by paying someone else to worry about it.</p>
<p>Also to their credit, Netflix has enough self-awareness to know what they are and aren&#8217;t good at. Building top-notch recommendation systems and providing entertainment? You betcha. Predicting customer growth and device engagement? Not so much.</p>
<blockquote><p>How many subscribers would you guess used our Wii application the week it launched? How many would you guess will use it next month? We have to ask ourselves these questions for each device we launch because our software systems need to scale to the size of the business, every time.</p>
</blockquote>
<p>Self-awareness is in fact the primary lesson in both Netflix&#8217;s and Stack Exchange&#8217;s platform decisions. If you feel your attention is better spent elsewhere, write a check. If you&#8217;ve got the time and expertise to hone your hardware, roll your own.</p>
<p>[Of course, Netflix doesn't go for the pre-packaged solutions every time. They also posted recently about <a href="http://techblog.netflix.com/2010/12/why-we-use-and-contribute-to-open.html">why they love open source software</a>, and listed among the projects they make use of and contribute back to: <a href="http://hadoop.apache.org/">Hadoop</a>, <a href="http://wiki.apache.org/hadoop/Hive">Hive</a>, <a href="http://hbase.apache.org/">HBase</a>, <a href="https://github.com/jboulon/honu/wiki">Honu</a>, <a href="http://ant.apache.org/">Ant</a>, <a href="http://tomcat.apache.org/">Tomcat</a>, <a href="http://java-source.net/open-source/build-systems/hudson">Hudson</a>, <a href="http://java-source.net/open-source/build-systems/ivy">Ivy</a>, <a href="http://cassandra.apache.org/">Cassandra</a>, etc.]</p>
</p>
<h2>With what shall we shop?</h2>
</p>
<p>The <a href="http://www.nytimes.com/">New York Times</a> this week released a cool group of <a href="http://projects.nytimes.com/census/2010/explorer">interactive maps</a> based on data collected in the Census Bureau&#8217;s <a href="http://www.census.gov/acs/www/">American Community Survey</a> (ACS) from 2005 to 2009. Data is compared against the 2000 census to uncover rates of change.</p>
<p>[While similar to the census, the ACS is conducted every year instead of every 10 years.  The ACS includes only a sampling of addresses instead of a comprehensive inventory. It covers much of the same ground on population (age, race, disability status, family relationships), but it also asks for information that is used to help make funding distribution decisions about community services and institutions.]</p>
<p>The Times maps explore education levels; rent, mortgage rates, and home values; household income; and racial distribution. Viewers can select among 22 maps in these four categories, and then pan and zoom to view national, state, or local trends down to the level of individual census tracts.</p>
<div align="center">
<p class="image-box-450"><img src="http://s.radar.oreilly.com/assets_c/2010/12/NYTimesMap-thumb-486x277.png" width="450" />
</p>
</div>
<p>Above is the national view of the map that looks at change in median household income. The ACS website itself provides some maps displaying the survey numbers from the <a href="http://www.census.gov/acs/www/Downloads/data_documentation/2009_acs_maps/MedianHouseholdIncomeCensus2000.pdf">2000</a> census and the <a href="http://www.census.gov/acs/www/Downloads/data_documentation/2009_acs_maps/MedianHouseholdIncomeACS.pdf">2005-2009</a> survey, as well as a listing of <a href="http://www.census.gov/acs/www/data_documentation/2009_acs_maps/">data tables</a>.</p>
<p>The Times map shows the uneven way in which these numbers have gone up or down in various parts of the country, with some surprising results that are worth exploring. Note that the blue regions are places where income has dropped, and the yellow regions are places where it has increased. (No wonder a lot of us are getting creative with holiday shopping.)</p>
<p>If this kind of research floats your boat, check out <a href="http://www.socialexplorer.com/pub/home/home.aspx">Social Explorer</a>, the mapping tool used to create the New York Times maps.</p>
</p>
<h2>Even markets like to buy things</h2>
</p>
<p>The emerging landscape of custom data markets is already shifting as <a href="http://infochimps.com/">Infochimps</a> recently announced the acquisition of <a href="http://datamarketplace.com/">Data Marketplace</a>, a start-up incubated at <a href="http://www.ycombinator.com/">Y Combinator</a>.</p>
<p>While <a href="http://en.wikipedia.org/wiki/Stewart_Brand">Stewart Brand</a> may be right in thinking information wants to be free, there&#8217;s also enormous value to be added by aggregating, structuring, and packaging data, as well as in matching up buyers with sellers. That&#8217;s the main service <a href="http://techcrunch.com/2010/03/18/yc-funded-data-marketplace-is-an-amazon-for-structured-financial-information/">Data Marketplace aims to provide</a>, particularly in the field of financial data.</p>
<p>At Infochimps, information is offered a la carte, and many of the site&#8217;s datasets are free. These include sets as diverse as &#8220;<a href="http://infochimps.org/datasets/word-list---100000-official-crossword-words-excel-readable">Word List &#8211; 100,000+ official crossword words (Excel readable)</a>,&#8221; &#8220;<a href="http://infochimps.org/datasets/measuring-worth-interest-rates-us-uk-1790-2007">Measuring Worth: Interest Rates &#8211; US &amp; UK 1790-2000</a>,&#8221; and &#8220;<a href="http://infochimps.org/datasets/retrosheet-game-logs-play-by-play-for-major-league-baseball-game">Retrosheet: Game Logs (play-by-play) for Major League Baseball Games</a>.&#8221; Data Marketplace is a bit different, in that it allows users to enter requests for data (with a deadline and budget, if desired) and then matches up would-be buyers with data providers.</p>
<p>Infochimps <a href="http://blog.infochimps.com/2010/12/14/infochimps-acquires-datamarketplace-com/">has said</a> that Data Marketplace, which is less than a year old, will continue to operate as a standalone site, although its founders Steve DeWald and Matt Hodan will depart for new projects.</p>
<p>If you&#8217;re interested in the burgeoning business of aggregated datasets, be sure to check out the <a href="http://strataconf.com/strata2011/public/schedule/detail/17604/?cmp=il-radar-st11-strata-week-121610">Data Marketplaces panel</a> I&#8217;ll be moderating at <a href="http://strataconf.com/?cmp=il-radar-st11-strata-week-121610">Strata</a> in February.</p>
<p>Not yet signed up for Strata? <a href="https://en.oreilly.com/strata2011/public/register?cmp=il-radar-st11-strata-week-121610"><strong>Register now and save 30% with the code STR11RAD</strong></a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://radar.oreilly.com/2010/12/strata-week-shop-til-you-drop.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
