Strata Week: Genome research kicks up a lot of data

Where to store all that genome data? Also, clarifying the work of digital humanities scholars.

Here are a few of the data stories that caught my attention this week.

Genomics data and the cloud
But as Harris observes, the promise of quick and cheap genomics is leading to other problems, particularly as the data reaches a heady scale. A fully sequenced human genome is about 100GB of raw data. But citing DNAnexus founder Andreas Sundquist, Harris says that:
That makes the promise of a $1,000 genome sequencing service challenging when it comes to storing and processing petabytes of data. Harris posits that cloud computing will come to the rescue here, providing the necessary infrastructure to handle all that data.

Strata 2012: The 2012 Strata Conference, being held Feb. 28-March 1 in Santa Clara, Calif., will offer three full days of hands-on data training and information-rich sessions. Strata brings together the people, tools, and technologies you need to make data work. Save 20% on registration with the code RADAR20.

Stanley Fish versus the digital humanities

Literary critic and New York Times opinionator Stanley Fish has been on a bit of a rampage in recent weeks, taking on the growing field of the "digital humanities." Prior to the annual Modern Language Association meeting, Fish cautioned that alongside the traditional panels and papers on Ezra Pound and William Shakespeare and the like, there was going to be a flood of sessions devoted to:
That "everything" was narrowed down substantially in Fish's editorial this week, in which he blasted the digital humanities for what he sees as its fixation "with matters of statistical frequency and pattern." In other words: data and computational analysis.

According to Fish, the problem with digital humanities is that this new scholarship relies heavily on the machine, and not the literary critic, for interpretation. Fish contends that digital humanities scholars are all teams of statisticians and positivists, busily digitizing texts so they can data-mine them and systematically and programmatically uncover something of interest, something worthy of interpretation.

University of Illinois, Urbana-Champaign English professor Ted Underwood argues that Fish not only mischaracterizes what digital humanities scholars do, but also misrepresents how his own interpretive tradition works:
One of the most interesting responses to Fish's recent rants about the humanities' digital turn comes from University of North Carolina English professor Daniel Anderson, who demonstrates in the following video a far fuller picture of what "digital" "data" (its creation and interpretation) looks like:

Hadoop World merges with O'Reilly's Strata New York conference

Two big data events announced this week that they will merge: Hadoop World will now be part of the Strata Conference in New York this fall. [Disclosure: The Strata events are run by O'Reilly Media.]

Cloudera first started Hadoop World back in 2009, and as Hadoop itself has seen increasing adoption, Hadoop World, too, has become more popular. Strata is a newer event: its first conference was held in Santa Clara, Calif., in February 2011, and it expanded to New York in September 2011. With the merger, Hadoop World will be a featured program at Strata New York 2012 (Oct. 23-25).

In other Hadoop-related news this week, Strata chair Edd Dumbill took a close look at Microsoft's Hadoop strategy. Although it might be surprising that Microsoft has opted to adopt an open source technology as the core of its big data plans, Dumbill argues that:
Also, Cloudera data scientist Josh Willis takes a closer look at one aspect of that ecosystem: the work of scientists whose research falls outside of statistics and machine learning. His blog post specifically addresses one use case for Hadoop (seismology, for which there is now Seismic Hadoop), but the post also provides a broad look at what constitutes the practice of data science.

Got data news? Feel free to email me.

Photo: Bootstrap DNA by Charles Jencks, 2003 by mira66, on Flickr
Comments: 1
AP [26 January 2012 09:58 AM]
From the Life Technologies announcement: "The Ion Proton™ Sequencer and Ion Reporter analysis software are designed to analyze a single genome in one day on a stand-alone server." No big data for them.