Strata Week: Simplifying MapReduce through Java

MapReduce gets easier, a new search engine for data, and now you can monitor the universe's forces on your phone.

Here are a few of the data stories that caught my attention this week:

Crunch looks to make MapReduce easier

Despite the growing popularity of MapReduce and other data technologies, there's still a steep learning curve associated with these tools. Some have even wondered if they're worth introducing to programming students. All of this makes the introduction of Crunch particularly good news. Crunch is a new Java library from Cloudera that's aimed at simplifying the writing, testing, and running of MapReduce pipelines. In other words, developers won't need to write a lot of custom code or libraries, which, as Cloudera data scientist Josh Willis points out, "is a serious drain on developer productivity." He adds that:
The Crunch library has been released under the Apache license, and the code can be downloaded here.

Web 2.0 Summit, being held October 17-19 in San Francisco, will examine "The Data Frame," focusing on the impact of data in today's networked economy. Save $300 on registration with the code RADAR

Querying the web with Datafiniti
Datafiniti enables its users to enter a search query (or make an API call) against the web. Or, that's the goal at least. As it stands, Datafiniti lets users query data about location, products, news, real estate, and social identity. Even so, that's a substantial number of datasets, built from information that's publicly available on the web. Although Datafiniti requires you to enter SQL parameters, it tries to make the process fairly easy, with a guide that pops up beneath the search box to help you phrase things properly. That interface is just one of the indications that Datafiniti is making a move to help democratize big data search.

The company grew out of a previous startup named 80Legs. As Shion Deysarker, founder of Datafiniti, told me, it was clear that the web-crawling services provided by 80Legs were really just being used to answer specific questions: What's the average listing price for a home in Houston? How many times has a brand name been mentioned on Twitter or Facebook over the last few months? And so on.

Deysarker frames Datafiniti in terms of data access, arguing that until now a few providers have controlled the data. The startup wants to help developers and companies overcome both the access and the expense issues associated with gathering, processing, curating, and accessing datasets. It plans to offer both subscription-based and unit-based pricing.

Keep tabs on the Large Hadron Collider from your smartphone
The ATLAS experiment describes itself as an effort to learn about "the basic forces that have shaped our Universe since the beginning of time and that will determine its fate. Among the possible unknowns are the origin of mass, extra dimensions of space, unification of fundamental forces, and evidence for dark matter candidates in the Universe."

The LHSee app provides detailed information about how CERN and the Large Hadron Collider work. It also offers a "Hunt the Higgs Boson" game as well as opportunities to watch 3-D collisions streamed live from CERN. The app is available for free through the Android Market.

Got data news? Feel free to email me.
Comments: 1
Antonio Piccolboni [13 October 2011 10:26 AM]
For a Crunch-equivalent for R, open source from Revolution Analytics, follow the link
https://github.com/RevolutionAnalytics/RHadoop/wiki/rmr