"natural language processing" entries

Four short links: 29 October 2014

Tweet Parsing, Focus and Money, Challenging Open Data Beliefs, and Exploring ISP Data

  1. TweetNLP — CMU open source natural language parsing tools for making sense of Tweets.
  2. Interview with Google X Life Science’s Head (Medium) — I will have been here two years this March. In nineteen months we have been able to hire more than a hundred scientists to work on this. We’ve been able to build customized labs and get the equipment to make nanoparticles and decorate them and functionalize them. We’ve been able to strike up collaborations with MIT and Stanford and Duke. We’ve been able to initiate protocols and partnerships with companies like Novartis. We’ve been able to initiate trials like the baseline trial. This would be a good decade somewhere else. The power of focus and money.
  3. Schooloscope Open Data Post-Mortem — The case of Schooloscope and the wider question of public access to school data challenges the belief that sunlight is the best disinfectant, that government transparency would always lead to better government, better results. It challenges the sentiments that see data as value-neutral and its representation as devoid of politics. In fact, access to school data exposes a sharp contrast between the private interest of the family (best education for my child) and the public interest of the government (best education for all citizens).
  4. M-Lab Observatory — explorable data on the data experience (RTT, upload speed, etc) across different ISPs in different geographies over time.

Four short links: 30 July 2014

Offline First, Winograd Schemata, Jailbreaking Nest for Privacy, and Decentralised Web Cache

  1. Offline First is the New Mobile First — Luke Wroblewski’s notes from John Allsopp’s talk at “Breaking Development” in Nashville. Offline technologies don’t just give us sites that work offline; they improve performance and security by minimizing the need for cookies, HTTP, and file uploads. They also open up new possibilities for better user experiences.
  2. Winograd Schemas as Alternative to Turing Test (IEEE) — specially constructed sentences that are surface ambiguous and require deeper knowledge of the world to disambiguate, e.g. “Jim comforted Kevin because he was so upset. Who was upset?”. Our WS [Winograd schemas] challenge does not allow a subject to hide behind a smokescreen of verbal tricks, playfulness, or canned responses. Assuming a subject is willing to take a WS test at all, much will be learned quite unambiguously about the subject in a few minutes. (that last from the paper on the subject)
  3. Reclaiming Your Nest (Forbes) — Like so many connected devices, Nest devices regularly report back to the Nest mothership with usage data. Over a month-long period, the researchers’ device sent 32 MB worth of information to Nest, including temperature data, at-rest settings, and self-entered information about the home, such as how big it is and the year it was built. “The Nest doesn’t give us an option to turn that off or on. They say they’re not going to use that data or share it with Google, but why don’t they give the option to turn it off?” says Jin. Jailbreak your Nest (technique to be discussed at Black Hat), and install less chatty software. Loose Lips Sink Thermostats.
  4. SyncNet — decentralised browser: don’t just pull pages from the source, but also fetch from distributed cache (implemented with BitTorrent Sync).

Google I/O 2013: Android Studio, Google Play Music: All Access, and New Advances in Search

My day one experience

While there was no skydiving this year to show off Google’s new wearable, Glass, plenty of attendees were wearing it proudly, me included. Hardware, however, didn’t take center stage this year. The focus was on new tools and upgrades to existing products and platforms.

Android developers were thrilled to see new APIs and tools. The biggest cheers, at least in my section, were for Android Studio, built on IntelliJ, which from what I can tell is way better than Eclipse but notably not open source. The Developer Console got a substantial update with integrated translation services, user metrics, and revenue graphs, but what really made a big splash was the support for beta testing and staged rollouts. These, along with new location and gaming APIs, rounded out the new offerings for the Android development crowd.

Unstructured data is worth the effort when you've got the right tools

Alyona Medelyan and Anna Divoli on the opportunities in chaotic data.

Alyona Medelyan and Anna Divoli are inventing tools to help companies contend with vast quantities of fuzzy data. They discuss their work and what lies ahead for big data in this interview.

"We need tools that can help people have their ideas faster"

Aditi Muralidharan on improving discovery and building intuition into search.

Ph.D. student Aditi Muralidharan aims to make life easier for researchers and scientists with WordSeer, a text analysis tool that examines and visualizes language use patterns.