ENTRIES TAGGED "algorithm"

Strata Week: Add structured data, lose local flavor?

Wikidata's structure vs. diverse knowledge, and a look at the many factors behind Netflix's recommendations.

A critic says Wikidata could undermine Wikipedia's localized information. Also, Netflix explains why its recommendation engine is much more complicated than most people realize.

AI will eventually drive healthcare, but not anytime soon

A merging of artificial intelligence and healthcare is tougher than many realize.

People will eventually get better care from artificial intelligence, but for now, we should keep the algorithms focused on the data that we know is good and keep the doctors focused on the patients.

Strata Week: Unfortunately for some, Uber's dynamic pricing worked

Dynamic pricing angers some Uber users, Hadoop hits 1.0, a possible setback for open-access research.

Uber's dynamic pricing worked as intended on New Year's Eve, but not everyone is happy about that. Elsewhere, Hadoop reaches the 1.0 milestone and proposed legislation seeks to repeal an open-access research policy.

Four short links: 5 December 2011

Spatial Search, Exposing Your Phone's Perfidy, School Unconference, and Wikipedia Viz

  1. VP Trees — a data structure for fast spatial searching. A form of nearest neighbour, useful for melodies (PDF) and image retrieval (PDF) and poetry. (via Reddit)
  2. iYou — iTunes plugin to show you all the stuff your phone collects about you.
  3. Bar Camps in Primary Schools — NZ teacher deploys bar camps among students. Great things happen.
  4. Realtime Wikipedia Edits — fascinating and hypnotic and inspirational and appalling and irrelevant all at once.
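
For readers curious how the VP trees in item 1 work, here is a minimal sketch of a vantage-point tree with nearest-neighbour search. It assumes only that the supplied distance function is a true metric (it obeys the triangle inequality); the class name, the random choice of vantage point, and the tuple-based node layout are illustrative choices, not taken from the linked article.

```python
import math
import random


class VPTree:
    """Vantage-point tree: each node splits points by distance to one vantage point."""

    def __init__(self, points, distance):
        self.distance = distance
        self.root = self._build(list(points))

    def _build(self, points):
        if not points:
            return None
        # Pick a vantage point at random and remove it from the working list.
        vantage = points.pop(random.randrange(len(points)))
        if not points:
            return (vantage, 0.0, None, None)
        dists = [self.distance(vantage, p) for p in points]
        radius = sorted(dists)[len(dists) // 2]  # median distance
        inside = [p for p, d in zip(points, dists) if d < radius]
        outside = [p for p, d in zip(points, dists) if d >= radius]
        return (vantage, radius, self._build(inside), self._build(outside))

    def nearest(self, query):
        """Return (closest point, its distance to query)."""
        best = [None, float("inf")]

        def search(node):
            if node is None:
                return
            vantage, radius, inside, outside = node
            d = self.distance(query, vantage)
            if d < best[1]:
                best[0], best[1] = vantage, d
            if d < radius:
                search(inside)
                # Cross the boundary only if the best-so-far ball reaches it.
                if d + best[1] >= radius:
                    search(outside)
            else:
                search(outside)
                if d - best[1] < radius:
                    search(inside)

        search(self.root)
        return best[0], best[1]


# Usage: nearest neighbour among random 2-D points under Euclidean distance.
points = [(random.random(), random.random()) for _ in range(100)]
tree = VPTree(points, math.dist)
closest, dist_to_closest = tree.nearest((0.5, 0.5))
```

Because the tree only ever calls the distance function, the same structure can index anything with a metric, which is what makes it a fit for melodies or images as well as coordinates.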

Four short links: 18 November 2011

Quantified Learner, Text Extraction, Backup Flickr, and Multitouch UI Awesomeness

  1. Learning With Quantified Self — this CS grad student broke Jeopardy records using an app he built himself to quantify and improve his ability to answer Jeopardy questions in different categories. This is an impressive short talk and well worth watching.
  2. Evaluating Text Extraction Algorithms: The gold standard of both datasets was produced by human annotators. 14 different algorithms were evaluated in terms of precision, recall and F1 score. The results show that the best open source solution is the boilerpipe library. (via Hacker News)
  3. Parallel Flickr — tool for backing up your Flickr account. (Compare to one day of Flickr photos printed out)
  4. Quneo Multitouch Open Source MIDI and USB Pad (Kickstarter) — interesting to see companies using Kickstarter to seed interest in a product. This one looks a doozie: pads, sliders, rotary sensors, with LEDs underneath and open source drivers and SDK. Looks almost sophisticated enough to drive emacs :-)
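
As a refresher on the metrics named in item 2, here is a minimal sketch of how precision, recall and F1 could be computed for one document by comparing extracted text against a human-annotated gold standard. Comparing sets of tokens is an assumption made for illustration, as is the function name; the linked evaluation may score matches differently.

```python
def precision_recall_f1(extracted_tokens, gold_tokens):
    """Score extracted tokens against gold-standard tokens (set-based, illustrative)."""
    extracted = set(extracted_tokens)
    gold = set(gold_tokens)
    true_positives = len(extracted & gold)

    # Precision: how much of what was extracted is correct.
    precision = true_positives / len(extracted) if extracted else 0.0
    # Recall: how much of the gold standard was recovered.
    recall = true_positives / len(gold) if gold else 0.0

    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Usage: two of four extracted tokens are correct, two of three gold tokens found.
p, r, f1 = precision_recall_f1(["a", "b", "c", "d"], ["a", "b", "e"])
```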

Four short links: 14 October 2011

Relativity in Short Words, Set Math, Design Inspiration, and Internet of Things

  1. Theory of Relativity in Words of Four Letters or Less — this does just what it says, and well too. I like it, as you may too. At the end, you may even know more than you do now.
  2. Efficient Set Reconciliation Without Prior Context (PDF) — paper on using Bloom filters to do set union (deduplication) efficiently. Useful in distributed key-value stores and other big data tools.
  3. Mental Notes — each card has an insight from psychology research that’s useful with web design. Shuffle the deck, peel off a card, get ideas for improving your site. (via Tom Stafford)
  4. The Internet of Things To Come (Mike Kuniavsky) — Mike lays out the trends and technologies that will lead to an explosion in Internet of Things products. E.g., This abstraction of knowledge into silicon means that rather than starting from basic principles of electronics, designers can focus on what they’re trying to create, rather than which capacitor to use or how to tell the signal from the noise. He makes it clear that, right now, we have the rich petri dish in which great networked objects can be cultured.
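
To make the Bloom filter idea in item 2 concrete, here is a minimal sketch of deduplicating a stream with a plain Bloom filter. This is not the paper's algorithm, which may use a more elaborate structure; the class, the sizes, and the SHA-256-based hashing are all illustrative assumptions. A Bloom filter never misses an item it has seen, but it can report a false positive, so this approach may occasionally drop an item that was actually new.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter: compact set membership with possible false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, item):
        # Derive several bit positions by salting one hash function with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item):
        return all(self.bits[pos] for pos in self._positions(item))


def deduplicate(items, bloom=None):
    """Keep the first occurrence of each item the filter has not (apparently) seen."""
    if bloom is None:
        bloom = BloomFilter()
    unique = []
    for item in items:
        if item not in bloom:
            bloom.add(item)
            unique.append(item)
    return unique


# Usage: repeated keys are filtered out without storing the keys themselves.
keys = deduplicate(["a", "b", "a", "c", "b"])
```

The appeal for distributed stores is that two nodes can exchange the small bit array rather than their full key sets.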

Four short links: 18 July 2011

Organisational Warfare, RTFM, Timezone Shapefile, Microsoft Adventure

  1. Organisational Warfare (Simon Wardley) — notes on the commoditisation of software, with interesting analyses of the positions of some large players. On closer inspection, Salesforce seems to be doing more than just commoditisation with an ILC pattern, as can be clearly seen from the Radian6 acquisition. They also seem to be operating a tower and moat strategy, i.e. creating a tower of revenue (the service) around which is built a moat devoid of differential value with high barriers to entry. When their competitors finally wake up and realise that the future world of CRM is in this service space, they’ll discover a new player dominating this space who has not only removed many of the opportunities to differentiate (e.g. social CRM, mobile CRM) but built a large ecosystem that creates high rates of new innovation. This should be a fairly fatal combination.
  2. Learning to Win by Reading Manuals in a Monte-Carlo Framework (MIT) — starting with no prior knowledge of the game or its UI, the system learns how to play and to win by experimenting, and from parsed manual text. They used FreeCiv, and assessed the influence of parsing the manual shallowly and deeply. Trust MIT to turn RTFM into a paper. For a human-readable explanation, see the press release.
  3. A Shapefile of the TZ Timezones of the World — I have nothing but sympathy for the poor gentleman who compiled this. Political boundaries are notoriously arbitrary, and timezones are even worse because they don’t need a war to change. (via Matt Biddulph)
  4. Microsoft Adventure — 1979 Microsoft game for the TRS-80 has fascinating threads into the past and into what would become Microsoft’s future.

Four short links: 28 June 2011

Mediasaurus Dix, Mobile Numbers, Machine Learning, and Software Patents

  1. Networks Blocking Google TV — the networks are carrying over their old distribution models: someone aggregates eyeballs and pays them for access. In their world view, Google TV is just another cable company. They’re doubling down on this wholesale model, pulling out of Hulu and generally avoiding dealing with the people who ultimately watch their shows except through ad-filled shows on their corporate sites. (via Gina Trapani)
  2. Mobile Market Snippets — lots of numbers collected by Luke Wroblewski. After the Verizon iPhone launched in the U.S., Android suffered its first quarterly decline. Apple’s share of the U.S. smartphone market gained 12.3% to 29.5% in the March quarter while Android’s share in the U.S. fell from 52.4% to 49.5% — its first sequential loss in any region of the world since early 2009. The post has lots more like that.
  3. Unsupervised Feature Learning and Deep Learning Tutorial: This tutorial will teach you the main ideas of Unsupervised Feature Learning and Deep Learning. By working through it, you will also get to implement several feature learning/deep learning algorithms, get to see them work for yourself, and learn how to apply/adapt these ideas to new problems.
  4. A Generation of Software Patents: This report examines changes in the patenting behavior of the software industry since the 1990s. It finds that most software firms still do not patent, most software patents are obtained by a few large firms in the software industry or in other industries, and the risk of litigation from software patents continues to increase dramatically. Given these findings, it is hard to conclude that software patents have provided a net social benefit in the software industry.

Four short links: 22 June 2011

DOM Snitch, Hadoop in Scala, Pregel in Hadoop in Scala, Reflections on the Company

  1. DOM Snitch: an experimental Chrome extension that enables developers and testers to identify insecure practices commonly found in client-side code. See also the introductory post. (via Hacker News)
  2. Spark — Hadoop-alike in Scala. Spark was initially developed for two applications where keeping data in memory helps: iterative algorithms, which are common in machine learning, and interactive data mining. In both cases, Spark can outperform Hadoop by 30x. However, you can use Spark’s convenient API for general data processing too. (via Hilary Mason)
  3. Bagel: an implementation of the Pregel graph processing framework on Spark. (via Olivier Grisel)
  4. Week 315 (Matt Webb) — read this entire post. It will make you smarter. The company’s decisions aren’t actually the shareholders’ decisions. A company has a culture which is not the simple sum of the opinions of the people in it. A CEO can never be said to perform an action in the way that a human body can be said to perform an action, like picking an apple. A company is a weird, complex thing, and rather than attempt (uselessly) to reduce it to people within it, it makes more sense – to me – to approach it as an alien being and attempt to understand its biology and momentums only with reference to itself. Having done that, we can then use metaphors to attempt to explain its behaviour: we can say that it follows profit, or it takes an innovative step, or that it is middle-aged, or that it treats the environment badly, or that it takes risks. None of these statements is literally true, but they can be useful to have in mind when attempting to negotiate with these bizarre, massive creatures. If anyone wonders why I link heavily to BERG’s work, it’s because they have some incredibly thoughtful and creative people who are focused and productive, and it’s Webb’s laser-like genius that makes it possible. They’re doing a lot of subtle new things and it’s a delight and privilege to watch them grow and reflect.

Algorithms are the new medical tests

How data and algorithms help doctors make use of real-time data.

Predictive Medical Technologies says its new system can use real-time, intensive care unit monitoring data to predict cardiac arrest and other events up to 24 hours ahead of time. CEO Bryan Hughes discusses the system and the application of diagnostic data in this interview.
