"NLP" entries

Four short links: 17 August 2010

Stemming Demo, Mapping Service, Value of Data, and The Magic of the Valley

  1. Demo of Stemming Algorithms — type in text and see what it looks like when stemmed with different algorithms provided by NLTK. (via zelandiya on Twitter)
  2. Crowdmap — hosted Ushahidi. (via dvansickle on Twitter)
  3. Opinions vs Data — talks about the usability of a new Gmail UI element, but notable for this quote from Jakob Nielsen: “In my two examples, the probability of making the right design decision was vastly improved when given the tiniest amount of empirical data.” (via mcannonbrookes on Twitter)
  4. The Next Silicon Valley — long and detailed list of the many forces contributing to Silicon Valley’s success as a tech hub, arguing that the valley’s position is path-dependent and cannot simply be grown ab initio in some aspiring nation’s co-prosperity zone of policy whim. (via imran and timoreilly on Twitter)
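
For readers who want to try the stemming comparison from item 1 locally, here is a minimal sketch using NLTK's stemmer classes. It is not the demo's own code: the helper name `compare_stemmers`, the choice of the Porter, Lancaster, and Snowball stemmers, and the plain whitespace tokenisation are all assumptions made for illustration.

```python
from nltk.stem import LancasterStemmer, PorterStemmer, SnowballStemmer


def compare_stemmers(text):
    """Return {stemmer name: list of stems}, one stem per whitespace-separated word.

    Hypothetical helper for illustration; the linked demo may use a different
    set of stemmers and a different tokeniser.
    """
    stemmers = {
        "porter": PorterStemmer(),
        "lancaster": LancasterStemmer(),
        "snowball": SnowballStemmer("english"),
    }
    # Naive tokenisation: lowercase and split on whitespace (an assumption).
    words = text.lower().split()
    return {name: [s.stem(w) for w in words] for name, s in stemmers.items()}


if __name__ == "__main__":
    # Print one line per stemmer so the outputs can be compared side by side.
    for name, stems in compare_stemmers("running runs easily connected").items():
        print(f"{name:10s} {' '.join(stems)}")
```

Running the script prints one row per stemmer, which makes it easy to see where the algorithms disagree on the same input.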

Four short links: 10 August 2010

Rational Smoking, Latency Poor, NLP Cites, Security Podcast

  1. Smoking and Ill Health: Does Lay Epidemiology Explain the Failure of Smoking Cessation Programs Among Deprived Populations? “Here we pose the question of whether the poorer life chances of those who continue to smoke in effect constitute a rational disincentive to their avoidance or cessation of smoking.” (via bengoldacre on Twitter)
  2. Scaling the New Bar for Latency in Financial Networks: “Since the first trade to the market gets the best price, the delivery of a buy or sell order must be as fast as possible. Just a little more than a year ago, firms were concentrating on removing milliseconds from their network; today, a mere 250 nanoseconds make a difference.” (via economicsnz on Twitter)
  3. Cataloging Bibliographic Data with Natural Language and RDF (OKFN) — In the grand tradition of W3C IRC bots, I’ve started some speculative work on a robot that tries to understand natural language descriptions of works and their authors and generates RDF. It is written in Python and uses ORDF, the NLTK and FuXi.
  4. Eurotrash Security — European infosec podcast. Latest episode features Ivan Ristic on SSL. (via ivanristic on Twitter)