"statistics" entries

Four short links: 19 May 2011

Internet Access Rights, Statistical Peace, Vintage Jobs, and Errata Etymology

  1. Right to Access the Internet — a survey of different countries’ rights to access the Internet.
  2. Peace Through Statistics — three ex-Yugoslavian statisticians nominated for Nobel Peace Prize. In war-torn and impoverished countries, statistics provides a welcome arena in which science runs independent of ethnicity and religion. With so few resources, many countries are graduating few, if any, PhDs in statistical sciences. These statisticians collaboratively began a campaign to collect together the basics underlying statistics and statistics education, with the hope of increasing access to statistical ideas, knowledge and training around the world.
  3. Vintage Steve Jobs (YouTube) — he’s launching the “Think Different” campaign, but it’s a great reminder of what a powerful speaker he is and a look at how he thinks about marketing.
  4. Anatomy of a Fake Quotation (The Atlantic) — deconstructing how the words of a 24-year-old English teacher in Japan sped around the world, attributed to Martin Luther King.

Four short links: 21 March 2011

Javascript Master Class, Stats for Pythonistas, CAM Floor, and HTML Extraction

  1. Javascript Trie Performance Analysis (John Resig) — if you program in Javascript and you’re not up to John’s skill level (*cough*) then you should read this and follow along. It’s a ride-along in the brain of a master.
  2. Think Stats — an introduction to statistics for Python programmers. (via Edd Dumbill)
  3. Bolefloor — they build curvy wooden floors. Instead of straightening naturally curvy wood (which is wasteful), they use CV and CAD/CAM to figure the smallest cuts to slot strips of wood together. It’s gorgeous, green, and geeky. (via BoingBoing)
  4. Extracting Article Text from HTML Documents — everyone’s doing it, now you know how. It’s the theory behind the lovingly hand-crafted magic of readability. (via Hacker News)

Four short links: 10 January 2011

Online Collaboration, Reputable Twitterers, Old Computers, and Web Experiments

  1. Tools and Practices for Working Virtually — a detailed explanation of how the RedMonk team works virtually.
  2. Twitter Accounts for All Stack Overflow Users by Reputation (Brian Bondy) — superawesome list of clueful people.
  3. The Wonderful World of Early Computing — from bones to the ENIAC, some surprising and interesting historical computation devices. (via John D. Cook)
  4. Overlapping Experiment Infrastructure (PDF) — they can’t run just one test at a time, so they have infrastructure to comprehensively test all features against all features and in real time pull out statistical conclusions from the resulting data. (via Greg Linden)

Four short links: 31 December 2010

Statistics, Tech Writing, Shared Spaces, and Delicious Exodus

  1. The Joy of Stats — Hans Rosling’s BBC documentary on statistics, available to watch online.
  2. Best Tech Writing of 2010 — I need a mass “add these to Instapaper” button. (via Hacker News)
  3. Google Shared Spaces: Why We Made It (Pamela Fox) — came out of what people were trying to do with Google Wave.
  4. The Great Delicious Exodus — traffic graph as experienced by pinboard.

Strata Week: Running the numbers

IA Ventures success, MathJax display engine, statistical literacy, and making big data more human

IA Ventures raises a huge first-time fund; MathJax provides an open source mathematical display engine; Kevin Drum shares 10 statistics pitfalls; and Paul Bradshaw explains how to bring big data down to a human scale.

Strata Week: Statistically speaking

Trading platforms, truth in graphs, European financial stats, and Mandelbrot's passing.

In this edition of Strata Week: The London Stock Exchange moves from .NET to open source; learn how graphical scales can lie; the European Central Bank president calls for better financial statistics; and we bid farewell to the father of fractals.

Four short links: 21 September 2010

Logic-less Templates, Amazon Story, Visualizing Time Data, and Statistics Primers

  1. Mustache — templates without the if/then/loop control structures that mangle your separation of logic. (via the technology behind #newtwitter)
  2. The Visionary’s Lament (Eric Ries) — love the possibly apocryphal Amazon story about the invention of one-click.
  3. TimeFlow — helps you analyze temporal data. Timeline, Calendar, Bar Chart, Table, and List views. From the legendary team of Viegas and Wattenberg.
  4. Basic Statistical Literacy — the UK government has some good introductions to statistics. (via Flowing Data)

Four short links: 25 August 2010

Narrative and Structure, Teaching Science, Time-Series Statistics, and Who Benefits from Open Source

  1. Why Narrative and Structure are Important (Ed Yong) — Ed looks at how Atul Gawande’s piece on death and dying, which is 12,000 words long, is an easy and fascinating read despite the length.
  2. Understanding Science (Berkeley) — simple teaching materials to help students understand the process of science. (via BoingBoing comments)
  3. SAX: Symbolic Aggregate approXimation — SAX is the first symbolic representation for time series that allows for dimensionality reduction and indexing with a lower-bounding distance measure. In classic data mining tasks such as clustering, classification, index, etc., SAX is as good as well-known representations such as Discrete Wavelet Transform (DWT) and Discrete Fourier Transform (DFT), while requiring less storage space. In addition, the representation allows researchers to avail of the wealth of data structures and algorithms in bioinformatics or text mining, and also provides solutions to many challenges associated with current data mining tasks. One example is motif discovery, a problem which we recently defined for time series data. There is great potential for extending and applying the discrete representation on a wide class of data mining tasks. Source code has “non-commercial” license. (via rdamodharan on Delicious)
  4. Open Source OSCON (RedMonk) — The business of selling open source software, remember, is dwarfed by the business of using open source software to produce and sell other services. And yet historically, most of the focus on open source software has accrued to those who sold it. Today, attention and traction is shifting to those who are not in the business of selling software, but rather share their assets via a variety of open source mechanisms. (via Simon Phipps)
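
For the SAX item above, here is a rough Python sketch of the symbolic representation. It assumes the commonly described formulation: z-normalize the series, average it into a fixed number of segments (piecewise aggregate approximation), then map each segment mean to a letter using breakpoints that split a standard normal distribution into equally likely regions. The function name `sax`, its parameters, and the default alphabet are illustrative choices, not taken from the linked source code, and the lower-bounding distance measure is not shown.

```python
from statistics import NormalDist, mean, pstdev


def sax(series, n_segments, alphabet="abcd"):
    """Return a SAX-style word for a numeric series (illustrative sketch)."""
    n = len(series)
    if not 0 < n_segments <= n:
        raise ValueError("n_segments must be between 1 and len(series)")

    # Step 1: z-normalize. A constant series is mapped to all zeros.
    mu, sigma = mean(series), pstdev(series)
    z = [(x - mu) / sigma if sigma else 0.0 for x in series]

    # Step 2: piecewise aggregate approximation (mean of each chunk).
    paa = [
        mean(z[i * n // n_segments:(i + 1) * n // n_segments])
        for i in range(n_segments)
    ]

    # Step 3: breakpoints that cut N(0, 1) into equiprobable regions.
    k = len(alphabet)
    cuts = [NormalDist().inv_cdf(i / k) for i in range(1, k)]

    # Step 4: a segment's letter index is the number of breakpoints below it.
    return "".join(alphabet[sum(v > c for c in cuts)] for v in paa)


if __name__ == "__main__":
    print(sax([1, 2, 3, 4, 5, 6, 7, 8], 4))
```

A steadily rising series of eight points reduced to four segments yields one letter per segment, which is the dimensionality reduction the description refers to.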

Four short links: 17 June 2010

Statistical Jeopardy Wins, Mobile Taxonomy, Geodata Mystery, and Machine Learning Blog

  1. What is IBM’s Watson? (NY Times) — IBM joining the big data machine learning race, and hatching a Blue Gene system that can answer Jeopardy questions. Does good, not great, and is getting better.
  2. Google Lays Out its Mobile Strategy (InformationWeek) — notable to me for this: Rechis said that Google breaks down mobile users into three behavior groups: (A) “Repetitive now”, (B) “Bored now”, and (C) “Urgent now”. That is a useful way to look at it. (via Tim)
  3. BP GIS and the Mysteriously Vanishing Letter — intrigue in the geodata world. This post makes it sound as though cleanup data is going into a box behind BP’s firewall, and the folks who said “um, the government should be the depot, because it needs to know it has a guaranteed-untampered and guaranteed-able-to-access copy of this data” were fired. For more info, including on the data that is available, see the geowanking thread.
  4. Streamhacker — a blog talking about text mining and other good things, with nltk code you can run. (via heraldxchaos on Delicious)

Four short links: 24 May 2010

Google Docs APIs, Wikileaks Founder Profile, DNA Hacking, and Abusing the Numbers

  1. Appscale — open source implementation of Google App Engine’s APIs built on top of Amazon’s APIs, from UCSB. You can deploy on Amazon or on any Amazon API-compliant cloud such as Eucalyptus.
  2. Information Pioneers — the Chartered Institute for IT has a pile of video clips about famous IT pioneers (Lovelace, Turing, Lamarr, Berners-Lee, etc.).
  3. This Week in Law — podcast from Denise Howell, covering IT law and policy. E.g., this week’s episode covers “Google Books, Elena Kagen, owning virtual land, double-dipping game developers, Facebook tips, forced follow bug and fragile egos, embedding tweets, Star Trek Universe liability, and more.”
  4. <a href="