- An Illustrated Guide to Cryptographic Hashes — exactly what it says: learn how hashing works and how you’d use it for passwords, digital signatures, etc.
- The Age of Fanfiction — We live in a time where copyright means very little to younger people, and it’s not just because they want free movies or free music. More than that, they want to be able to play with the amazing toys that they’ve been given by filmmakers and comic book writers and TV creators, and they want to do so without the constraints that copyright creates. Eloquent and thoughtful piece on what this means for Hollywood and how “the Age of Fanfiction” is reflected in what Hollywood’s making. (via Sacha Judd)
- How Khan Academy is Using Machine Learning to Assess Student Mastery — it is bloody hard to know when a student has mastered a subject, both for real live teachers and for roboteachers like Khan Academy. This is a detailed discussion of a change in assessment within Khan Academy. If we define proficiency as your chance of getting the next problem correct being above a certain threshold, then the streak becomes a poor binary classifier. Experiments conducted on our data showed a significant difference between students who take, say, 30 problems to get a streak vs. 10 problems right off the bat — the former group was much more likely to miss the next problem after a break than the latter.
- In Which I Declare Four Things My Probability Class is Not About — a reminder of the assumptions we make when we use numerical analysis to understand a problem.
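For readers who want to see the password use of hashing from the first link in concrete form, here is a minimal Python sketch. It is not taken from the linked guide: the function names, the 16-byte salt, and the iteration count are illustrative assumptions, and it uses only the standard library's `hashlib`, `hmac`, and `os` modules.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple


def digest_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a 64-character hex string."""
    return hashlib.sha256(data).hexdigest()


def hash_password(
    password: str,
    salt: Optional[bytes] = None,
    iterations: int = 100_000,  # assumed work factor, tune for your hardware
) -> Tuple[bytes, bytes]:
    """Derive a slow, salted hash of a password with PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per password (assumed size)
    derived = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return salt, derived


def verify_password(
    password: str, salt: bytes, expected: bytes, iterations: int = 100_000
) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    _, derived = hash_password(password, salt, iterations)
    return hmac.compare_digest(derived, expected)


if __name__ == "__main__":
    # A one-character change in the input gives an unrelated digest.
    print(digest_hex(b"hello"))
    print(digest_hex(b"hellp"))

    salt, stored = hash_password("correct horse")
    print(verify_password("correct horse", salt, stored))
    print(verify_password("wrong guess", salt, stored))
```

A plain digest such as SHA-256 is fast by design, so the sketch stores passwords with a salted, iterated derivation instead; only the salt and the derived bytes would be kept, never the password itself.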
ENTRIES TAGGED "stats"
The work of data journalists and a comparison of four data markets.
This week's data news includes a look at the work of various data journalists, Edd Dumbill surveys four data marketplaces, and the MIT Sloan Sports Analytics Conference experiences impressive growth.
Cryptography Illustrated, Hollywood Futures, Machine Learning Mastery, and Analytics Assumptions
Ubicomp Project, Data Volumes, Yahoo! Cocktails, and Fighting Cybercrime
- Twine (Kickstarter) — modular sensors with connectivity, programmable in If This Then That style. (via TechCrunch)
- Small Sample Sizes Lead to High Margins of Error — a reminder that all the stats in the world won’t help you when you don’t have enough data to meaningfully analyse.
- UK Govt To Help Businesses Fight Cybercrime (Guardian) — I view this as a good thing, even though the conspiracy nut in me says that it’s a step along the path that ends with the spy agency committing cybercrime to assist businesses.
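The point about small samples in the list above can be made concrete with a short calculation. This is a sketch under stated assumptions rather than anything from the linked article: it uses the standard normal-approximation margin of error for a sample proportion, z * sqrt(p(1 - p) / n), with z = 1.96 for roughly 95% confidence, and the function name is hypothetical.

```python
import math


def margin_of_error(p_hat: float, n: int, z: float = 1.96) -> float:
    """Normal-approximation margin of error for a sample proportion.

    p_hat: observed proportion, between 0 and 1
    n:     sample size
    z:     critical value (1.96 is roughly 95% confidence)
    """
    if n <= 0:
        raise ValueError("sample size must be positive")
    if not 0.0 <= p_hat <= 1.0:
        raise ValueError("proportion must be between 0 and 1")
    return z * math.sqrt(p_hat * (1.0 - p_hat) / n)


if __name__ == "__main__":
    # Same observed proportion, increasingly large samples.
    for n in (25, 100, 400, 10_000):
        moe = margin_of_error(0.5, n)
        print(f"n={n:>6}: +/- {moe * 100:.1f} percentage points")
```

Because the margin shrinks with the square root of the sample size, quadrupling the sample only halves the error, which is why a handful of data points rarely supports a confident conclusion. The approximation itself becomes unreliable for very small samples or proportions near 0 or 1.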
CPAN's Sweet 0x10, Social Reading, Questioning Polls, and 3D Manufacturing
- CPAN Turns 0x10 — sixteenth anniversary of the creation of the Comprehensive Perl Archive Network. Now holds 480k objects.
- Subtext — social bookreading by adding chat, links, etc. to a book. I haven’t tried the implementation yet but I’ve wanted this for years. (Just haven’t wanted to jump into the cesspool of rights negotiations enough to actually build it :-) (via David Eagleman)
- Questions to Ask about Election Polls — information to help you critically consume data analysis. (via Rachel Cunliffe)
- Technologies, Potential, and Implications of Additive Manufacturing (PDF) — AM is a group of emerging technologies that create objects from the bottom-up by adding material one cross-sectional layer at a time. [...] Ultimately, AM has the potential to be as disruptive as the personal computer and the internet. The digitization of physical artifacts allows for global sharing and distribution of designed solutions. It enables crowd-sourced design (and individual fabrication) of physical hardware. It lowers the barriers to manufacturing, and allows everyone to become an entrepreneur. (via Bruce Sterling)
Waning Interest, Infrastructure Changes, eBook Stats, and Retro Chic Peripherals
- Comparing Link Attention (Bitly) — Twitter, Facebook, and direct (email/IM/etc) have remarkably similar patterns of decay of interest. (via Hilary Mason)
- Three Ages of Google — from batch, to scaling through datacenters, and finally now to techniques for real-time scaling. Of interest to everyone interested in low-latency high-throughput transactions. Datacenters have the diameter of a microsecond, yet we are still using entire stacks designed for WANs. Real-time requires low and bounded latencies and our stacks can’t provide low latency at scale. We need to fix this problem and towards this end Luiz sets out a research agenda, targeting problems that need to be solved. (via Tim O’Reilly)
- eReaders and eBooks (Luke Wroblewski) — many eye-opening facts. In 2010 Amazon sold 115 Kindle books for every 100 paperback books. 65% of eReader owners use them in bed; in fact, 37% of device usage is in bed.
- VT220 on a Mac — dead sexy look. Impressive how many adapters you need to be able to hook a dingy old serial cable up to your shiny new computer.
- Why Restaurant Web Sites Are So Bad — The rest of the Web long ago did away with auto-playing music, Flash buttons and menus, and elaborate intro pages, but restaurant sites seem stuck in 1999.
- North Korean Government Partly Funded by Gold Farming (Gamasutra) — alleges a special group of hackers built automation software for MMOs and sent part of their profits back home.
- Pleasanton Protects Bicyclists with Microwave (Mercury News) — no, not by pre-emptive cooking. The device monitors the intersection and can differentiate between vehicles and bicyclists crossing the road and either extends or triggers the light if a cyclist is detected.
Vendors jockey for Hadoop positioning, Facebook visualizes PHP modules, Shaq's stats
Competition among Hadoop vendors heats up, Facebook visualizes its PHP code modules, and a Many Eyes tool visualizes the stats from Shaquille O'Neal's basketball career.