- Britain To Provide Free Access to Scientific Publications (Guardian) — the Finch report is being implemented! British universities now pay around £200m a year in subscription fees to journal publishers, but under the new scheme, authors will pay “article processing charges” (APCs) to have their papers peer reviewed, edited and made freely available online. The typical APC is around £2,000 per article.
- Social Media in an Emergency: A Best Practice Guide — from the Wellington City Council in New Zealand, who have been learning from Christchurch earthquakes and Tauranga’s oil spill.
- Trustworthy Online Controlled Experiments: Five Puzzling Outcomes Explained (PDF) — Microsoft Research dug into A/B tests done on Bing and revealed some subtle truths. The statistical theory of controlled experiments is well understood, but the devil is in the details and the difference between theory and practice is greater in practice than in theory [...] Generating numbers is easy; generating numbers you should trust is hard! (via Greg Linden)
- DNA Sequencing Costs (National Human Genome Research Institute) — Cost-per-megabase and cost-per-genome are dropping faster than Moore’s Law now that they’ve introduced “second-generation techniques” for sequencing, aka “high-throughput sequencing” or a parallelization of the process. (via JP Rangaswami)
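To make the A/B testing item above concrete, here is a minimal sketch of the textbook significance check behind a simple two-variant experiment. It is not the method from the Microsoft Research paper; the function name and the conversion counts are made up for illustration, and it assumes independent users and a pooled two-proportion z-test.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Pooled two-proportion z-test (hypothetical helper, not from the paper).

    conv_a, conv_b: number of conversions in control and treatment.
    n_a, n_b: number of users in control and treatment.
    Returns (z, two_sided_p_value).
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate under the null hypothesis of no difference.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Made-up numbers: 10% vs 15% conversion on 1,000 users each.
z, p = two_proportion_z_test(100, 1000, 150, 1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value only says the difference is unlikely under this simple model. As the paper's title suggests, deciding whether the numbers can be trusted takes more than running the test.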
ENTRIES TAGGED "analytics"
Inside Anonymous, Kanban Board, Extending Objective C, and Football Graphs
- How Anonymous Works (Wired) — Quinn Norton explains how the decentralized Anonymous operates, and how the transition to political activism happened. Required reading to understand post-state post-structure organisations, and to make sense of this chaotic unpredictable entity.
- Kanban For 1 — very nice progress board for tasks, for the lifehackers who want to apply agile software tools to the rest of their life.
- libextobj (GitHub) — library of extensions to Objective C to support patterns from other languages. (via Ian Kallen)
- Graph Theory to Understand Football (Tech Review) — players are nodes, passes build edges, and you can see strengths and strategies of teams in the resulting graphs.
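The football item above describes a graph where players are nodes and passes are edges. A minimal sketch of that idea, assuming a hypothetical pass log of (passer, receiver) pairs; the player labels and the "involvement" measure are illustrative and not taken from the article.

```python
from collections import Counter, defaultdict

# Hypothetical pass log: (passer, receiver) pairs, invented for illustration.
passes = [
    ("A", "B"), ("B", "C"), ("A", "B"), ("C", "A"),
    ("B", "D"), ("D", "B"), ("B", "C"),
]

def build_pass_graph(pass_log):
    """Return a weighted directed graph as {passer: Counter({receiver: count})}."""
    graph = defaultdict(Counter)
    for passer, receiver in pass_log:
        graph[passer][receiver] += 1
    return graph

def involvement(graph):
    """Passes made plus passes received per player (weighted degree)."""
    totals = Counter()
    for passer, receivers in graph.items():
        for receiver, count in receivers.items():
            totals[passer] += count
            totals[receiver] += count
    return totals

graph = build_pass_graph(passes)
totals = involvement(graph)
# Players ranked by how much play runs through them.
print(totals.most_common())
```

Weighted degree is only the simplest measure one could read off such a graph; it is used here because it needs nothing beyond the standard library.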
Predictive Policing, Public Sector Tech Benefits, Wireless Joystick on a Ring, and Recruiter Honeypot
- Predicting Crime Before It Occurs (SFGate) — The new program used by LAPD and police in the Northern California city of Santa Cruz is more timely and precise, proponents said. Built on the same model for predicting aftershocks following an earthquake, the software promises to show officers what might be coming based on simple, constantly calibrated data — location, time and type of crime. The software generates prediction boxes — as small as 500 square feet — on a patrol map. When officers have spare time, they are told to “go in the box.”
- Realising Benefits From Six Public Sector Technology Projects (PDF) — New Zealand report from the Auditor-General. Conclusion specifically calls out agile development, open source, and open data as technology tools that helped deliver success.
- Ringbow (Kickstarter) — a D-pad style joystick controller, built into a ring and designed for use with touchscreen games.
- The Recruiter Honeypot (Elaine Wherry) — Brilliant! Trying to ramp up Meebo’s staff, Elaine created a fake employee profile to see where recruiters hunted and to identify the best. Her lessons are great advice for anyone also trying to hire up fast in the Bay Area. Worth reading if only for the squicky stories of sleazy recruiters.
The future of desktops, ethics and big data, narrative vs spreadsheets.
This week on O'Reilly: Josh Marinacci predicted that 90% of computer users will rely on mobile, but 10% will still need desktops; the authors of "Ethics of Big Data" explored data's trickiest issues; and Narrative Science CTO Kris Hammond discussed narrative's role in data analytics.
Doug Cutting on Hadoop, Max Gadney on video data graphics, Jeremy Howard on big data and analytics.
Hadoop creator Doug Cutting discusses the similarities between Linux and the big data world, Max Gadney from After the Flood explains the benefits of video data graphics, and Kaggle's Jeremy Howard looks at the difference between big data and analytics.
Inside Personalized Advertising, Printing Presses Were Good For The Economy, Digital Access, and Ebooks in Libraries
- Web-Scale User Modeling for Targeting (Yahoo! Research, PDF) — research paper that shows how online advertisers build profiles of us and what matters (e.g., ads we buy from are more important than those we simply click on). Our recent surfing patterns are more relevant than historical ones, which is another indication that the value of data analytics increases the closer to real-time it happens. (via Greg Linden)
- Information Technology and Economic Change — research showing that cities which adopted the printing press had no prior growth advantage, but subsequently grew far faster than similar cities without printing presses. [...] The second factor behind the localisation of spillovers is intriguing given contemporary questions about the impact of information technology. The printing press made it cheaper to transmit ideas over distance, but it also fostered important face-to-face interactions. The printer’s workshop brought scholars, merchants, craftsmen, and mechanics together for the first time in a commercial environment, eroding a pre-existing “town and gown” divide.
- They Just Don’t Get It (Cameron Neylon) — curating access to a digital collection does not scale.
- Should Libraries Get Out of the Ebook Business? — provocative thought: the ebook industry is nascent, a small number of patrons have ereaders, the technical pain of DRM and incompatible formats makes for disproportionate support costs, and there are already plenty of worthy things libraries should be doing. I only wonder how quickly the dynamics change: a minority may have dedicated ereaders but a large number have smartphones and are reading on them already.
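The user-modeling item above notes that purchases count for more than clicks and that recent behaviour matters more than old behaviour. One common way to express both ideas is an exponentially decayed score. This is a sketch under stated assumptions, not the model from the Yahoo! paper: the event weights, the seven-day half-life, and the function name are all hypothetical.

```python
# Hypothetical event weights: a purchase counts for more than a click.
CLICK_WEIGHT = 1.0
PURCHASE_WEIGHT = 5.0

def decayed_score(events, now, half_life_days=7.0):
    """Sum event weights, halving each contribution every half_life_days.

    events: iterable of (day, weight) pairs, where day is a number of days
    on the same scale as `now`.
    """
    score = 0.0
    for day, weight in events:
        age = now - day
        score += weight * 0.5 ** (age / half_life_days)
    return score

# Made-up history for one user and one ad category.
history = [(0, PURCHASE_WEIGHT), (20, CLICK_WEIGHT), (27, CLICK_WEIGHT)]
print(decayed_score(history, now=28))
```

The half-life controls how quickly old behaviour is forgotten; a shorter one pushes the profile closer to real-time.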
The work of data journalists and a comparison of four data markets.
This week's data news includes a look at the work of various data journalists, Edd Dumbill's survey of four data marketplaces, and impressive growth at the MIT Sloan Sports Analytics Conference.
Stuff That Matters, Web Waste, Learning Analytics, and Thoughtful Quotes
- SoupHub — NZ project putting a computer with Internet access (and instruction and help) into a soup kitchen. I can’t take any credit for it, but I’m delighted beyond measure that the idea for this was hatched at Kiwi Foo Camp. I love that my peeps are doing stuff that matters. (See also the newspaper writeup)
- Bandwidth of Pages — view a 140-character tweet on the web and you’re loading 2MB of, well, let’s call it crap.
- On The Reductionism of Analytics in Education (Anne Zelenka) — Learning analytics, as practiced today, is reductionist to an extreme. We are reducing too many dimensions into too few. More than that, we are describing and analyzing only those things that we can describe and analyze, when what matters exists at a totally different level and complexity. We are missing emergent properties of educational and learning processes by focusing on the few things we can measure and by trying to automate what decisions and actions might be automated. A fantastic post, which coins the phrase “the math is not the territory”.
- Quotes Worth Spreading (Karl Fisch) — collection of thought-provoking quotes from recent TED talks. “Be generous by graciously accepting compliments. It’s a gift you give the complimenter” (John Bates) is something I’m particularly working on.
Analytics in Excel, HTTP Debugger, Analytics for Personalized Healthcare, and EFF To The Rescue
- Excel Cloud Data Analytics (Microsoft Research) — clever: a cloud analytics backend with Excel as the frontend. Almost every business and finance person I’ve known has been way more comfortable with Excel than any other tool. (via Dr Data)
- HTTP Client — Mac OS X app for inspecting and automating a lot of HTTP. cf. the lovely Charles proxy for debugging. (via Nelson Minar)
- The Creative Destruction of Medicine — using big data, gadgets, and sweet tech in general to personalize and improve healthcare. (via New York Times)
- EFF Wins Protection of Time Zone Database (EFF) — I posted about the silliness before (maintainers of the only comprehensive database of time zones were being threatened by astrologers). The EFF stepped in, beat back the buffoons, and now we’re back to being responsible when we screw up timezones for phone calls.