- Analyzing mbostock’s queue.js — beautiful walkthrough of a small library, showing the how and why of good coding.
- What Job Would You Hire a Textbook To Do? (Karl Fisch) — notes from a Discovery Education “Beyond the Textbook” event. The issues Karl highlights for textbooks (why digital, etc.) are there for all books as we create this new genre.
- Neutralizing Open Access (Glyn Moody) — the publishers appear to have captured the UK group implementing the UK’s open access policy. At every single step of the way, the RCUK policy has been weakened. From being the best and most progressive in the world, it’s now considerably weaker than policies already in action elsewhere in the world, and hardly represents an increment on their 2006 policy. What’s at stake? Opportunity to do science faster, to provide source access to research for the public, and to redirect back to research the millions of pounds spent on journal subscriptions.
- Turn the Raspberry Pi into a VPN Server (LinuxUser) — One possible scenario for wanting a cheap server that you can leave somewhere is if you have recently moved away from home and would like to be able to easily access all of the devices on the network at home, in a secure manner. This will enable you to send files directly to computers, diagnose problems and other useful things. You’ll also be leaving a powered USB hub connected to the Pi, so that you can tell someone to plug in their flash drive, hard drive etc and put files on it for them. This way, they can simply come and collect it later whenever the transfer has finished.
ENTRIES TAGGED "open source"
Master Coding, Rethinking Textbooks, Blocking Open Access, VPN from your Pi
Machine Learning Demos, iOS Debugging, Industrial Internet, and Deanonymity
- MLDemos — an open-source visualization tool for machine learning algorithms created to help studying and understanding how several algorithms function and how their parameters affect and modify the results in problems of classification, regression, clustering, dimensionality reduction, dynamical systems and reward maximization. (via Mark Alen)
- kiln (GitHub) — open source extensible on-device debugging framework for iOS apps.
- Industrial Internet — the O’Reilly report on the industrial Internet of things is out. Prasad suggests an illustration: for every car with a rain sensor today, there are more than 10 that don’t have one. Instead of an optical sensor that turns on windshield wipers when it sees water, imagine the human in the car as a sensor — probably somewhat more discerning than the optical sensor in knowing what wiper setting is appropriate. A car could broadcast its wiper setting, along with its location, to the cloud. “Now you’ve got what you might call a rain API — two machines talking, mediated by a human being,” says Prasad. It could alert other cars to the presence of rain, perhaps switching on headlights automatically or changing the assumptions that nearby cars make about road traction.
- Unique in the Crowd: The Privacy Bounds of Human Mobility (PDF, Nature) — We study fifteen months of human mobility data for one and a half million individuals and find that human mobility traces are highly unique. In fact, in a dataset where the location of an individual is specified hourly, and with a spatial resolution equal to that given by the carrier’s antennas, four spatio-temporal points are enough to uniquely identify 95% of the individuals. We coarsen the data spatially and temporally to find a formula for the uniqueness of human mobility traces given their resolution and the available outside information. This formula shows that the uniqueness of mobility traces decays approximately as the 1/10 power of their resolution. Hence, even coarse datasets provide little anonymity. These findings represent fundamental constraints to an individual’s privacy and have important implications for the design of frameworks and institutions dedicated to protect the privacy of individuals. As Edd observed, “You are a unique snowflake, after all.” (via Alasdair Allan)
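To get a feel for the "1/10 power" claim in the mobility paper above, here is a minimal sketch. It assumes a pure power law (uniqueness proportional to resolution raised to -1/10), which is one reading of the abstract's wording and not the paper's fitted formula; the function name and the coarsening factors are illustrative.

```python
def relative_uniqueness(coarsening_factor: float, exponent: float = 0.1) -> float:
    """Fraction of uniqueness retained after making the data `coarsening_factor`
    times coarser, under an ASSUMED pure power law with the given exponent."""
    if coarsening_factor <= 0:
        raise ValueError("coarsening_factor must be positive")
    return coarsening_factor ** (-exponent)


if __name__ == "__main__":
    for factor in (1, 10, 100, 1000):
        kept = relative_uniqueness(factor)
        print(f"{factor:>5}x coarser -> {kept:.2f} of original uniqueness")
```

Under that assumption, data a thousand times coarser still retains about half of its uniqueness, which is the intuition behind "even coarse datasets provide little anonymity".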
Titan Improved, Security Tweeps, Probabilistic Programming, and 3D-Printable Optics
- Titan 0.3 Out — graph database now has full-text, geo, and numeric-range index backends.
- Mozilla Security Community Do a Reddit AMA — if you wanted a list of sharp web security people to follow on Twitter, you could do a lot worse than this.
- Probabilistic Programming and Bayesian Methods for Hackers (GitHub) — An introduction to Bayesian methods + probabilistic programming in data analysis with a computation/understanding-first, mathematics-second point of view. All in pure Python. See also Why Probabilistic Programming Matters and Trends to Watch: Logic and Probabilistic Programming. (via Mike Loukides and Renee DiResta)
- Open Source 3D-Printable Optics Equipment (PLOSone) — This study demonstrates an open-source optical library, which significantly reduces the costs associated with much optical equipment, while also enabling relatively easily adapted customizable designs. The cost reductions in general are over 97%, with some components representing only 1% of the current commercial investment for optical products of similar function. The results of this study make it clear that this method of scientific hardware development enables a much broader audience to participate in optical experimentation both as research and teaching platforms than previous proprietary methods.
New regulations could mark the end of proprietary finance.
Social Science, YAKVS, Open Source Mail, and Tesla Coil and Quadrocopter Fun
- The Effect of Group Attachment and Social Position on Prosocial Behavior (PLoSone) — notable, in my mind, for “We conducted lab-in-the-field experiments involving 2,597 members of producer organizations in rural Uganda.” Cf. the recently reported “rich are more selfish than poor” findings, which (like a lot of behavioural economics research) come from studying Berkeley undergrads who weren’t smart enough to figure out what was being studied.
- elephant — an HTTP key/value store with full-text search and fast queries. Still a work in progress.
- geary (IndieGoGo) — a beautiful modern open-source email client. Found this at roughly the same time as elasticinbox, an open source, reliable, distributed, scalable email store. Is open source email action starting?
- The Faraday Copter (YouTube) — Tesla coil and quadrocopter madness. (via Jeff Jonas)
Patenting Preventing Placebos, Simulating Malaria, Pricing Experiments, and Mining Bitcoin
- Patent on Medical Trial Design to Reduce Placebo Effect — drug companies say these failures are happening not because their drugs are ineffective, but because placebos have recently become more effective in clinical trials. [...] The whole idea that placebo effect is getting in the way of producing meaningful results is repugnant, I think, to anyone with scientific training. What’s even more repugnant, however, is that Fava’s group didn’t stop with a mere paper in Psychotherapy and Psychosomatics. They went on to apply for, and obtain, U.S. patents on SPCD. (via Ben Goldacre)
- OpenMalaria (Google Code) — an open source C++ program for simulating malaria epidemiology and the impacts on that epidemiology of interventions against malaria. It is based on microsimulations of Plasmodium falciparum malaria in humans, originally developed for simulating malaria vaccines. (via Victoria Stodden)
- Pricing Experiments You Might Not Know But Can Learn From — compendium of ideas and experiments for pricing.
- Retrominer — mining Bitcoins on a NES. I’m delighted by the conceit, and noticing that Bitcoin is now sufficiently part of the zeitgeist as to feature in playful hacks.
- Digital Music Consumption on the Internet: Evidence from Clickstream Data (Scribd) — The goal of this paper is to analyze the behavior of digital music consumers on the Internet. Using clickstream data on a panel of more than 16,000 European consumers, we estimate the effects of illegal downloading and legal streaming on the legal purchases of digital music. Our results suggest that Internet users do not view illegal downloading as a substitute to legal digital music. Although positive and significant, our estimated elasticities are essentially zero: a 10% increase in clicks on illegal downloading websites leads to a 0.2% increase in clicks on legal purchases websites. Online music streaming services are found to have a somewhat larger (but still small) effect on the purchases of digital sound recordings, suggesting complementarities between these two modes of music consumption. According to our results, a 10% increase in clicks on legal streaming websites lead to up to a 0.7% increase in clicks on legal digital purchases websites. We find important cross country difference in these effects. A paper from the EU commission’s in-house science service. (via Don Christie)
- Six Degrees of Francis Bacon — data-driven research into “the early-modern social network”. (via Jonathan Gray)
- Internet Census 2012 — scanning the net via botnet. Appalling how many unsecured devices are directly connected to the net. Also appalling how underused the address space is.
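The "essentially zero" elasticities quoted from the digital music paper above are easy to check by hand. This small sketch only redoes the arithmetic on the two figures in the quote (0.2% and 0.7% per 10% increase in clicks); the function name is made up for illustration and nothing here comes from the paper's own model.

```python
def elasticity(pct_change_outcome: float, pct_change_driver: float) -> float:
    """Elasticity as the ratio of two percentage changes."""
    return pct_change_outcome / pct_change_driver


# Figures as quoted in the abstract: a 10% rise in clicks on the driver sites
# goes with a 0.2% (illegal downloading) or up to 0.7% (legal streaming)
# rise in clicks on legal purchase sites.
downloading = elasticity(0.2, 10.0)
streaming_upper = elasticity(0.7, 10.0)

print(f"illegal downloading: {downloading:.2f}")
print(f"legal streaming (upper bound): {streaming_upper:.2f}")
```

Both values sit well below 0.1: positive, as the authors say, but small enough that neither activity moves purchases much in either direction.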
Chicago CIO Brett Goldstein is experimenting with social coding for a different kind of civic engagement.
Visualizing City Data, Gigabits Unrealized, Use Open Source, and Bad IPs Cluster
- VizCities Dev Diary — step-by-step recount of how they brought London’s data to life, SimCity-style.
- Google Fibre Isn’t That Impressive — For [gigabit broadband] to become truly useful and necessary, we’ll need to see a long-term feedback loop of utility and acceptance. First, super-fast lines must allow us to do things that we can’t do with the pedestrian internet. This will prompt more people to demand gigabit lines, which will in turn invite developers to create more apps that require high speed, and so on. What I discovered in Kansas City is that this cycle has not yet begun. Or, as Ars Technica put it recently, “The rest of the internet is too slow for Google Fibre.”
- gov.uk Recommendations on Open Source — Use open source software in preference to proprietary or closed source alternatives, in particular for operating systems, networking software, Web servers, databases and programming languages.
- Internet Bad Neighbourhoods (PDF) — bilingual PhD thesis. The idea behind the Internet Bad Neighborhood concept is that the probability of a host in behaving badly increases if its neighboring hosts (i.e., hosts within the same subnetwork) also behave badly. This idea, in turn, can be exploited to improve current Internet security solutions, since it provides an indirect approach to predict new sources of attacks (neighboring hosts of malicious ones).
- A Quantitative Literary History of 2,958 Nineteenth-Century British Novels: The Semantic Cohort Method (PDF) — This project was simultaneously an experiment in developing quantitative and computational methods for tracing changes in literary language. We wanted to see how far quantifiable features such as word usage could be pushed toward the investigation of literary history. Could we leverage quantitative methods in ways that respect the nuance and complexity we value in the humanities? To this end, we present a second set of results, the techniques and methodological lessons gained in the course of designing and running this project. Even litcrit is becoming a data game.
- Easy6502 — get started writing 6502 assembly language. Fun way to get started with low-level coding.
- How Analytics Really Work at a Small Startup (Pete Warden) — The key for us is that we’re using the information we get primarily for decision-making (should we build out feature X?) rather than optimization (how can we improve feature X?). Nice rundown of tools and systems he uses, with plug for KissMetrics.
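The Internet Bad Neighbourhoods idea linked above (a host is more likely to misbehave if hosts in the same subnetwork already do) can be sketched in a few lines. This is only an illustration: the /24 prefix length, the threshold of two known-bad neighbours, and the function names are assumptions for the example, not the thesis's method. The addresses come from the ranges reserved for documentation.

```python
from collections import Counter
import ipaddress


def neighbourhood_counts(bad_ips, prefix_len=24):
    """Count known-bad hosts per enclosing subnetwork (assumed /24 by default)."""
    counts = Counter()
    for ip in bad_ips:
        # strict=False lets us pass a host address and get its enclosing network.
        network = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        counts[str(network)] += 1
    return counts


def is_suspect(ip, counts, prefix_len=24, threshold=2):
    """Flag a host whose subnetwork already holds `threshold` or more bad hosts."""
    network = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
    return counts[str(network)] >= threshold


if __name__ == "__main__":
    known_bad = ["192.0.2.10", "192.0.2.77", "198.51.100.5"]
    counts = neighbourhood_counts(known_bad)
    for candidate in ("192.0.2.200", "198.51.100.99", "203.0.113.9"):
        print(candidate, "suspect" if is_suspect(candidate, counts) else "ok")
```

Only 192.0.2.200 is flagged here, because it shares a /24 with two known-bad hosts. A real system would weight neighbourhoods by size and by how recent the evidence is; the fixed threshold just keeps the sketch short.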