- Digital Music Consumption on the Internet: Evidence from Clickstream Data (Scribd) — The goal of this paper is to analyze the behavior of digital music consumers on the Internet. Using clickstream data on a panel of more than 16,000 European consumers, we estimate the effects of illegal downloading and legal streaming on the legal purchases of digital music. Our results suggest that Internet users do not view illegal downloading as a substitute to legal digital music. Although positive and significant, our estimated elasticities are essentially zero: a 10% increase in clicks on illegal downloading websites leads to a 0.2% increase in clicks on legal purchases websites. Online music streaming services are found to have a somewhat larger (but still small) effect on the purchases of digital sound recordings, suggesting complementarities between these two modes of music consumption. According to our results, a 10% increase in clicks on legal streaming websites lead to up to a 0.7% increase in clicks on legal digital purchases websites. We find important cross country difference in these effects. A paper from the EU commission’s in-house science service. (via Don Christie)
- Six Degrees of Francis Bacon — data-driven research into “the early-modern social network”. (via Jonathan Gray)
- Internet Census 2012 — scanning the net via botnet. Appalling how many unsecured devices are directly connected to the net. Also appalling how underused the address space is.
ENTRIES TAGGED "programming"
Being both liberal and safe in programming is hard
Visualizing City Data, Gigabits Unrealized, Use Open Source, and Bad IPs Cluster
- VizCities Dev Diary — step-by-step recount of how they brought London’s data to life, SimCity-style.
- Google Fibre Isn’t That Impressive — For [gigabit broadband] to become truly useful and necessary, we’ll need to see a long-term feedback loop of utility and acceptance. First, super-fast lines must allow us to do things that we can’t do with the pedestrian internet. This will prompt more people to demand gigabit lines, which will in turn invite developers to create more apps that require high speed, and so on. What I discovered in Kansas City is that this cycle has not yet begun. Or, as Ars Technica put it recently, “The rest of the internet is too slow for Google Fibre.”
- gov.uk Recommendations on Open Source — Use open source software in preference to proprietary or closed source alternatives, in particular for operating systems, networking software, Web servers, databases and programming languages.
- Internet Bad Neighbourhoods (PDF) — bilingual PhD thesis. The idea behind the Internet Bad Neighborhood concept is that the probability of a host in behaving badly increases if its neighboring hosts (i.e., hosts within the same subnetwork) also behave badly. This idea, in turn, can be exploited to improve current Internet security solutions, since it provides an indirect approach to predict new sources of attacks (neighboring hosts of malicious ones).
- A Quantitative Literary History of 2,958 Nineteenth-Century British Novels: The Semantic Cohort Method (PDF) — This project was simultaneously an experiment in developing quantitative and computational methods for tracing changes in literary language. We wanted to see how far quantifiable features such as word usage could be pushed toward the investigation of literary history. Could we leverage quantitative methods in ways that respect the nuance and complexity we value in the humanities? To this end, we present a second set of results, the techniques and methodological lessons gained in the course of designing and running this project. Even litcrit becoming a data game.
- Easy6502 — get started writing 6502 assembly language. Fun way to get started with low-level coding.
- How Analytics Really Work at a Small Startup (Pete Warden) — The key for us is that we’re using the information we get primarily for decision-making (should we build out feature X?) rather than optimization (how can we improve feature X?). Nice rundown of tools and systems he uses, with plug for KissMetrics.
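The Internet Bad Neighbourhoods idea linked above (a host is more suspect when hosts in its subnetwork already misbehave) can be illustrated with a toy sketch. This is not the thesis's method: the /24 neighbourhood size, the `threshold` of 2, and the function names are all assumptions chosen for illustration, and the addresses come from the reserved documentation ranges.

```python
import ipaddress
from collections import Counter


def neighbourhood_counts(bad_ips, prefix_len=24):
    """Count known-bad hosts per subnetwork (prefix_len is an assumed size)."""
    counts = Counter()
    for ip in bad_ips:
        # strict=False lets us pass a host address and get its enclosing network
        net = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        counts[str(net)] += 1
    return counts


def in_bad_neighbourhood(counts, candidate_ip, prefix_len=24, threshold=2):
    """Flag a host if its subnetwork already holds >= threshold bad hosts.

    The threshold is a hypothetical cut-off, not a value from the thesis.
    """
    net = ipaddress.ip_network(f"{candidate_ip}/{prefix_len}", strict=False)
    return counts[str(net)] >= threshold


# Hypothetical blocklist using documentation-only address ranges
bad_ips = ["192.0.2.10", "192.0.2.77", "192.0.2.200", "198.51.100.5"]
counts = neighbourhood_counts(bad_ips)

print(in_bad_neighbourhood(counts, "192.0.2.33"))    # shares a /24 with three bad hosts
print(in_bad_neighbourhood(counts, "198.51.100.9"))  # only one bad neighbour
```

A real system would weight by subnetwork size and attack type rather than use a fixed count, but the sketch shows why neighbours of malicious hosts are a cheap, indirect predictor.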
Comparing Algorithms, Programming & Visual Arts, Data Brokers, and Your Brain on Ebooks
- mlcomp — a free website for objectively comparing machine learning programs across various datasets for multiple problem domains.
- Printing Code: Programming and the Visual Arts (Vimeo) — Rune Madsen’s talk from Heroku’s Waza. (via Andrew Odewahn)
- What Data Brokers Know About You (ProPublica) — excellent run-down on the compilers of big data about us. Where are they getting all this info? The stores where you shop sell it to them.
- Subjective Impressions Do Not Mirror Online Reading Effort: Concurrent EEG-Eyetracking Evidence from the Reading of Books and Digital Media (PLOSone) — Comprehension accuracy did not differ across the three media for either group and EEG and eye fixations were the same. Yet readers stated they preferred paper. That preference, the authors conclude, isn’t because it’s less readable. From this perspective, the subjective ratings of our participants (and those in previous studies) may be viewed as attitudes within a period of cultural change.
Video Magnification Code, Copyright MOOC, Open Access Cost-Effectiveness, and SCADA Security (Sucks)
- Eulerian Video Magnification — papers and the MATLAB source code for that amazing effect of exaggerating small changes in video. (*This work is patent pending)
- CopyrightX — MOOC on current law of copyright and the ongoing debates concerning how that law should be reformed. Through a combination of pre-recorded lectures, live webcasts, and weekly online seminars, participants in the course will examine and assess the ways in which law seeks to stimulate and regulate creative expression. (via BoingBoing)
- Cost Effectiveness for Open Access Journals — This plot reveals the prestige (Article Influence score) and publication charges for open access journals.
- Results of SANS SCADA Survey 2013 (PDF) — Unfortunately, at this time they seem unable to monitor the PLCs, terminal units and connections to field equipment due to lack of native security in the control systems themselves. (via InfoSecIsland)
Open Source Cancer Informatics, NPR Framework, Littery Junk, BitTorrent Sync
- Open Source Cancer Informatics Software (NCIP) — we have tackled the main recommendation that came out of our June meeting with open-source thought leaders: Keep it simple. Make barriers to entry as low as possible, and reuse available resources. Specifically, we have adopted a software license that is approved by the Open Source Initiative (OSI) and have begun to migrate the code developed under the cancer Biomedical Informatics Grid® (caBIG®) Program to a public repository. Our goal in taking these steps is to remove as many barriers as possible to community participation in the continuing development of these assets. Awesome! (via John Scott)
- NPR’s Framework for Easy Apps — their three architectural maxims: Servers are for chumps; If it doesn’t work on mobile, it doesn’t work; and Build for use. Refactor for reuse.
- Random Junk in People’s Labs (Reddit) — reminded me of the contents of my “tmp” and “Downloads” and “Documents” directories: unstructured historical crap with no expiration and no current use. (Caution: swearing in the title of the Reddit post) (via Mihalyi Csikszentmihalyi)
- Sync — BitTorrent’s alpha-level tech to “automatically sync files between computers via secure, distributed technology.” Not only is it “slick for alpha” (as one friend described), it’s bloody useful: I know at least one multimillion-dollar project built on their own homegrown implementation of this same idea. (via Jason Ryan)