Nat Torkington

Nat has chaired the O'Reilly Open Source Convention and other O'Reilly conferences for over a decade. He ran the first web server in New Zealand, co-wrote the best-selling Perl Cookbook, and was one of the founding Radar bloggers. He lives in New Zealand and consults in the Asia-Pacific region.

Four short links: 21 March 2013

Obfuscation, Logging, Copyright, and Control

  1. The Obfuscation of Culture — Tumblr and LJ users sep ar ate w ords thr ou gh o dd spacin g in o rde r to fo ol sea rc h en g i nes. Chinese users hide political messages in image attachments to seemingly benign posts on Weibo. General Petraeus communicated solely through draft mode. 4chan scares away the faint of heart with porn. More technically astute groups communicate through obscure messaging systems. (via Beta Knowledge)
  2. log2viz — an open-source demonstration of the logs-as-data concept for Heroku apps. Log in and select one of your apps to see a live-updating dashboard of its web activity.
  3. Doctorow at LoC (YouTube) — video of Cory Doctorow’s talk on ebooks, libraries, and copyright at the Library of Congress.
  4. When TED Lost Control of its Crowd (HBR) — golden case study. You can’t “manage” a crowd—or a community—through transactional exchanges or economic incentives. You need something stronger: shared purpose.
Four short links: 20 March 2013

"Piracy" Good for Sales, Digital Humanities, Javascript Source Formatting, and Research by BotNet

  1. Digital Music Consumption on the Internet: Evidence from Clickstream Data (Scribd) — The goal of this paper is to analyze the behavior of digital music consumers on the Internet. Using clickstream data on a panel of more than 16,000 European consumers, we estimate the effects of illegal downloading and legal streaming on the legal purchases of digital music. Our results suggest that Internet users do not view illegal downloading as a substitute to legal digital music. Although positive and significant, our estimated elasticities are essentially zero: a 10% increase in clicks on illegal downloading websites leads to a 0.2% increase in clicks on legal purchases websites. Online music streaming services are found to have a somewhat larger (but still small) effect on the purchases of digital sound recordings, suggesting complementarities between these two modes of music consumption. According to our results, a 10% increase in clicks on legal streaming websites lead to up to a 0.7% increase in clicks on legal digital purchases websites. We find important cross country difference in these effects. A paper from the EU commission’s in-house science service. (via Don Christie)
  2. Six Degrees of Francis Bacon — data-driven research into “the early-modern social network”. (via Jonathan Gray)
  3. jsshaper — an extensible framework for JavaScript syntax tree shaping. Super-powerful source code reformatter & more for JavaScript.
  4. Internet Census 2012 — scanning the net via botnet. Appalling how many unsecured devices are directly connected to the net. Also appalling how underused the address space is.
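
To put the numbers in item 1 in perspective, here is a minimal Python sketch of what "essentially zero" elasticity means. It assumes the plain ratio definition (percentage change in clicks on purchase sites divided by percentage change in clicks on the other kind of site); the paper's own estimation method is not reproduced here, and the function name is made up for illustration.

```python
def elasticity(pct_change_purchases: float, pct_change_source: float) -> float:
    """Simple ratio of percentage changes (an assumed, simplified definition)."""
    return pct_change_purchases / pct_change_source


# Figures quoted in the abstract: a 10% rise in clicks on each kind of site.
illegal_downloading = elasticity(0.2, 10.0)
legal_streaming = elasticity(0.7, 10.0)  # quoted as "up to" 0.7%, so an upper bound

print(f"illegal downloading: {illegal_downloading:.2f}")
print(f"legal streaming:     {legal_streaming:.2f}")
```

Both values sit far below 1, which is why the authors describe the effects as positive but tiny.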
Four short links: 19 March 2013

Visualizing City Data, Gigabits Unrealized, Use Open Source, and Bad IPs Cluster

  1. VizCities Dev Diary — step-by-step recount of how they brought London’s data to life, SimCity-style.
  2. Google Fibre Isn’t That Impressive — For [gigabit broadband] to become truly useful and necessary, we’ll need to see a long-term feedback loop of utility and acceptance. First, super-fast lines must allow us to do things that we can’t do with the pedestrian internet. This will prompt more people to demand gigabit lines, which will in turn invite developers to create more apps that require high speed, and so on. What I discovered in Kansas City is that this cycle has not yet begun. Or, as Ars Technica put it recently, “The rest of the internet is too slow for Google Fibre.”
  3. gov.uk Recommendations on Open Source — Use open source software in preference to proprietary or closed source alternatives, in particular for operating systems, networking software, Web servers, databases and programming languages.
  4. Internet Bad Neighbourhoods (PDF) — bilingual PhD thesis. The idea behind the Internet Bad Neighborhood concept is that the probability of a host in behaving badly increases if its neighboring hosts (i.e., hosts within the same subnetwork) also behave badly. This idea, in turn, can be exploited to improve current Internet security solutions, since it provides an indirect approach to predict new sources of attacks (neighboring hosts of malicious ones).
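
The bad-neighbourhood idea in item 4 can be sketched in a few lines of Python. This is only an illustration, not the thesis's method: the /24 prefix, the raw count used as a score, and the function names are all assumptions, and the addresses come from the reserved documentation ranges.

```python
import ipaddress
from collections import Counter


def neighbourhood_scores(bad_ips, prefix_len=24):
    """Count known-bad hosts per subnetwork of the given prefix length."""
    counts = Counter()
    for ip in bad_ips:
        # strict=False lets us pass a host address and get its enclosing network.
        net = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        counts[net] += 1
    return counts


def risk(ip, scores, prefix_len=24):
    """Number of known-bad hosts that share a subnetwork with `ip`."""
    net = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
    return scores[net]


# Hypothetical blocklist using documentation-only address ranges.
bad = ["192.0.2.10", "192.0.2.77", "192.0.2.200", "198.51.100.5"]
scores = neighbourhood_scores(bad)

for candidate in ["192.0.2.33", "198.51.100.20", "203.0.113.9"]:
    print(candidate, risk(candidate, scores))
```

A host that has never misbehaved but lives next to three known offenders scores higher than one in a clean subnet, which is the indirect prediction the thesis describes.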
Four short links: 18 March 2013

Big Lit Data, 6502 Assembly, Small Startup Analytics, and Javascript Heatmaps

  1. A Quantitative Literary History of 2,958 Nineteenth-Century British Novels: The Semantic Cohort Method (PDF) — This project was simultaneously an experiment in developing quantitative and computational methods for tracing changes in literary language. We wanted to see how far quantifiable features such as word usage could be pushed toward the investigation of literary history. Could we leverage quantitative methods in ways that respect the nuance and complexity we value in the humanities? To this end, we present a second set of results, the techniques and methodological lessons gained in the course of designing and running this project. Even litcrit becoming a data game.
  2. Easy6502 — get started writing 6502 assembly language. Fun way to get started with low-level coding.
  3. How Analytics Really Work at a Small Startup (Pete Warden) — The key for us is that we’re using the information we get primarily for decision-making (should we build out feature X?) rather than optimization (how can we improve feature X?). Nice rundown of tools and systems he uses, with plug for KissMetrics.
  4. webgl-heatmap (GitHub) — a JavaScript library for high performance heatmap display.
Four short links: 15 March 2013

Search Ads Meh, Hacked Website Help, Web Design Sins, and Lazy Correlations

  1. Consumer Heterogeneity and Paid Search Effectiveness: A Large Scale Field Experiment (PDF) — We find that new and infrequent users are positively influenced by ads but that existing loyal users whose purchasing behavior is not influenced by paid search account for most of the advertising expenses, resulting in average returns that are negative. We discuss substitution to other channels and implications for advertising decisions in large firms. eBay-commissioned research, so salt to taste. (via Guardian)
  2. Google’s Help for Hacked Webmasters — what it says.
  3. 14 Lousy Web Design Trends Making a Comeback Thanks to HTML 5 — “mystery meat icons” a pet bugbear of mine.
  4. The Human Microbiome 101 (SlideShare) — SciFoo alum Jonathan Eisen’s talk. Informative, but super-notable for “complexity is astonishing, massive risk for false positive associations”. Remember this the next time your Big Data Scientist (aka kid with R) tells you one surprising variable predicts 66% of anything. I wish I had the audio from this talk!
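
The false-positive risk in item 4 is easy to reproduce. The sketch below is a toy simulation, not anything from Eisen's talk: it assumes 20 samples and 2,000 candidate variables, all pure noise, and reports the strongest correlation with an equally random outcome. The sizes and the seed are arbitrary choices.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


random.seed(0)  # arbitrary seed, only for repeatability
n_samples, n_variables = 20, 2000

# The outcome and every candidate variable are independent Gaussian noise.
outcome = [random.gauss(0, 1) for _ in range(n_samples)]
best = max(
    abs(pearson([random.gauss(0, 1) for _ in range(n_samples)], outcome))
    for _ in range(n_variables)
)

print(f"strongest |r| among {n_variables} noise variables: {best:.2f}")
```

With few samples and many variables, something will always look like a strong predictor, even when nothing is related to anything.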
Four short links: 14 March 2013

On Anonymous, Information Rights, RSS Readers, and CDN Sec

  1. Our Weirdness is Free (Gabriella Coleman) — Often lacking an overarching strategy, Anonymous operates tactically, along the lines proposed by the French Jesuit thinker Michel de Certeau. “Because it does not have a place, a tactic depends on time—it is always on the watch for opportunities that must be seized ‘on the wing,’” he writes in The Practice of Everyday Life (1980). “Whatever it wins, it does not keep. It must constantly manipulate events in order to turn them into ‘opportunities.’ The weak must continually turn to their own ends forces alien to them.” (via Jonas Kubilius)
  2. Information Rights and Copy Rights (YouTube) — Justice David Harvey’s keynote at Australian Digital Alliance forum, proposing balance of rights. (via Alastair Thompson)
  3. NewsBlur (GitHub) — one of the many trending repos in the wake of the announcement of Google Reader’s case of terminal lack of relevance to Google+. See also Tiny Tiny RSS, FastLadder, and a million repos empty but for “TODO” files listing the almighty RSS reading features yet to be added to the empty file. Also found: this obsessive guide to Reader’s history.
  4. The Pentester’s Guide to Akamai (PDF) — This paper summarizes the findings from NCC’s research into Akamai while providing advice to companies wishing to gain the maximum security when leveraging their solutions.
Four short links: 13 March 2013

HTML DRM, Visualizing Medical Sciences, Lifelong Learning, and Hardware Hackery

  1. What Tim Berners-Lee Doesn’t Know About HTML DRM (Guardian) — Cory Doctorow lays it out straight. HTML DRM is a bad idea, no two ways. The future of the Web is the future of the world, because everything we do today involves the net and everything we’ll do tomorrow will require it. Now it proposes to sell out that trust, on the grounds that Big Content will lock up its “content” in Flash if it doesn’t get a veto over Web-innovation. [...] The W3C has a duty to send the DRM-peddlers packing, just as the US courts did in the case of digital TV.
  2. Visualizing the Topical Structure of the Medical Sciences: A Self-Organizing Map Approach (PLOSone) — a high-resolution visualization of the medical knowledge domain using the self-organizing map (SOM) method, based on a corpus of over two million publications.
  3. What Teens Get About The Internet That Parents Don’t (The Atlantic) — the Internet has been a lifeline for self-directed learning and connection to peers. In our research, we found that parents more often than not have a negative view of the role of the Internet in learning, but young people almost always have a positive one. (via Clive Thompson)
  4. Portable C64 — beautiful piece of C64 hardware hacking to embed a screen and battery in it. (via Hackaday)
Four short links: 12 March 2013

Chrome Tricks, Sins of Journaling, Icon Font, and Sweet PD

  1. One Tab — turn tabs into lists, easily. (via Andy Baio)
  2. Deep Impact: Unintended Consequences of Journal Rank — These data confirm previous suspicions: using journal rank as an assessment tool is bad scientific practice. Moreover, the data lead us to argue that any journal rank (not only the currently-favored Impact Factor) would have this negative impact. Therefore, we suggest abandoning journals altogether.
  3. Genericons — useful straightforward icon font.
  4. Public Domain Review Fundraising — Over the course of our two years we’ve created a large and ever growing archive of some of the most interesting and unusual artefacts in the history of art, literature and ideas. Love the idea of some limited edition reprints of these gorgeous works!
Four short links: 11 March 2013

Ransom Money, High School CS, Wikipedia Links, and Social Teens

  1. Adventures in the Ransom Trade — between insurance, protection, and ransoms, Sean Gourley describes it as “one of the more interesting grey markets.” (via Sean Gourley)
  2. About High School Computer Science Teachers (Selena Deckelmann) — Selena gets an education in the state of high school computer science education.
  3. Learning From Big Data (Google Research) — the Wikilinks Corpus: 40 million total disambiguated mentions within over 10 million web pages [...] The mentions are found by looking for links to Wikipedia pages where the anchor text of the link closely matches the title of the target Wikipedia page. If we think of each page on Wikipedia as an entity (an idea we’ve discussed before), then the anchor text can be thought of as a mention of the corresponding entity.
  4. Teens Have Always Gone Where Identity Isn’t — if you look back at one of the first dominant social platforms, AOL Instant Messenger, it looks a lot like the pseudonymous Tumblr and Snapchat of today in many respects. You used an avatar that was not your face. Your screenname was not indexed and not personally identifiable (mine was Goober1310).
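
The mention-finding rule described in item 3 (anchor text that closely matches the target page's title) might look something like this in Python. It is a rough sketch under stated assumptions: Google's actual matching criteria are not given in the post, so the `difflib` similarity ratio, the 0.8 threshold, and the function names are all stand-ins.

```python
from difflib import SequenceMatcher


def normalise(text):
    """Lower-case, turn underscores into spaces, and collapse whitespace."""
    return " ".join(text.lower().replace("_", " ").split())


def is_mention(anchor_text, page_title, threshold=0.8):
    """True when the anchor text is similar enough to the page title."""
    ratio = SequenceMatcher(
        None, normalise(anchor_text), normalise(page_title)
    ).ratio()
    return ratio >= threshold


# Hypothetical (anchor text, Wikipedia page title) pairs found on web pages.
links = [
    ("Francis Bacon", "Francis_Bacon"),
    ("Sir Francis Bacon", "Francis_Bacon"),
    ("click here", "Francis_Bacon"),
]

mentions = [(anchor, title) for anchor, title in links if is_mention(anchor, title)]
print(mentions)
```

Links whose anchor text says nothing about the target, such as "click here", are dropped, and the remainder become labelled mentions of the entity the page describes.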
Four short links: 8 March 2013

Comparing Algorithms, Programming & Visual Arts, Data Brokers, and Your Brain on Ebooks

  1. mlcomp — a free website for objectively comparing machine learning programs across various datasets for multiple problem domains.
  2. Printing Code: Programming and the Visual Arts (Vimeo) — Rune Madsen’s talk from Heroku’s Waza. (via Andrew Odewahn)
  3. What Data Brokers Know About You (ProPublica) — excellent run-down on the compilers of big data about us. Where are they getting all this info? The stores where you shop sell it to them.
  4. Subjective Impressions Do Not Mirror Online Reading Effort: Concurrent EEG-Eyetracking Evidence from the Reading of Books and Digital Media (PLOSone) — Comprehension accuracy did not differ across the three media for either group and EEG and eye fixations were the same. Yet readers stated they preferred paper. That preference, the authors conclude, isn’t because it’s less readable. From this perspective, the subjective ratings of our participants (and those in previous studies) may be viewed as attitudes within a period of cultural change.