- Reducing the Roots of Some Evil (Etsy) — Based on our first two months of data we have removed a number of unused CA certificates from some pilot systems to test the effects, and will run CAWatch for a full six months to build up a more comprehensive view of what CAs are in active use. Sign of how broken the CA system for SSL is. (via Alex Dong)
- Mind the Brain — PLOS podcast interviews Sci Foo alum and delicious neuroscience brain of awesome, Vaughan Bell. (via Fabiana Kubke)
- How Often are Ineffective Interventions Still Used in Practice? (PLOSone) — tl;dr: 8% of the time. Imagine the number if you asked how often ineffective software development practices are still used.
- Announcing Evan’s Awesome A/B Tools — I am calling these tools awesome because they are intuitive, visual, and easy-to-use. Unlike other online statistical calculators you’ve probably seen, they’ll help you understand what’s going on “under the hood” of common statistical tests, and by providing ample visual context, they make it easy for you to explain p-values and confidence intervals to your boss. (And they’re free!)
Arijit Sengupta of BeyondCore uncovers hidden relationships in public health data
The importance of visualizing data is universally recognized. But, usually the data is passive input to some visualization tool and the users have to specify the precise graph they want to visualize. BeyondCore simplifies this process by automatically evaluating millions of variable combinations to determine which graphs are the most interesting, and then highlights these to users. In essence, BeyondCore automatically tells us the right questions to ask of our data.
In this video, Arijit Sengupta, CEO of BeyondCore, describes how public health data can be analyzed in real-time to discover anomalies and other intriguing relationships, making them readily accessible even to viewers without a statistical background. Arijit will be speaking at Strata Rx 2013 with Tim Darling of Objective Health, a McKinsey Solution for Healthcare Providers, on the topic of this post.
Distrusting CA Certs, Brain Talk, Ineffective Interventions, and Visual A/B Tools
Better Crypto, NukeViz, Weed Economics, and Ethics of Prediction
- Applied Practical Cryptography — technical but readable article with lots of delicious lines: “They’re a little magical, in the same sense that ABS brakes were magical in the 1970s,” and “Cloud applications share metal with strangers, and thus attackers, who will gladly spend $40 to co-host themselves with a target,” and “The conservative approach is again counterintuitive to developers, to whom hardcoding anything is like simony.”
- Nukemap — interactive visualization of the fallout damage from a nuclear weapon. Now we can all be the scary 1970s “this is what it would look like if [big town] were nuked” documentaries that I remember growing up with. I love interactives for learning the contours of a problem, and making it real and personal in a way that a static visualization cannot. WIN. See also the creator’s writeup.
- Legalising Weed — Chuck, a dealer who switched from selling weed in California to New York and quadrupled his income, told WNYC, “There’s plenty of weed in New York. There’s just an illusion of scarcity, which is part of what I’m capitalizing on. Because this is a black market business, there’s insufficient information for customers.” Invisible economies are frequently inefficient; moving online disrupts them and makes them efficient in the market sense.
- Can Software That Predicts Crime Pass Constitutional Muster? (NPR) — “I think most people are gonna defer to the black box,” he says. “Which means we need to focus on what’s going into that black box, how accurate it is, and what transparency and accountability measures we have [for] it.”
Better UIs, Dot Tricks, UAV Camera, and Writing Interactive Fiction
- Good UI — easily digested tips for improving UIs. (via BERG London)
- Mapping Millions of Dots — tips like “The other thing that goes along with this brightness scaling is to draw fewer dots at lower zoom levels. By the time you get most of a continent on the screen, the dots are so much smaller than pixels and there are so many of them to draw, that it looks the same and is much faster if you draw half as many dots at twice the brightness apiece.” (via Flowing Data)
- 118g 10x Zoom Camera for Drones — little less than 800×600 resolution. (via DIY Drones)
- Creating Interactive Fiction with Inform7 (Amazon) — all you need to write your own Zork, or even do better. With foreword by my hero (I squee like a fanboy when I remember meeting him at the first Foo Camp) Don Woods. Yeah, Colossal Cave Adventure Don Woods. WIN. (via Marshall Tenner Winter)
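The dot-thinning tip from Mapping Millions of Dots can be sketched in a few lines of Python. This is only an illustration, not the article's code: the helper name `thin_dots`, the `full_detail_zoom` parameter, and the choice to halve the dot count once per zoom level are all assumptions made for the example.

```python
import random

def thin_dots(dots, zoom, full_detail_zoom=12, seed=0):
    """Return (kept_dots, brightness_multiplier) for a zoom level.

    Hypothetical helper: for every zoom level below full_detail_zoom
    we keep half as many dots, and scale the brightness of each kept
    dot so the total brightness drawn stays the same.
    """
    if not dots:
        return [], 1.0
    levels_out = max(0, full_detail_zoom - zoom)
    keep_every = 2 ** levels_out              # keep 1 dot in every 2^levels_out
    keep_count = max(1, len(dots) // keep_every)
    rng = random.Random(seed)                 # fixed seed keeps redraws stable
    kept = rng.sample(dots, keep_count)
    brightness = len(dots) / keep_count       # compensate for the dropped dots
    return kept, brightness

dots = [(x, x) for x in range(1000)]
kept, brightness = thin_dots(dots, zoom=11)
print(len(kept), brightness)
```

Sampling at random rather than taking every second dot avoids visible striping when the input is sorted, though a real renderer would likely precompute the subsets per zoom level.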
Technical Bitcoin, Tracking News Flow, Science Advice, and Gov Web Sites
- 6 Technical Things I Learned About Bitcoin (Rusty Russell) — Anonymity is hard, but I was surprised to see blockchain.info’s page about my donation to Unfilter correctly geolocated to my home town! Perhaps it’s a fluke, but I was taken aback by how clear it was. Interesting collection of technical observations about the workings of Bitcoin.
- NIFTY: News Information Flow Tracking, Yay! — watch how news stories mutate and change over time. (via Stijn Debrouwere)
- EO Wilson’s Advice for Future Scientists (NPR) — the ideal scientist thinks like a poet and works like a bookkeeper. (via Courtney Johnston)
- Healthcare.gov New Web Model for Government (The Atlantic) — The new site has been built in public for months, iteratively created on Github using cutting edge open-source technologies. Healthcare.gov is the rarest of birds: a next-generation website that also happens to be a .gov.
Quantum Programming, Quantum Again, Copyright Vanishes Media, and Email Metadata Analysis
- QCL: A Language for Quantum Computing — QCL is a high level, architecture independent programming language for quantum computers, with a syntax derived from classical procedural languages like C or Pascal. This allows for the complete implementation and simulation of quantum algorithms (including classical components) in one consistent formalism. (Will not run on D-Wave, which is an annealing machine rather than a general purpose quantum computer.)
- Quipper — a functional quantum programming language.
- How Copyright Makes Books Disappear — Amazon and YouTube data showing exponential growth in available content until copyright term is entered, at which point there’s a massive drop-off in availability. Graph is stunning. (via BoingBoing)
- Immersion — a people-centric view of your email life using only your metadata. Horrifyingly revealing.
Web Traffic Visualisation, TV Interviews, GPU Programming, and Programmatic Pants Design
- Web Traffic Visualization — Dots enter when transactions start and exit when completed. Their speed is proportional to client’s response time while their size reflects the server’s contribution to total time. Color comes from the specific request. (via Nelson Minar)
- Complete Guide to Being Interviewed on TV (Quartz) — good preparation for everyone who runs the risk of being quoted for 15 seconds.
- Harlan (GitHub) — new language for GPU programming. Simple examples in the announcement. (via Michael Bernstein)
- Open Fit — open source software that investigates several approaches to generating custom tailored pants patterns. Open Fit Lab is an attempt to use this software for on-the-spot generation and creation of custom clothes. (via Kaitlin Thaney)
Facebook Pub/Sub, Space/Time Visualization, Sean That Matters, and Keyboard Control
- Wormhole — Facebook’s pub/sub system. Wormhole propagates changes issued in one system to all systems that need to reflect those changes – within and across data centers.
- Nanocubes — Fast Visualization of Large Spatiotemporal Datasets.
- Sean Gourley on Relevance (YouTube) — “Is Silicon Valley really doing what it should be doing?” he asks, 3m30 in. Good to see him pondering stuff that matters, back in 2011.
- Shortcat — a keyboard tool for Mac OS X that lets you “click” buttons and control your apps with a few keystrokes. Think of it as Spotlight for the user interface.
Velocity 2013 Speaker Series
Be honest, have you ever wanted to play Steve Souders for a day and pull some revealing stats or trends about some web sites of your choice? Or maybe dig around the HTTP archive? You can do that and more by setting up your own HTTP Archive.
httparchive.org is a fantastic tool to track, monitor, and review how the web is built. You can dig into trends around page size, page load time, content delivery network (CDN) usage, distribution of different mimetypes, and many other stats. With the integration of WebPagetest, it’s a great tool for synthetic testing as well.
You can download an HTTP Archive MySQL dump (warning: it’s quite large) and the source code from the download page and dissect a snapshot of the data yourself. Once you’ve set up the database, you can easily query anything you want.
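As a rough idea of what such a query might look like, here is a small Python sketch. To keep it runnable anywhere it uses an in-memory SQLite database as a stand-in for MySQL, and the `pages` table with its `label`, `url`, `bytesTotal`, and `reqTotal` columns is a simplified assumption that may not match the schema of the real dump. The rows are invented sample data.

```python
import sqlite3

def average_page_weight(conn):
    """Average transfer size (KB) and request count per crawl label."""
    return conn.execute(
        """
        SELECT label,
               ROUND(AVG(bytesTotal) / 1024.0, 1) AS avg_kb,
               ROUND(AVG(reqTotal), 1)            AS avg_requests
        FROM pages
        GROUP BY label
        ORDER BY label
        """
    ).fetchall()

# In-memory SQLite stands in for the MySQL database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pages (label TEXT, url TEXT, bytesTotal INTEGER, reqTotal INTEGER)"
)
conn.executemany(
    "INSERT INTO pages VALUES (?, ?, ?, ?)",
    [  # invented sample rows, not real HTTP Archive data
        ("2013-06-01", "http://example.com/", 1024000, 80),
        ("2013-06-01", "http://example.org/", 2048000, 120),
        ("2013-06-15", "http://example.com/", 512000, 40),
    ],
)

for label, avg_kb, avg_requests in average_page_weight(conn):
    print(label, avg_kb, avg_requests)
```

Against a real instance the same `SELECT` would go through a MySQL client instead, with table and column names adjusted to whatever the dump actually contains.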
You need MySQL, PHP, and your own webserver running. As I mentioned above, HTTP Archive relies on WebPagetest—if you choose to run your own private instance of WebPagetest, you won’t have to request an API key. I decided to ask Patrick Meenan for an API key with limited query access. That was sufficient for me at the time. If I ever wanted to use more than 200 page loads per day, I would probably want to set up a private instance of WebPagetest.
To find more details on how to set up an HTTP Archive instance yourself and any further advice, please check out my blog post.
Going back to the scenario I described above: the real motivation is that often you don’t want to throw your website(s) in a pile of other websites (e.g., ones not related to your business) to compare or define trends. Our digital property at the Canadian Broadcasting Corporation (CBC) spans dozens of URLs that have different purposes and audiences. For example, CBC Radio covers most of the Canadian radio landscape, CBC News offers the latest breaking news, CBC Hockey Night in Canada offers great insights on anything related to hockey, and CBC Video is the home for any video available on CBC. It’s valuable for us to not only compare cbc.ca to the top 100K Alexa sites but also to verify stats and data against our own pool of web sites.
In this case, we want to use a set of predefined URLs that we can collect HTTP Archive stats for. Hence a private instance can come in handy—we can run tests every day, or every week, or just every month to gather information about the performance of the sites we’ve selected. From there, it’s easy to not only compare trends from httparchive.org to our own instance as a performance baseline, but also have a great amount of data in our local database to run queries against and to do proper performance monitoring and investigation.
The beautiful thing about having your own instance is that you can be your own master of data visualization: you can now create more charts in addition to the ones that came out of the box with the default HTTP Archive setup. And if you don’t like Google chart tools, you may even want to check out D3.js or Highcharts instead.
The image below shows all mime types used by CBC web properties that are captured in our HTTP archive database, using D3.js bubble charts for visualization.
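For readers who want to try something similar, the data preparation step can be sketched in Python. This is not the code behind the CBC chart: the function name `mime_bubble_data` and the nested `name`/`children`/`value` layout are assumptions, chosen because D3 hierarchy examples commonly consume data in roughly that shape. The input list is invented sample data.

```python
import json
from collections import Counter

def mime_bubble_data(mime_types, root_name="mimeTypes"):
    """Count mime types and nest them under a single root node.

    Hypothetical helper: blank values are skipped and names are
    lower-cased so "Text/CSS" and "text/css" land in one bubble.
    """
    counts = Counter(
        m.strip().lower() for m in mime_types if m and m.strip()
    )
    children = [
        {"name": mime, "value": n} for mime, n in counts.most_common()
    ]
    return {"name": root_name, "children": children}

# Invented sample input; in practice this would come from the requests data.
sample = ["image/jpeg", "text/html", "image/jpeg", "Text/CSS", "", "image/jpeg"]
print(json.dumps(mime_bubble_data(sample), indent=2))
```

The resulting JSON can be saved to a file and loaded by the charting code, with each bubble sized by its `value`.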
3D Visualization, Printing On Any Surface, Rebuilding Reality, and Emotions as Data
- For Example — amazing discussion of 3D visualization techniques, full of examples using the D3.js library and bl.ocks.org example gist system. Gorgeous and informative.
- Anti-Gravity 3D Printer — uses strands to sculpt on any surface. (via Slashdot)
- How 3D Printing Will Rebuild Reality (BoingBoing) — But even though home 3D-printing has received substantial publicity of late, it is in the industrial sector where the technology will probably make its most significant near-term impact on the world both by manufacturing improved commercial products and by stimulating industry to develop next-generation fab methods and machines that could one day truly bring 3D-printing home to users in a real way.
- The Emotional Side of Big Data — Personal Democracy Forum 2013 talk by Sara Critchfield, on reframing emotion as data for decision-making. (via Quartz)