- Apple Transparency Report (PDF) — contains a warrant canary: the statement “Apple has never received an order under Section 215 of the USA Patriot Act. We would expect to challenge an order if served on us,” which will of course be removed if one of the secret orders is received. Bravo, Apple, for implementing a clever hack to route around excessive secrecy. (via Boing Boing)
- You’re Probably Polluting Your Statistics More Than You Think — it is insanely easy to find phantom correlations in random data without obviously being foolish. Anyone who thinks it’s possible to draw truthful conclusions from data analysis without really learning statistics needs to read this. (via Stijn Debrouwere)
- CyPhy Funded (Quartz) — the second act of iRobot co-founder Helen Greiner, maker of the famed Roomba robot vacuum cleaner. She terrified ETech long ago—the audience were expecting Roomba cuteness and got a keynote about military deathbots. It would appear she’s still in the deathbot niche, not so much with the cute. Remember this when you build your OpenCV-powered recoil-resistant load-bearing-hoverbot and think it’ll only ever be used for the intended purpose of launching fertiliser pellets into third world hemp farms.
- User-Agent String History — a light-hearted illustration of why the formal semantic value of free-text fields is driven to zero in the face of actual use.
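To make the phantom-correlation point above concrete, here is a minimal sketch (not taken from the linked article) that generates purely random series and counts how many pairs look "significantly" correlated. The function names, the series counts, and the threshold of 0.444 (roughly the two-tailed 5% critical value of Pearson's r for 20 points) are assumptions chosen for illustration.

```python
import itertools
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def count_phantom_correlations(n_series=40, n_points=20, threshold=0.444, seed=0):
    """Count pairs of independent random series whose |r| reaches the threshold.

    All parameters are illustrative assumptions; every series is pure noise,
    so every pair counted here is a false positive.
    """
    rng = random.Random(seed)
    series = [[rng.gauss(0, 1) for _ in range(n_points)] for _ in range(n_series)]
    pairs = list(itertools.combinations(range(n_series), 2))
    hits = [
        (i, j)
        for i, j in pairs
        if abs(pearson(series[i], series[j])) >= threshold
    ]
    return len(hits), len(pairs)


if __name__ == "__main__":
    hits, total = count_phantom_correlations()
    print(f"{hits} of {total} random pairs pass the threshold")
```

With 40 series there are 780 pairs to compare, so at a 5% false-positive rate you should expect a few dozen "findings" in data that contains no signal at all. Testing many pairs and reporting only the winners is exactly the kind of pollution the article warns about.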
ENTRIES TAGGED "data"
Warrant Canary, Polluted Statistics, Dollars for Deathbots, and Protocol Madness
Time Series Database, Cluster Schedulers, Structural Search-and-Replace, and TV Data
- Influx DB — open-source, distributed, time series, events, and metrics database with no external dependencies.
- Omega (PDF) — flexible, scalable schedulers for large compute clusters. From Google Research.
- Amazon Mines Its Data Trove To Bet on TV’s Next Hit (WSJ) — Amazon produced about 20 pages of data detailing, among other things, how much a pilot was viewed, how many users gave it a 5-star rating and how many shared it with friends.
The Internot of Things, Explainy Learning, Medical Microcontroller Board, and Coder Sutra
- A Cyber Attack Against Israel Shut Down a Road — The hackers targeted the Tunnels’ camera system which put the roadway into an immediate lockdown mode, shutting it down for twenty minutes. The next day the attackers managed to break in for even longer during the heavy morning rush hour, shutting the entire system for eight hours. Because all that is digital melts into code, and code is an unsolved problem.
- Random Decision Forests (PDF) — “Due to the nature of the algorithm, most Random Decision Forest implementations provide an extraordinary amount of information about the final state of the classifier and how it derived from the training data.” (via Greg Borenstein)
- BITalino — 149 Euro microcontroller board full of physiological sensors: muscles, skin conductivity, light, acceleration, and heartbeat. A platform for healthcare hardware hacking?
- How to Be a Programmer — a braindump from a guru.
Disk Over Ethernet, Inside Elite, Polar Charts, and R Videos
- Seagate Kinetic Storage — In the words of Geoff Arnold: The physical interconnect to the disk drive is now Ethernet. The interface is a simple key-value object oriented access scheme, implemented using Google Protocol Buffers. It supports key-based CRUD (create, read, update and delete); it also implements third-party transfers (“transfer the objects with keys X, Y and Z to the drive with IP address 22.214.171.124”). Configuration is based on DHCP, and everything can be authenticated and encrypted. The system supports a variety of key schemas to make it easy for various storage services to shard the data across multiple drives.
- Masters of Their Universe (Guardian) — well-written and fascinating story of the creation of the Elite game (one founder of which went on to make the Raspberry Pi). The classic action game of the early 1980s – Defender, Pac Man – was set in a perpetual present tense, a sort of arcade Eden in which there were always enemies to zap or gobble, but nothing ever changed apart from the score. By letting the player tool up with better guns, Bell and Braben were introducing a whole new dimension, the dimension of time.
- Micropolar (github) — A tiny polar charts library made with D3.js.
- Introduction to R (YouTube) — 21 short videos from Google.
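The Kinetic item above describes key-based CRUD on Ethernet-attached drives, with key schemas that let a storage service shard data across drives. A rough sketch of that idea follows. It is not the Kinetic API: `FakeDrive`, `ShardedStore`, the hash-based placement, and the 192.0.2.x addresses (a range reserved for documentation) are all hypothetical, and an in-memory dict stands in for each drive.

```python
import hashlib


class FakeDrive:
    """In-memory stand-in for one network-attached key-value drive (hypothetical)."""

    def __init__(self, address):
        self.address = address
        self.objects = {}

    def put(self, key, value):
        # Create or update the object stored under this key.
        self.objects[key] = value

    def get(self, key):
        # Return the stored value, or None if the key is absent.
        return self.objects.get(key)

    def delete(self, key):
        # Deleting a missing key is a no-op.
        self.objects.pop(key, None)


class ShardedStore:
    """Routes each key to one drive using a stable hash of the key (an assumed scheme)."""

    def __init__(self, drives):
        self.drives = list(drives)

    def drive_for(self, key):
        # SHA-256 gives the same placement on every run, unlike Python's hash().
        digest = hashlib.sha256(key).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.drives)
        return self.drives[index]

    def put(self, key, value):
        self.drive_for(key).put(key, value)

    def get(self, key):
        return self.drive_for(key).get(key)

    def delete(self, key):
        self.drive_for(key).delete(key)


if __name__ == "__main__":
    store = ShardedStore(FakeDrive(f"192.0.2.{n}") for n in (1, 2, 3))
    store.put(b"photos/cat.jpg", b"...bytes...")
    print(store.drive_for(b"photos/cat.jpg").address)
    print(store.get(b"photos/cat.jpg"))
```

Simple modulo placement like this reshuffles most keys whenever a drive is added or removed. A real service would more likely use consistent hashing or an explicit key-range map.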
Visual Arduino Coding, Hardware Iteration, Segmenting Images, and Client-Side Adjustable Data View
- Visually Programming Arduino — good for little minds.
- Rapid Hardware Iteration at Scale (Forbes) — It’s part of the unique way that Xiaomi operates, closely analyzing the user feedback it gets on its smartphones and following the suggestions it likes for the next batch of 100,000 phones. It releases them every Tuesday at noon Beijing time.
- Machine Learning of Hierarchical Clustering to Segment 2D and 3D Images (PLoS One) — We propose an active learning approach for performing hierarchical agglomerative segmentation from superpixels. Our method combines multiple features at all scales of the agglomerative process, works for data with an arbitrary number of dimensions, and scales to very large datasets.
- Kratu — an Open Source client-side analysis framework to create simple yet powerful renditions of data. It allows you to dynamically adjust your view of the data to highlight issues, opportunities and correlations in the data.
Rich Text Editing, Structural Visualisation, DDoS Protection, Realtime DDoS Map
- Sir Trevor — nice rich-text editing. Interesting how Markdown has become the way to store formatted text without storing HTML (and thus exposing the XSS-inducing HTML-escaping stuckfastrophe).
- Slate for Excel — visualising spreadsheet structure. I’d be surprised if it took MSFT or Goog 30 days to acquire them.
- Project Shield — Google project to protect against DDoSes.
- Digital Attack Map — DDoS attacks going on around the world. (via Jim Stogdill)
Algorithmic Optimisation, 3D Scanners, Corporate Open Source, and Data Dives
- Unhappy Truckers and Other Algorithmic Problems — Even the insides of vans are subjected to a kind of routing algorithm; the next time you get a package, look for a three-letter code, like “RDL.” That means “rear door left,” and it is so the driver has to take as few steps as possible to locate the package. (via Sam Minnee)
- Fuel3D: A Sub-$1000 3D Scanner (Kickstarter) — a point-and-shoot 3D imaging system that captures extremely high resolution mesh and color information of objects. Fuel3D is the world’s first 3D scanner to combine pre-calibrated stereo cameras with photometric imaging to capture and process files in seconds.
- Corporate Open Source Anti-Patterns (YouTube) — Bryan Cantrill’s talk, slides here. (via Daniel Bachhuber)
- Hacking for Humanity (The Economist) — Getting PhDs and data specialists to donate their skills to charities is the idea behind the event’s organizer, DataKind UK, an offshoot of the American nonprofit group.
The cities appear to breathe as bicycles move into office districts in the morning and out in the evening.
3D Visualization, Printing On Any Surface, Rebuilding Reality, and Emotions as Data
- For Example — amazing discussion of 3D visualization techniques, full of examples using the D3.js library and bl.ocks.org example gist system. Gorgeous and informative.
- Anti-Gravity 3D Printer — uses strands to sculpt on any surface. (via Slashdot)
- How 3D Printing Will Rebuild Reality (BoingBoing) — But even though home 3D-printing has received substantial publicity of late, it is in the industrial sector where the technology will probably make its most significant near-term impact on the world both by manufacturing improved commercial products and by stimulating industry to develop next-generation fab methods and machines that could one day truly bring 3D-printing home to users in a real way.
- The Emotional Side of Big Data — Personal Democracy Forum 2013 talk by Sara Critchfield, on reframing emotion as data for decision-making. (via Quartz)