- The Remixing Dilemma — summary of research on remixed projects, finding that (1) Projects with moderate amounts of code are remixed more often than either very simple or very complex projects. (2) Projects by more prominent creators are more generative. (3) Remixes are more likely to attract remixers than de novo projects.
- Scratch 2.0 — my favourite first programming language for kids and adults, now in the browser! Downloadable version for offline use coming soon. See the overview for what’s new.
- State Dept Takedown on 3D-Printed Gun (Forbes) — The government says it wants to review the files for compliance with arms export control laws known as the International Traffic in Arms Regulations, or ITAR. By uploading the weapons files to the Internet and allowing them to be downloaded abroad, the letter implies Wilson’s high-tech gun group may have violated those export controls.
- Data Science of the Facebook World (Stephen Wolfram) — More than a million people have now used our Wolfram|Alpha Personal Analytics for Facebook. And as part of our latest update, in addition to collecting some anonymized statistics, we launched a Data Donor program that allows people to contribute detailed data to us for research purposes. A few weeks ago we decided to start analyzing all this data… (via Phil Earnhardt)
ENTRIES TAGGED "Big Data"
Making sense of the hype-cycle scuffle.
Ideas on avoiding the data science equivalent of "repair-ware."
Processing for Illustrator, Archiving Tools, Sweet Retro Art, and More Database Tools
- Drawscript — Processing for Illustrator. (via BERG London)
- Archive Team Warrior — a virtual archiving appliance. You can run it to help with the ArchiveTeam archiving efforts. It will download sites and upload them to our archive. (via Ed Vielmetti)
- Retro Vectors — royalty-free and free of charge.
- TokuDB Goes Open Source — Tokutek's high-performance, transactional storage engine for MySQL and MariaDB. See the announcement.
Sterling on Disruption, Coding Crypto Fun, Distributed File System, and Asset Packaging
- Bruce Sterling on Disruption — If more computation, and more networking, was going to make the world prosperous, we’d be living in a prosperous world. And we’re not. Obviously we’re living in a Depression. Slow first 25% but then it takes fire and burns with the heat of a thousand Sun Microsystems flaming out. You must read this now.
- The Matasano Crypto Challenges (Maciej Ceglowski) — To my delight, though, I was able to get through the entire sequence. It took diligence, coffee, and a lot of graph paper, but the problems were tractable. And having completed them, I’ve become convinced that anyone whose job it is to run a production website should try them, particularly if you have no experience with application security. Since the challenges aren’t really documented anywhere, I wanted to describe what they’re like in the hopes of persuading busy people to take the plunge.
- Tachyon — a fault-tolerant distributed file system enabling reliable file sharing at memory speed across cluster frameworks, such as Spark and MapReduce. Berkeley-licensed open source.
Bitcoin Bundle, HTML Escaping, Open as in Gongkai, and Glass Reflections
- The Well Deserved Fortune of Satoshi Nakamoto — I can’t assure with 100% certainty that all the black dots are owned by Satoshi, but almost all are owned by a single entity, and that entity began mining right from block 1, and with the same performance as the genesis block. It can be identified by constant slope segments that occasionally restart. Also this entity is the only entity that has shown complete trust in Bitcoin, since it hasn’t spent any coins (as far as the eye can see). I estimate at eyesight that Satoshi’s fortune is around 1M Bitcoins, or 100M USD at current exchange rate. Author’s credible. (via Hacker News)
- Houdini (GitHub) — C library for escaping and unescaping UTF-8-encoded HTML, according to OWASP guidelines.
- The $12 Gongkai Phone (Bunnie Huang) — gongkai isn’t a totally lawless free-for-all. It’s a network of ideas, spread peer-to-peer, with certain rules to enforce sharing and to prevent leeching. It’s very different from Western IP concepts, but I’m trying to have an open mind about it.
- Jan Chipchase on Google Glass (All Things D) — Any idiot can collect data. The real issue is how to collect data in such a way that meets both moral and legal obligations and still delivers some form of value. An interesting observation, one of many within this overview of the usability and third-party user experience of Google Glass-like UIs.
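To make the Houdini entry above concrete, here is a minimal Python sketch of what OWASP-style HTML escaping involves. It is not Houdini's API (Houdini is a C library); `escape_html` and `_ESCAPES` are hypothetical names, and the six-character set below is an assumption based on the characters OWASP's XSS guidance has recommended escaping in HTML body content.

```python
import html

# Assumed character set: the six characters OWASP's XSS guidance has
# recommended escaping in HTML body content. Hypothetical helper, not Houdini.
_ESCAPES = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#x27;",
    "/": "&#x2F;",
}


def escape_html(text: str) -> str:
    """Replace each sensitive character with its HTML entity."""
    return "".join(_ESCAPES.get(ch, ch) for ch in text)


if __name__ == "__main__":
    raw = "<script>alert('hi')</script>"
    safe = escape_html(raw)
    print(safe)
    # The standard library's html.unescape reverses the escaping.
    print(html.unescape(safe) == raw)
```

Escaping one character at a time means the `&` in an already-emitted entity is never escaped twice, which is the usual pitfall of chained string replacements.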
Email Triage, Pulse Detection, Big Building Data, and Raspberryduino Ardpi
- Triage — iPhone app to quickly triage your email in your downtime. See also the backstory. Awesome UI.
- Webcam Pulse Detector — I was wondering how long it would take someone to do the Eulerian video magnification in real code. Now I’m wondering how long it will take the patent-inspired takedown…
- How Microsoft Quietly Built the City of the Future — The team now collects 500 million data transactions every 24 hours, and the smart buildings software presents engineers with prioritized lists of misbehaving equipment. Algorithms can balance out the cost of a fix in terms of money and energy being wasted with other factors such as how much impact fixing it will have on employees who work in that building. Because of that kind of analysis, a lower-cost problem in a research lab with critical operations may rank higher priority-wise than a higher-cost fix that directly affects few. Almost half of the issues the system identifies can be corrected in under a minute, Smith says.
- UDOO (Kickstarter) — mini PC that could run either Android or Linux, with an Arduino-compatible board embedded. Like a faster Raspberry Pi but with Arduino Due-compatible I/O.
Wikileaks Code, Account Afterlife, Digital in Museums, and Companies and Conferences
- Wikileaks ProjectK Code (GitHub) — open-sourced map and graph modules behind the Wikileaks code serving Kissinger-era cables. (via Journalism++)
- Plan Your Digital Afterlife With Inactive Account Manager — you can choose to have your data deleted — after three, six, nine or 12 months of inactivity. Or you can select trusted contacts to receive data from some or all of the following services: +1s; Blogger; Contacts and Circles; Drive; Gmail; Google+ Profiles, Pages and Streams; Picasa Web Albums; Google Voice and YouTube. Before our systems take any action, we’ll first warn you by sending a text message to your cellphone and email to the secondary address you’ve provided. (via Chris Heathcote)
- Leo Caillard: Art Games — Caillard’s images show museum patrons interacting with priceless paintings the way someone might browse through slides in a personal iTunes library on a device like an iPhone or MacBook. Playful and thought-provoking. (via Beta Knowledge)
- Lanyrd Pro — helping companies keep track of which events their engineers speak at, so they can avoid duplication and have maximum opportunity to promote them. First paid product from ETecher and Foo Simon Willison’s startup.
Expanded rules for data sharing in the U.S. government will need more oversight as predictive algorithms are applied.
Binary Data Is Back, Scala Data, Visualization Grammar, and Pastebin Monitor
- Cap’n Proto — open source faster protocol buffers (binary data interchange format and RPC system).
- Saddle — a high-performance data manipulation library for Scala.
- Vega — a visualization grammar, a declarative format for creating, saving and sharing visualization designs. (via Flowing Data)
- dumpmon — Twitter bot that monitors paste sites for password dumps and other sensitive information. Source on GitHub, see the announcement for more.