ENTRIES TAGGED "Big Data"
3D Visualization, Printing On Any Surface, Rebuilding Reality, and Emotions as Data
- For Example — amazing discussion of 3D visualization techniques, full of examples using the D3.js library and bl.ocks.org example gist system. Gorgeous and informative.
- Anti-Gravity 3D Printer — uses strands to sculpt on any surface. (via Slashdot)
- How 3D Printing Will Rebuild Reality (BoingBoing) — But even though home 3D-printing has received substantial publicity of late, it is in the industrial sector where the technology will probably make its most significant near-term impact on the world both by manufacturing improved commercial products and by stimulating industry to develop next-generation fab methods and machines that could one day truly bring 3D-printing home to users in a real way.
- The Emotional Side of Big Data — Personal Democracy Forum 2013 talk by Sara Critchfield, on reframing emotion as data for decision-making. (via Quartz)
Open Source BigTable, Robots Lost, Changing the World, Secrecy Binge
- Accumulo — NSA’s BigTable implementation, released as an Apache project.
- How the Robots Lost (Business Week) — the decline of high-frequency trading profits (basically, markets worked and imbalances in speed and knowledge have been corrected). Notable for the regulators getting access to the technology that the traders had: Last fall the SEC said it would pay Tradeworx, a high-frequency trading firm, $2.5 million to use its data collection system as the basic platform for a new surveillance operation. Code-named Midas (Market Information Data Analytics System), it scours the market for data from all 13 public exchanges. Midas went live in February. The SEC can now detect anomalous situations in the market, such as a trader spamming an exchange with thousands of fake orders, before they show up on blogs like Nanex and ZeroHedge. If Midas sees something odd, Berman’s team can look at trading data on a deeper level, millisecond by millisecond.
- PRISM: Surprised? (Danny O’Brien) — I really don’t agree with the people who think “We don’t have the collective will”, as though there’s some magical way things got done in the past when everyone was in accord and surprised all the time. It’s always hard work to change the world. Endless, dull hard work. Ten years later, when you’ve freed the slaves or beat the Nazis everyone is like “WHY CAN’T IT BE AS EASY TO CHANGE THIS AS THAT WAS, BACK IN THE GOOD OLD DAYS. I GUESS WE’RE ALL JUST SHEEPLE THESE DAYS.”
- What We Don’t Know About Spying on Citizens is Scarier Than What We Do Know (Bruce Schneier) — The U.S. government is on a secrecy binge. It overclassifies more information than ever. And we learn, again and again, that our government regularly classifies things not because they need to be secret, but because their release would be embarrassing. Open source BigTable implementation: free. Data gathering operation around it: $20M/year. Irony in having the extent of authoritarian Big Brother government secrecy questioned just as a whistleblower’s military trial is held “off the record”: priceless.
It's not the data itself but what you do with it that counts.
Distributed Browser-Based Computation, Streaming Regex, Preventing SQL Injections, and SVM for Faster Deep Learning
- WeevilScout — browser app that turns your browser into a worker for distributed computation tasks. See the poster (PDF). (via Ben Lorica)
- sregex (GitHub) — A non-backtracking regex engine library for large data streams. See also slide notes from a YAPC::NA talk. (via Ivan Ristic)
- Bobby Tables — a guide to preventing SQL injections. (via Andy Lester)
- Deep Learning Using Support Vector Machines (arXiv) — we are proposing to train all layers of the deep networks by backpropagating gradients through the top level SVM, learning features of all layers. Our experiments show that simply replacing softmax with linear SVMs gives significant gains on datasets MNIST, CIFAR-10, and the ICML 2013 Representation Learning Workshop’s face expression recognition challenge. (via Olivier Grisel)
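For a quick feel of what the Bobby Tables guide is about, here is a minimal sketch in Python using the standard library's `sqlite3` module. The `students` table, the function names, and the hostile input are all made up for illustration and are not taken from the guide. The sketch contrasts splicing user input into the SQL text with passing it as a bound parameter.

```python
import sqlite3


def make_demo_db():
    """Build a throwaway in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO students (name) VALUES (?)", [("Alice",), ("Bob",)]
    )
    conn.commit()
    return conn


def find_student_unsafe(conn, name):
    # Vulnerable: the input becomes part of the SQL statement itself,
    # so a crafted value can change what the query means.
    query = "SELECT id, name FROM students WHERE name = '%s'" % name
    return conn.execute(query).fetchall()


def find_student_safe(conn, name):
    # Parameterized: the `?` placeholder makes sqlite3 treat `name`
    # strictly as a value, never as SQL syntax.
    return conn.execute(
        "SELECT id, name FROM students WHERE name = ?", (name,)
    ).fetchall()


if __name__ == "__main__":
    conn = make_demo_db()
    hostile = "nobody' OR '1'='1"
    print(find_student_unsafe(conn, hostile))  # matches every row
    print(find_student_safe(conn, hostile))    # matches nothing
```

The same idea carries over to other drivers, although the placeholder syntax varies (`?`, `%s`, or named parameters), so check the documentation for whichever library you use.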
Skepticism isn't a blanket rejection of data; it's central to understanding data.
Making sense of the hype-cycle scuffle.
Ideas on avoiding the data science equivalent of "repair-ware."
Processing for Illustrator, Archiving Tools, Sweet Retro Art, and More Database Tools
- Drawscript — Processing for Illustrator. (via BERG London)
- Archive Team Warrior — a virtual archiving appliance. You can run it to help with the ArchiveTeam archiving efforts. It will download sites and upload them to our archive. (via Ed Vielmetti)
- Retro Vectors — royalty-free and free of charge.
- TokuDB Goes Open Source — Tokutek's high-performance, transactional storage engine for MySQL and MariaDB. See the announcement.
Sterling on Disruption, Coding Crypto Fun, Distributed File System, and Asset Packaging
- Bruce Sterling on Disruption — If more computation, and more networking, was going to make the world prosperous, we’d be living in a prosperous world. And we’re not. Obviously we’re living in a Depression. Slow first 25% but then it takes fire and burns with the heat of a thousand Sun Microsystems flaming out. You must read this now.
- The Matasano Crypto Challenges (Maciej Ceglowski) — To my delight, though, I was able to get through the entire sequence. It took diligence, coffee, and a lot of graph paper, but the problems were tractable. And having completed them, I’ve become convinced that anyone whose job it is to run a production website should try them, particularly if you have no experience with application security. Since the challenges aren’t really documented anywhere, I wanted to describe what they’re like in the hopes of persuading busy people to take the plunge.
- Tachyon — a fault tolerant distributed file system enabling reliable file sharing at memory-speed across cluster frameworks, such as Spark and MapReduce. Berkeley-licensed open source.