ENTRIES TAGGED "algorithms"
Real Time Exploratory Analytics, Algorithmic Agendas, Disassembly Engine, and Future of Employment
- Druid — open source clustered data store (not key-value store) for real-time exploratory analytics on large datasets.
- It’s Time to Engineer Some Filter Failure (Jon Udell) — Our filters have become so successful that we fail to notice: We don’t control them, They have agendas, and They distort our connections to people and ideas. That idea that algorithms have agendas is worth emphasising. Reality doesn’t have an agenda, but the deployer of a similarity metric has decided what features to look for, what metric they’re optimising, and what to do with the similarity data. These are all choices with an agenda.
- Capstone — open source multi-architecture disassembly engine.
- The Future of Employment (PDF) — We note that this prediction implies a truncation in the current trend towards labour market polarization, with growing employment in high and low-wage occupations, accompanied by a hollowing-out of middle-income jobs. Rather than reducing the demand for middle-income occupations, which has been the pattern over the past decades, our model predicts that computerisation will mainly substitute for low-skill and low-wage jobs in the near future. By contrast, high-skill and high-wage occupations are the least susceptible to computer capital. (via The Atlantic)
Inside the Nest Protect, Log Structures, Predictions, and In-Memory Data Cubes
- Nest Protect Teardown (Sparkfun) — initial teardown of another piece of domestic industrial Internet.
- Logs — The distributed log can be seen as the data structure which models the problem of consensus. Not kidding when he calls it “real-time data’s unifying abstraction”.
- Mining the Web to Predict Future Events (PDF) — Mining 22 years of news stories to predict future events. (via Ben Lorica)
- Nanocubes — a fast datastructure for in-memory data cubes developed at the Information Visualization department at AT&T Labs – Research. Nanocubes can be used to explore datasets with billions of elements at interactive rates in a web browser, and in some cases it uses sufficiently little memory that you can run a nanocube in a modern-day laptop. (via Ben Lorica)
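To make the Logs item above concrete, here is a minimal sketch, assuming a single-process, in-memory log. The names `Log`, `append`, `read_from`, and `replay` are hypothetical and are not taken from the linked essay. It illustrates only the core idea: replicas that apply the same ordered records arrive at the same state.

```python
class Log:
    """Append-only sequence of records, addressed by integer offset (hypothetical sketch)."""

    def __init__(self):
        self._entries = []

    def append(self, record):
        # Records are only ever added at the end; the offset is the record's position.
        self._entries.append(record)
        return len(self._entries) - 1

    def read_from(self, offset):
        # Return a copy of every record at or after the given offset, in log order.
        return list(self._entries[offset:])


def replay(log, offset=0, state=None):
    """Build key/value state by applying log records in order, starting at offset."""
    state = dict(state or {})
    for key, value in log.read_from(offset):
        state[key] = value
    return state


log = Log()
log.append(("user:1", "alice"))
log.append(("user:2", "bob"))
log.append(("user:1", "alicia"))

# Two independent replicas replaying the same log agree on the final state.
replica_a = replay(log)
replica_b = replay(log)
```

A replica that has already consumed the first two records can catch up by replaying from offset 2 with its existing state, which is why the offset is the only coordination the readers need in this sketch.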
Downloading Kindle Highlights, Balanced Photos, Long Form, and Crap Regulation
- bookcision — bookmarklet to download your Kindle highlights. (via Nelson Minar)
- Algorithm for a Perfectly Balanced Photo Gallery — remember this when it comes time to lay out your 2013 “Happy Holidays!” card.
- Long Stories (Fast Company Labs) — Our strategy was to still produce feature stories as discrete articles, but then to tie them back to the stub article with lots of prominent links, again taking advantage of the storyline and context we had built up there, making our feature stories sharper and less full of catch-up material.
- Massachusetts Software Tax (Fast Company Labs) — breakdown of why this crappily-written law is bad news for online companies. Laws are the IEDs of the Internet: it’s easy to make massively value-destroying regulation and hard to get it fixed.
Algorithmic Optimisation, 3D Scanners, Corporate Open Source, and Data Dives
- Unhappy Truckers and Other Algorithmic Problems — Even the insides of vans are subjected to a kind of routing algorithm; the next time you get a package, look for a three-letter code, like “RDL.” That means “rear door left,” and it is so the driver has to take as few steps as possible to locate the package. (via Sam Minnee)
- Fuel3D: A Sub-$1000 3D Scanner (Kickstarter) — a point-and-shoot 3D imaging system that captures extremely high resolution mesh and color information of objects. Fuel3D is the world’s first 3D scanner to combine pre-calibrated stereo cameras with photometric imaging to capture and process files in seconds.
- Corporate Open Source Anti-Patterns (YouTube) — Bryan Cantrill’s talk, slides here. (via Daniel Bachhuber)
- Hacking for Humanity (The Economist) — Getting PhDs and data specialists to donate their skills to charities is the idea behind the event’s organizer, DataKind UK, an offshoot of the American nonprofit group.
Distributed Browser-Based Computation, Streaming Regex, Preventing SQL Injections, and SVM for Faster Deep Learning
- WeevilScout — browser app that turns your browser into a worker for distributed computation tasks. See the poster (PDF). (via Ben Lorica)
- sregex (Github) — A non-backtracking regex engine library for large data streams. See also slide notes from a YAPC::NA talk. (via Ivan Ristic)
- Bobby Tables — a guide to preventing SQL injections. (via Andy Lester)
- Deep Learning Using Support Vector Machines (Arxiv) — we are proposing to train all layers of the deep networks by backpropagating gradients through the top level SVM, learning features of all layers. Our experiments show that simply replacing softmax with linear SVMs gives significant gains on datasets MNIST, CIFAR-10, and the ICML 2013 Representation Learning Workshop’s face expression recognition challenge. (via Olivier Grisel)
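For the Bobby Tables item above, the usual advice is to pass user input as bound parameters and never paste it into the SQL string. Here is a minimal sketch, assuming Python's standard `sqlite3` module and a made-up `students` table; the helper names `find_student` and `demo` are hypothetical and do not come from the guide itself.

```python
import sqlite3


def find_student(conn, name):
    # The "?" placeholder sends the value separately from the SQL text,
    # so quotes or semicolons in `name` are treated as plain data.
    cur = conn.execute("SELECT name FROM students WHERE name = ?", (name,))
    return [row[0] for row in cur.fetchall()]


def demo():
    # In-memory database with a hypothetical table, for illustration only.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE students (name TEXT)")
    conn.executemany(
        "INSERT INTO students (name) VALUES (?)",
        [("Alice",), ("Robert",)],
    )

    # Input shaped like the classic injection attempt.
    hostile = "Robert'); DROP TABLE students;--"
    matches = find_student(conn, hostile)

    # The table is still there and still holds both rows.
    remaining = conn.execute("SELECT COUNT(*) FROM students").fetchone()[0]
    conn.close()
    return matches, remaining
```

Calling `demo()` runs the hostile string through the parameterised query: nothing matches, and the table is left untouched. Other database drivers use different placeholder styles, so check the one you use.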
Comparing Algorithms, Programming & Visual Arts, Data Brokers, and Your Brain on Ebooks
- mlcomp — a free website for objectively comparing machine learning programs across various datasets for multiple problem domains.
- Printing Code: Programming and the Visual Arts (Vimeo) — Rune Madsen’s talk from Heroku’s Waza. (via Andrew Odewahn)
- What Data Brokers Know About You (ProPublica) — excellent run-down on the compilers of big data about us. Where are they getting all this info? The stores where you shop sell it to them.
- Subjective Impressions Do Not Mirror Online Reading Effort: Concurrent EEG-Eyetracking Evidence from the Reading of Books and Digital Media (PLOS ONE) — Comprehension accuracy did not differ across the three media for either group and EEG and eye fixations were the same. Yet readers stated they preferred paper. That preference, the authors conclude, isn’t because it’s less readable. From this perspective, the subjective ratings of our participants (and those in previous studies) may be viewed as attitudes within a period of cultural change.
Comms 101, RoboTurking, Geek Tourism, and Implementing Papers
- How to Redesign Your App Without Pissing Everybody Off (Anil Dash) — the basic straightforward stuff that gets your users on-side. Anil’s making a career out of being an adult.
- Clockwork Raven (Twitter) — open source project to send data analysis tasks to Mechanical Turkers.
- Updates from the Tour in China (Bunnie Huang) — my dream geek tourism trip: going around Chinese factories and bazaars with MIT geeks.
- How to Implement an Algorithm from a Scientific Paper — I have implemented many complex algorithms from books and scientific publications, and this article sums up what I have learned while searching, reading, coding and debugging. (via Siah)