Elasticsearch SQL — Query elasticsearch using familiar SQL syntax. You can also use ES functions in SQL. Apache2-licensed.
In Communist China, Tinder Screws You — Chinese Tinder clone Tantan is endangering young women and men by failing to use encryption and exposing private data like that made public in the Ashley Madison hack.
The Advertising Bubble (Maciej Ceglowski) — This is an article-length ad (1) targeted at companies selling software (2) to advertising startups (3) sellling their own ads (4) God knows where, possibly to some publishing startup burning through your grandmother’s pension fund (5,6,7,8). There’s an ad bubble. It’s gonna blow.
Fortran for LLVM — The U.S. Department of Energy’s National Nuclear Security Administration (NNSA) and its three national labs today announced they have reached an agreement with NVIDIA’s PGI® software to create an open source Fortran compiler designed for integration with the widely used LLVM compiler infrastructure. Rumor has it the nuclear labs will defer implementation of READ DRUM to later generations.
GIF It Up — very clever remix campaign to use heritage content—Friday is your last day to enter this year’s contest, so get creating! My favourite.
Uber’s Drivers: Information Asymmetries and Control in Dynamic Work — Our conclusions are two-fold: first, that the information asymmetries produced by Uber’s system are fundamental to its ability to structure indirect control over its workers; and second, that Uber relies heavily on the evolving rhetoric of the algorithm to justify these information asymmetries to drivers, riders, as well as regulators and outlets of public opinion.
ANNABELL — unsupervised language learning using artificial neural networks, install your own four year old. The paper explains how.
Spinnaker — an open source, multi-cloud continuous delivery platform for releasing software changes with high velocity and confidence.
Hospital Hacking (Bloomberg) — interesting for both lax regulation (“The FDA seems to literally be waiting for someone to be killed before they can say, ‘OK, yeah, this is something we need to worry about,’ ” Rios says.) and the extent of the problem (Last fall, analysts with TrapX Security, a firm based in San Mateo, Calif., began installing software in more than 60 hospitals to trace medical device hacks. […] After six months, TrapX concluded that all of the hospitals contained medical devices that had been infected by malware.). It may take a Vice President’s defibrillator being hacked for things to change. Or would anybody notice?
TensorFlow — Google released, as open source, their distributed machine learning system. The DataFlow programming framework is sweet, and the documentation is gorgeous. AMAZINGLY high-quality, sets the bar for any project. This may be 2015’s most important software release.
TensorFlow White Paper (PDF) — Compared to DistBelief [G’s first scalable distributed inference and training system], TensorFlow’s programming model is more flexible, its performance is significantly better, and it supports training and using a broader range of models on a wider variety of heterogeneous hardware platforms.
Neural Networks With Few Multiplications — paper with a method to eliminate most of the time-consuming floating point multiplications needed to update the intermediate virtual neurons as they learn. Speed has been one of the bugbears of deep neural networks.
Cybersecurity as RealPolitik — Dan Geer’s excellent talk from 2014 BlackHat. When younger people ask my advice on what they should do or study to make a career in cyber security, I can only advise specialization. Those of us who were in the game early enough and who have managed to retain an over-arching generalist knowledge can’t be replaced very easily because while absorbing most new information most of the time may have been possible when we began practice, no person starting from scratch can do that now. Serial specialization is now all that can be done in any practical way. Just looking at the Black Hat program will confirm that being really good at any one of the many topics presented here all but requires shutting out the demands of being good at any others.
Security and the Linux Kernel (WaPo) — the question is not “can the WaPo write intelligently about the Linux kernel and security?” (answer, by the way, is “yes”) but rather “why is the WaPo writing about Linux kernel and security?” Ladies and gentlemen, start your conspiracy engines.
TPP Might Prevent Governments from Auditing Source Code (Wired) — Article 14.17 of proposal, published at last today after years of secret negotiations, says: “No Party shall require the transfer of, or access to, source code of software owned by a person of another Party, as a condition for the import, distribution, sale or use of such software, or of products containing such software, in its territory.” The proposal includes an exception for critical infrastructure, but it’s not clear whether software involved in life or death situations, such as cars, airplanes, or medical devices would be included. One of many “what the heck does this mean for us?” analyses coming out. I’m waiting a few days until the analyses shake out before I get anything in a tangle.
Taiga — open source agile software project management tool (backlog, kanban, tasks, sprints, burndown charts, that sort of thing). (via Jef Vratny)
Confidant — a secret management system, for AWS, from Lyft. If you build services that need to talk to each other, it quickly gets difficult to distribute and manage permissions to those services. So, naturally, the solution is to add another service. (In accordance with the Fundamental Theorem of Computer Science.)
Gmail Suggesting Replies — In developing Smart Reply, we adhered to the same rigorous user privacy standards we’ve always held — in other words, no humans reading your email. This means researchers have to get machine learning to work on a data set that they themselves cannot read, which is a little like trying to solve a puzzle while blindfolded — but a challenge makes it more interesting!
The Selective Laziness of Reasoning — Among those participants who accepted the manipulation and thus thought they were evaluating someone else’s argument, more than half (56% and 58%) rejected the arguments that were in fact their own. Moreover, participants were more likely to reject their own arguments for invalid than for valid answers. This demonstrates that people are more critical of other people’s arguments than of their own, without being overly critical: They are better able to tell valid from invalid arguments when the arguments are someone else’s rather than their own.
Anti-Caching (PDF) — paper outlining a clever reframing of the database strategy of keeping frequently accessed things in-memory, namely pushing to disk the things that won’t be accessed … aka, “anti-caching.”
The Rating Game (Verge) — Until companies release ratings data, we can’t know for certain whether this is true, but a study of Airbnb users found that black hosts get less money for similar listings than white hosts, and another study found that white taxi drivers get higher tips than black ones. There’s no reason such biases wouldn’t carry over to ratings.
Singa — Apache distributed deep learning platform turns 1.0.
Australia Floating the Idea of Cloud Passports — Under a cloud passport, a traveller’s identity and biometrics data would be stored in a cloud, so passengers would no longer need to carry their passports and risk having them lost or stolen. That sound you hear is Taylor Swift on Security, quoting “Wildest Dreams” into her vodka and Tang: “I can see the end as it begins.” This article is also notable for The idea of cloud passports is the result of a hipster-style-hackathon.
Jupyter — Python Notebooks that allows you to create and share documents that contain live code, equations, visualizations, and explanatory text. Uses include: data cleaning and transformation, numerical simulation, statistical modeling, machine learning, and much more.
Telcos $24B Business In Your Data — Under the radar, Verizon, Sprint, Telefonica, and other carriers have partnered with firms including SAP, IBM, HP, and AirSage to manage, package, and sell various levels of data to marketers and other clients. It’s all part of a push by the world’s largest phone operators to counteract diminishing subscriber growth through new business ventures that tap into the data that showers from consumers’ mobile Web surfing, text messaging, and phone calls. Even if you do pay for it, you’re still the product.
Introducing Agate — a Python data analysis library designed to be useable by non-data-scientists, so leads to readable and predictable code. Target market: data journalists.