- On Being a Senior Engineer (Etsy) — Mature engineers know that no matter how complete, elegant, or superior their designs are, it won’t matter if no one wants to work alongside them because they are assholes.
- Control Theory (Coursera) — Learn about how to make mobile robots move in effective, safe, predictable, and collaborative ways using modern control theory. (via DIY Drones)
- US Moves Towards Open Access (WaPo) — Congress passed a budget that will make about half of taxpayer-funded research available to the public.
- NHS Patient Data Available for Companies to Buy (The Guardian) — Once live, organisations such as university research departments – but also insurers and drug companies – will be able to apply to the new Health and Social Care Information Centre (HSCIC) to gain access to the database, called care.data. If an application is approved then firms will have to pay to extract this information, which will be scrubbed of some personal identifiers but not enough to make the information completely anonymous – a process known as “pseudonymisation”. Recipe for disaster as it has been repeatedly shown that it’s easy to identify individuals, given enough scrubbed data. Can’t see why the NHS just doesn’t make it an app in Facebook. “Nat’s Prostate status: it’s complicated.”
ENTRIES TAGGED "Big Data"
Mature Engineering, Control Theory, Open Access USA, and UK Health Data Too-Open?
Cognition as a Service, Levy on NSA, SD-Sized Computer, and Learning Research
- Launching the Wolfram Connected Devices Project — Wolfram Alpha is cognition-as-a-service, which they hope to embed in devices. This data-powered Brain-in-the-Cloud play will pit them against Google, but G wants to own the devices and the apps and the eyeballs that watch them … interesting times ahead!
- How the USA Almost Killed the Internet (Wired) — “At first we were in an arms race with sophisticated criminals,” says Eric Grosse, Google’s head of security. “Then we found ourselves in an arms race with certain nation-state actors [with a reputation for cyberattacks]. And now we’re in an arms race with the best nation-state actors.”
- Intel Edison — SD-card sized, with low-power 22nm 400MHz Intel Quark processor with two cores, integrated Wi-Fi and Bluetooth.
- N00b 2 L33t, Now With Graphs (Tom Stafford) — open science research validating many of the findings on learning, tested experimentally via games. In the present study, we analyzed data from a very large sample (N = 854,064) of players of an online game involving rapid perception, decision making, and motor responding. Use of game data allowed us to connect, for the first time, rich details of training history with measures of performance from participants engaged for a sustained amount of time in effortful practice. We showed that lawful relations exist between practice amount and subsequent performance, and between practice spacing and subsequent performance. Our methodology allowed an in situ confirmation of results long established in the experimental literature on skill acquisition. Additionally, we showed that greater initial variation in performance is linked to higher subsequent performance, a result we link to the exploration/exploitation trade-off from the computational framework of reinforcement learning.
4.6 million phone numbers, is one of them yours?
Inside the Nest Protect, Log Structures, Predictions, and In-Memory Data Cubes
- Nest Protect Teardown (Sparkfun) — initial teardown of another piece of domestic industrial Internet.
- Logs — The distributed log can be seen as the data structure which models the problem of consensus. Not kidding when he calls it “real-time data’s unifying abstraction”.
- Mining the Web to Predict Future Events (PDF) — Mining 22 years of news stories to predict future events. (via Ben Lorica)
- Nanocubes — a fast datastructure for in-memory data cubes developed at the Information Visualization department at AT&T Labs – Research. Nanocubes can be used to explore datasets with billions of elements at interactive rates in a web browser, and in some cases it uses sufficiently little memory that you can run a nanocube in a modern-day laptop. (via Ben Lorica)
Data Pipeline, Data Driven Education, Crowdsourced Proofreading, and 3D Printed Shoes
- Suro (Github) — Netflix data pipeline service for large volumes of event data. (via Ben Lorica)
- NIPS Workshop on Data Driven Education — lots of research papers around machine learning, MOOC data, etc.
- Proofist — crowdsourced proofreading game.
- 3D-Printed Shoes (YouTube) — LeWeb talk from founder of the company, Continuum Fashion). (via Brady Forrest)
Flexible Data, Google's Bottery, GPU Assist Deep Learning, and Open Sourcing
- Google’s Seven Robotics Companies (IEEE) — The seven companies are capable of creating technologies needed to build a mobile, dexterous robot. Mr. Rubin said he was pursuing additional acquisitions. Rundown of those seven companies.
- Hebel (Github) — GPU-Accelerated Deep Learning Library in Python.
- What We Learned Open Sourcing — my eye was caught by the way they offered APIs to closed source code, found and solved performance problems, then open sourced the fixed code.
Surveillance Demarcation, NYT Data Scientist, 2D Dart, and Bayesian Database
- Reform Government Surveillance — hard not to view this as a demarcation dispute. “Ruthlessly collecting every detail of online behaviour is something we do clandestinely for advertising purposes, it shouldn’t be corrupted because of your obsession over national security!”
- Brian Abelson — Data Scientist at the New York Times, blogging what he finds. He tackles questions like what makes a news app “successful” and how might we measure it. Found via this engaging interview at the quease-makingly named Content Strategist.
- StageXL — Flash-like 2D package for Dart.
- BayesDB — lets users query the probable implications of their data as easily as a SQL database lets them query the data itself. Using the built-in Bayesian Query Language (BQL), users with no statistics training can solve basic data science problems, such as detecting predictive relationships between variables, inferring missing values, simulating probable observations, and identifying statistically similar database entries. Open source.
AI Book, Science Superstars, Engineering Ethics, and Crowdsourced Science
- Society of Mind — Marvin Minsky’s book now Creative-Commons licensed.
- Collaboration, Stars, and the Changing Organization of Science: Evidence from Evolutionary Biology — The concentration of research output is declining at the department level but increasing at the individual level. [...] We speculate that this may be due to changing patterns of collaboration, perhaps caused by the rising burden of knowledge and the falling cost of communication, both of which increase the returns to collaboration. Indeed, we report evidence that the propensity to collaborate is rising over time. (via Sciblogs)
- As Engineers, We Must Consider the Ethical Implications of our Work (The Guardian) — applies to coders and designers as well.
- Eyewire — a game to crowdsource the mapping of 3D structure of neurons.
R GUI, Drone Regulations, Bitcoin Stats, and Android/iOS Money Shootout
- Deducer — An R Graphical User Interface (GUI) for Everyone.
- Integration of Civil Unmanned Aircraft Systems (UAS) in the National Airspace System (NAS) Roadmap (PDF, FAA) — first pass at regulatory framework for drones. (via Anil Dash)
- Bitcoin Stats — $21MM traded, $15MM of electricity spent mining. Goodness. (via Steve Klabnik)
- iOS vs Android Numbers (Luke Wroblewski) — roundup comparing Android to iOS in recent commerce writeups. More Android handsets, but less revenue per download/impression/etc.