- Machine Learning Done Wrong — [M]ost practitioners pick the modeling algorithm they are most familiar with rather than pick the one which best suits the data. In this post, I would like to share some common mistakes (the don’t-s).
- Bandits for Recommendations — A common problem for internet-based companies is: which piece of content should we display? Google has this problem (which ad to show), Facebook has this problem (which friend’s post to show), and RichRelevance has this problem (which product recommendation to show). Many of the promising solutions come from the study of the multi-armed bandit problem.
- Droplets — the Droplet is almost spherical, can self-right after being poured out of a bucket, and has the hardware capabilities to organize into complex shapes with its neighbors due to accurate range and bearing. Droplets are available open-source and use cheap vibration motors and a 3D printed shell. (via Robohub)
- Apple’s App Store Approval Guidelines — some of the plainest English I’ve seen, especially the Introduction. I can only aspire to that clarity. “If your App looks like it was cobbled together in a few days, or you’re trying to get your first practice App into the store to impress your friends, please brace yourself for rejection. We have lots of serious developers who don’t want their quality Apps to be surrounded by amateur hour.”
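For readers who have not met the multi-armed bandit problem mentioned in the recommendations link, here is a minimal sketch of one simple strategy, epsilon-greedy. It is an illustration only, not the method described in the linked post: the class name, the `simulate` helper, and the click rates are all made up for the example. Each "arm" stands for one piece of content, and the reward is 1 when a simulated user clicks.

```python
import random


class EpsilonGreedy:
    """Epsilon-greedy bandit: mostly show the best-known arm, sometimes explore."""

    def __init__(self, n_arms, epsilon=0.1, rng=None):
        self.epsilon = epsilon
        self.counts = [0] * n_arms    # how many times each arm was shown
        self.values = [0.0] * n_arms  # running mean reward per arm
        self.rng = rng or random.Random()

    def select(self):
        # With probability epsilon, explore a uniformly random arm.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        # Otherwise exploit the arm with the highest estimated reward
        # (ties go to the lowest index).
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, arm, reward):
        # Incremental update of the mean reward for the chosen arm.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def simulate(click_rates, rounds=10000, epsilon=0.1, seed=0):
    """Run the bandit against hypothetical, fixed click-through rates."""
    rng = random.Random(seed)
    bandit = EpsilonGreedy(len(click_rates), epsilon, rng)
    for _ in range(rounds):
        arm = bandit.select()
        reward = 1.0 if rng.random() < click_rates[arm] else 0.0
        bandit.update(arm, reward)
    return bandit


if __name__ == "__main__":
    # Assumed click rates for two pieces of content; arm 1 is clearly better.
    result = simulate([0.05, 0.5])
    print("times shown:", result.counts)
    print("estimated click rates:", [round(v, 3) for v in result.values])
```

The trade-off is controlled by `epsilon`: a larger value spends more impressions exploring weaker content, a smaller one risks settling on a poor choice early. Production recommenders typically use more refined strategies, so treat this as a starting point for intuition.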
Machine Learning Mistakes, Recommendation Bandits, Droplet Robots, and Plain English
The collision of software and hardware has broken down the barriers between the digital and physical worlds.
Note: this post is a slightly hydrated version of my Solid keynote. To get it out in 10 minutes, I had to remove a few ideas and streamline it a bit for oral delivery; this is the full version.
In 1995, Nicholas Negroponte told us to forget about the atoms and focus on the bits. I think “being digital” was probably an intentional overstatement, a provocation to shove our thinking off of its metastable emphasis on the physical, to open us up to the power of the purely digital. And maybe it worked too well, because a lot of us spent two decades plumbing every possibility of digital-only technologies and digital-only businesses.
By then, technology had bifurcated into two streams of hardware and software that rarely converged outside of the data center, and for most of us, unless we were with a firm the size of Sony, with a huge addressable market, hardware was simply outside the scope of our entrepreneurial ambitions. It was our platform, but rarely our product. The physical world was for other people to worry about. We had become by then the engineers of the ephemeral, the plastic, and the immaterial. And in the depth of our immersion into the virtual and digital, we became, it seems, citizens of Weblandia (and congregants of the Church of Disruption).
But pendulums always swing back.
Filesharing Box, Realised Dystopias, Spam Ecosystem Research, and Technical Interviews
- PirateBox 1.0 — turns a wireless router into a filesharing joy. Among other things, v1.0 has a responsive UI for use on tablets and phones.
- Dystopia Tracker — keep on top of which scifi dystopic predictions have been realised. I’d like filters for incubators, investors, and BigCos so you can see who is investing in dystopia.
- The Harvester, the Botmaster, and the Spammer (PDF) — research paper on the spam supply chain.
- Technical Interviewing (Moishe Lettvin) — lessons learned from conducting >250 technical interviews at Google. Why do I care? Chances are, your technical interviews suck so you’re hiring poorly.
Video Transparency, Software Traffic, Distributed Database, and Open Source Sustainability
- Video Quality Report — transparency is a great way to indirectly exert leverage.
- Control Your Traffic Flows with Software — using BGP to balance traffic. Will be interesting to see how the more extreme traffic managers deploy SDN in the data center.
- Cockroach — a distributed key/value datastore which supports ACID transactional semantics and versioned values as first-class features. The primary design goal is global consistency and survivability, hence the name. Cockroach aims to tolerate disk, machine, rack, and even datacenter failures with minimal latency disruption and no manual intervention. Cockroach nodes are symmetric; a design goal is one binary with minimal configuration and no required auxiliary services.
- Linux Foundation Providing for Core Infrastructure Projects — press release, but interested in how they’re tackling sustainability—they’re taking on identifying worthies (glad I’m not the one who says “you’re not worthy” to a project) and being the non-profit conduit for the dosh. Interesting: implies they think the reason companies weren’t supporting necessary open source projects was some combination of being unsure who to support (projects you use, surely?) and how to get them money (ask?). (Sustainability of open source projects is a pet interest of mine)
Modern Software Development, Internet Trends, Software Ethics, and Open Government Data
- Beyond the Stack (Mike Loukides) — tools and processes to support software developers who are as massively distributed as the code they build.
- Mary Meeker’s Internet Trends 2014 (PDF) — the changes on slide 34 are interesting: usage moving away from G+/Facebook-style omniblather creepware and towards phonebook-based chat apps.
- Introduction to Software Engineering Ethics (PDF) — amazing set of provocative questions and scenarios for software engineers about the decisions they made and consequences of their actions. From a course in ethics from SCU.
- Open Government Data Online: Impenetrable (Guardian) — Too much knowledge gets trapped in multi-page pdf files that are slow to download (especially in low-bandwidth areas), costly to print, and unavailable for computer analysis until someone manually or automatically extracts the raw data.
The tools in the Distributed Developer's Stack make development manageable in a highly distributed environment.
The shape of software development has changed radically in the last two decades. We’ve seen many changes: the Internet, the web, virtualization, and cloud computing. All of these changes point toward a fundamental new reality: all computing has become distributed computing. The age of standalone applications has disappeared, and applications that run on a single computer are almost inconceivable. Distributed is the default; and whether an application is running on Amazon Web Services (AWS), on a private cloud, or even on a desktop or a mobile phone, it depends on the behavior of other systems and services that aren’t under the developer’s control.
In the past few years, a new toolset has grown up to support the development of massively distributed applications. We call this new toolset the Distributed Developer’s Stack (DDS). It is orthogonal to the more traditional world of servers, frameworks, and operating systems; it isn’t a replacement for the aged LAMP stack, but a set of tools to make development manageable in a highly distributed environment.
The DDS is more of a meta-stack than a “stack” in the traditional sense. It’s not prescriptive; we don’t care whether you use AWS or OpenStack, whether you use Git or Mercurial. We do care that you develop for the cloud, and that you use a distributed version control system. The DDS is about the requirements for working effectively in the second decade of the 21st century. The specific tools have evolved, and will continue to evolve, and we expect you to evolve, too.