"testing" entries

Four short links: 21 May 2014

Funnel Tool, Security Tools, Inside Mac Malware, and Everything is Broken

  1. EventHub — open source funnel/cohort/A/B analysis tool. (A minimal sketch of the funnel computation follows this list.)
  2. Mantra — a collection of free/open source security tools, integrated into a browser (Firefox or Chromium).
  3. Reverse Engineering Mac Malware (PDF) — fascinating to see how it’s shipped, bundled, packaged, and distributed.
  4. Everything is Broken (Quinn Norton) — Computers have gotten incredibly complex, while people have remained the same gray mud with pretensions of Godhood. Today’s required read, because everything is broken and it’s the defining characteristic of this age of software. We have built computers in our image: our cancerous STD-addled diabetic alcoholic lead-sniffing telomere-decaying bacteria- and virus-addled image.
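
EventHub's own interfaces aren't reproduced here, but the funnel computation at the heart of tools like item 1 is small enough to sketch: walk each user's event stream in time order and count how many users reach each successive step. A minimal Python sketch, with the event data and step names invented for illustration:

    # Hypothetical sketch of a funnel report: given (user, event) pairs in
    # time order, count how many users complete each ordered step.
    from collections import defaultdict

    events = [  # toy data, invented for illustration
        ("u1", "visit"), ("u1", "signup"), ("u1", "purchase"),
        ("u2", "visit"), ("u2", "signup"),
        ("u3", "visit"),
    ]

    def funnel(events, steps):
        next_step = defaultdict(int)      # user -> index of next step to hit
        counts = [0] * len(steps)
        for user, name in events:
            i = next_step[user]
            if i < len(steps) and name == steps[i]:
                counts[i] += 1
                next_step[user] = i + 1
        return counts

    print(funnel(events, ["visit", "signup", "purchase"]))  # -> [3, 2, 1]

Cohort and A/B views are mostly the same walk with an extra group-by key (signup week, experiment arm).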
Four short links: 30 April 2014

Critical Making, Torrent Filesystem, Testing Infrastructure, and Reproducible Research

  1. Critical Making — essays from 70 contributors looking at the politics, choices, and ethics of a lot of the makery going on.
  2. torrent-mount — mount a torrent as a filesystem in real time using JavaScript. (via Joe McCann)
  3. Continuous Integration for Infrastructure — slides on the emerging tools for large-scale automated testing integrated into development and deployment workflow.
  4. Implementing Reproducible Research — book by Victoria Stodden and Johanna Cohoon on tools, practices, and platforms for making science that others can verify (another step in improving velocity and quality of scientific research).
Four short links: 28 April 2014

Retail Student Data, Hacking Hospitals, Testing APIs, and Becoming Superhuman

  1. UK Government to Sell Its Students’ Data (Wired UK) — The National Pupil Database (NPD) contains detailed information about pupils in schools and colleges in England, including test and exam results, progression at each key stage, gender, ethnicity, pupil absence and exclusions, special educational needs, first language. The UK is becoming patient zero for national data self-harm.
  2. It’s Insanely Easy to Hack Hospital Equipment (Wired) — Erven won’t identify specific product brands that are vulnerable because he’s still trying to get some of the problems fixed. But he said a wide cross-section of devices shared a handful of common security holes, including lack of authentication to access or manipulate the equipment; weak passwords or default and hardcoded vendor passwords like “admin” or “1234”; and embedded web servers and administrative interfaces that make it easy to identify and manipulate devices once an attacker finds them on a network.
  3. Postman — API testing tool.
  4. App Controlled Hearing Aid Improves Even Normal Hearing (NYTimes) — It’s only a slight exaggeration to say that the latest crop of advanced hearing aids are better than the ears most of us were born with. Human augmentation with software and hardware.
Four short links: 24 February 2014

Your Brain on Code, Internet of Compromised Things, Waiting for Wearables, and A/B Illusions

  1. Understanding Understanding Source Code with Functional Magnetic Resonance Imaging (PDF) — we observed 17 participants inside an fMRI scanner while they were comprehending short source-code snippets, which we contrasted with locating syntax errors. We found a clear, distinct activation pattern of five brain regions, which are related to working memory, attention, and language processing. I’m wary of fMRI studies but welcome more studies that try to identify what we do when we code. (Or, in this case, what we do when we identify syntax errors; if they wanted to observe real programming, they’d watch subjects creating syntax errors.) (via Slashdot)
  2. Oobleck Security (O’Reilly Radar) — if you missed or skimmed this, go back and reread it. The future will be defined by the objects that turn on us. 50s scifi was so close but instead of human-shaped positronic robots, it’ll be our cars, HVAC systems, light bulbs, and TVs. Reminds me of the excellent Old Paint by Megan Lindholm.
  3. Google Readying Android Watch — just as Samsung moves away from Android for smart watches and I buy my wife and myself a Pebble watch each for our anniversary. Watches are in the same space as Goggles and other wearables: solutions hunting for a problem, a use case, a killer tap. “OK Google, show me offers from brands I love near me” isn’t it (and is a low-lying operating system function anyway, not a userland command).
  4. Most Winning A/B Test Results are Illusory (PDF) — Statisticians have known for almost a hundred years how to ensure that experimenters don’t get misled by their experiments […] I’ll show how these methods ensure equally robust results when applied to A/B testing.
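
The failure mode that paper describes is easy to demonstrate with a toy simulation: run an A/A test (two identical variants), peek at a two-sided z-test after every batch of traffic, and stop the moment p < 0.05. A Python sketch, with all traffic numbers invented for illustration:

    # Simulate "peeking": stop an A/B test the first time it looks significant.
    # Both arms convert at the same 5% rate, so every "winner" is a false positive.
    import math, random

    random.seed(1)

    def z_stat(conv_a, n_a, conv_b, n_b):
        p = (conv_a + conv_b) / (n_a + n_b)                # pooled conversion rate
        se = math.sqrt(p * (1 - p) * (1 / n_a + 1 / n_b))
        return (conv_a / n_a - conv_b / n_b) / se if se else 0.0

    RUNS, BATCHES, BATCH, RATE = 500, 30, 200, 0.05
    early_stops = 0
    for _ in range(RUNS):
        conv_a = conv_b = 0
        for b in range(1, BATCHES + 1):
            conv_a += sum(random.random() < RATE for _ in range(BATCH))
            conv_b += sum(random.random() < RATE for _ in range(BATCH))
            n = b * BATCH
            if abs(z_stat(conv_a, n, conv_b, n)) > 1.96:   # |z| > 1.96 ~ p < 0.05
                early_stops += 1                           # declared a bogus winner
                break

    print(f"declared a 'winner' in {early_stops / RUNS:.0%} of A/A runs")

A fixed-horizon test would be fooled about 5% of the time; peeking after every batch gets fooled several times as often, which is the illusion the paper warns about.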
Four short links: 4 February 2014

UX Fundamentals, Mozilla Persona, Pi Tests, and The Holodeck

  1. UX Fundamentals, Crash Course — 31 posts introducing the fundamental practices and mindsets of UX.
  2. Why We Love Persona And You Should Too — Mozilla’s identity system is an interesting offering. Fancy that, you might have single sign-on without Single Pwn-On.
  3. Raspberry Pi As Test Harness — Pi accessory maker uses Pis to automate the testing of his … it’s Pis all the way down.
  4. The Holodeck Begins to Take Shape — displays, computation, and interesting input devices are coming together in various guises.

Upward Mobility: The Terror of iOS App Submission

Getting apps into the store is a non-deterministic process

One of the major topics of my Enterprise iOS book is how to plan release schedules around Apple’s peril-filled submission process. I don’t think you can count yourself a truly bloodied iOS dev until you’ve gotten your first rejection notice from iTunes Connect, especially under deadline pressure.

Traditionally, the major reason that an application would bounce was that the developer had been a Bad Person. They had grossly abused the Human Interface standards, or had a flaky app that crashed when the tester fired it up, or used undocumented internal system calls. In most cases, the rejection could have been anticipated if the developer had done their homework. There were occasional apps that got rejected for bizarre reasons, such as perceived adult content, or because of some secret Apple agenda, but they were the rare exception. If you followed the rules, your app would get in the store.

Read more…

Documentation as Testing

Can explanation contribute to technology creation?

“If you’re explaining, you’re losing.”

That gem of political wisdom has always been hard for me to take; after all, I make my living explaining technology. I don’t feel like I’m losing. And yet…

It rings true. It’s not that programs and devices shouldn’t need documentation, but rather that documentation is an opportunity to find out just how complex a tool is. The problem is less that documentation writers are losing when they’re explaining, and more that creators of software and devices are losing when they have to settle for “fix in documentation.”

I was delighted last week to hear from Doug Schepers of webplatform.org that they want to “tighten the feedback loop between specification and documentation to make the specifications better.” Documentation means that someone has read and attempted to explain the specification to a broader audience, and the broader audience can then try things out and add their own comments. Writing documentation with that as an explicit goal is a much happier approach than the usual perils of documentation writers, trapped explaining unfixable tools whose creators apparently never gave much thought to explaining them.

It’s not just WebPlatform.org. I’ve praised the Elixir community for similar willingness to listen when people writing documentation (internal or external) report difficulties. When something is hard to explain, there’s usually some elegance missing. Developers writing their own documentation sometimes find it, but it can be easier to see the seams when you aren’t the one creating them.
Read more…

What Developers Can Learn from Healthcare.gov

Remember, even a failure can serve as an example of what not to do

The first highly visible component of the Affordable Care Act launched this week, in the form of the healthcare.gov site. Theoretically, it allows citizens who live in any of the states that have chosen not to implement their own portal to get quotes and sign up for coverage.

I say theoretically because I’ve been trying to get a quote out of it since it launched on Tuesday, and I’m still trying. Every time I think I’ve gotten past the last glitch, a new one shows up further down the line. While it’s easy to write it off as yet another example of how the government (under any administration) seems to be incapable of delivering large software projects, there are some specific lessons that developers can take away.

Read more…

Four short links: 22 August 2013

Cryptanalysis Tools, Renaissance Hackers, MakerCamp Review, and Visual Regressions

  1. bletchley (Google Code) — Bletchley is currently in the early stages of development and consists of tools which provide: Automated token encoding detection (36 encoding variants); Passive ciphertext block length and repetition analysis; Script generator for efficient automation of HTTP requests; A flexible, multithreaded padding oracle attack library with CBC-R support. (A toy version of the repetition analysis follows this list.)
  2. Hackers of the Renaissance — Four centuries ago, information was as tightly guarded by intellectuals and their wealthy patrons as it is today. But a few episodes around 1600 confirm that the Hacker Ethic and its attendant emphasis on open-source information and a “hands-on imperative” was around long before computers hit the scene. (via BoingBoing)
  3. Maker Camp 2013: A Look Back (YouTube) — This summer, over 1 million campers made 30 cool projects, took 6 epic field trips, and met a bunch of awesome makers.
  4. huxley (GitHub) — Watches you browse, takes screenshots, tells you when they change. Huxley is a test-like system for catching visual regressions in Web applications. (via Alex Dong)
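
Huxley’s own interface isn’t reproduced here, but the heart of a visual-regression check like item 4 is small: diff the fresh screenshot against a stored baseline and hand any drift to a human. A minimal Python sketch assuming Pillow and two hypothetical files, baseline.png and current.png:

    # Flag visual regressions by pixel-diffing screenshots against a baseline.
    from PIL import Image, ImageChops

    def visual_regression(baseline_path, current_path, diff_path="diff.png"):
        baseline = Image.open(baseline_path).convert("RGB")
        current = Image.open(current_path).convert("RGB")
        if baseline.size != current.size:
            return "changed: screenshot dimensions differ"
        diff = ImageChops.difference(baseline, current)
        if diff.getbbox() is None:     # None means no differing pixels at all
            return "unchanged"
        diff.save(diff_path)           # keep the delta for human review
        return f"changed: see {diff_path}"

    print(visual_regression("baseline.png", "current.png"))

In practice you’d add a small per-pixel tolerance so font antialiasing doesn’t trip the alarm. The passive “repetition analysis” bletchley advertises in item 1 is also compact enough to sketch (this is a toy, not bletchley’s API): under ECB, identical plaintext blocks encrypt to identical ciphertext blocks, so repeats leak structure.

    # Toy passive repetition analysis: count duplicate ciphertext blocks,
    # which betray ECB mode or other structural leaks.
    from collections import Counter

    def repeated_blocks(ciphertext: bytes, block_size: int = 16) -> int:
        blocks = [ciphertext[i:i + block_size]
                  for i in range(0, len(ciphertext), block_size)]
        return sum(n - 1 for n in Counter(blocks).values() if n > 1)

    # Two identical 16-byte "ciphertext" blocks -> one repeat reported.
    print(repeated_blocks(b"A" * 16 + b"B" * 16 + b"A" * 16))  # -> 1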

NoSQL Choices: To Misfit or Cargo Cult?

Retreading old topics can be a powerful source of epiphany, sometimes more so than simple extra-box thinking. I was a computer science student; of course I knew statistics. But my recent years as a NoSQL (or better stated: distributed systems) junkie have irreparably colored my worldview, filtering every metaphor with a tinge of information management.

Lounging on a half-world plane ride has its benefits, namely, the opportunity to read. Most of my Delta flight from Tel Aviv back home to Portland lacked both wifi and (in my case) a workable laptop power source. So instead, I devoured Nate Silver’s book, The Signal and the Noise. When Nate reintroduced me to the concept of statistical overfit (and, relatedly, underfit), I could not help but consider these cases in light of the modern problem of distributed data management, namely, operators (you may call these operators DBAs, but please, not to their faces).

When collecting information, be it for a psychological profile of chimp mating rituals, or plotting datapoints in search of the Higgs Boson, the ultimate goal is to find some sort of usable signal, some trend in the data. Not every point is useful, and in fact, any individual could be downright abnormal. This is why we need several points to spot a trend. The world rarely gives us anything clearer than a jumble of anecdotes. But plotted together, occasionally a pattern emerges. This pattern, if repeatable and useful for prediction, becomes a working theory. This is science, and is generally considered a good method for making decisions.

On the other hand, when lacking experience, we tend to overvalue the experience of others when we assume they have more. This works in straightforward cases, like learning to cook a burger (watch someone make one, copy their process). It is less useful as similarities diverge. Watching someone make a cake won’t tell you much about the process of crafting a burger. Folks like to call this cargo cult behavior.

How Fit are You, Bro?

You need to extract useful information from experience (for which I’ll use the math-y sounding word datapoints). Having a collection of datapoints to choose from is useful, but that’s only one part of the process of decision-making. I’m not speaking of a necessarily formal process here, but, in the case of database operators, merely a collection of experience. Reality tends to be fairly biased toward facts (despite the desire of many people for this to not be the case). Given enough experience, especially if that experience is factual, we tend to make better and better decisions more in line with reality. That’s pretty much the essence of prediction. Our mushy human brains are more-or-less good at that, or at least better than other animals’. It’s why we have computers and Everybody Loves Raymond, and my cat pees in a box.

Imagine you have a sufficient number of relevant datapoints that you can plot on a chart. Assuming the axes have any relation to each other, and the data is sound, a trend may emerge, such as a line or some other bounding shape. A signal is relevant data that corresponds to the rules we discover by best fit. Noise is everything else. It’s somewhat circular-sounding logic, and it’s really hard to know what is really a signal. This is why science is hard, and so is choosing a proper database. We’re always checking our assumptions, and one solid counter-signal can really be disastrous for a model. We may have been wrong all along, just without enough data to see it. As Einstein famously said in response to the book 100 Authors Against Einstein: “If I were wrong, then one would have been enough!”

Database operators (and programmers forced to play this role) must make predictions all the time, against a seemingly endless series of questions. How much data can I handle? What kind of latency can I expect? How many servers will I need, and how much work to manage them?

So, like all decision-making processes, we refer to experience. The problem is, as our industry demands increasing scale, very few people actually have much experience managing giant-scale systems. We tend to draw our assumptions from our limited or biased smaller-scale experience and extrapolate outward. The theories we then concoct are not the optimal fit that we desire, but instead tend to be overfit.

Overfit is what happens when we have a limited amount of data and overstate its general implications. If we imagine a plot of likely failure scenarios against a limited number of servers, we may be tempted to believe our biggest odds of failure are insufficient RAM or disk failure. After all, my network has never given me problems, but I sure have lost a hard drive or two. We take these assumptions, which are only somewhat relevant to the realities of scalable systems, and divine some rules for ourselves that entirely miss the point.

[Figure: an overfitted curve chasing every datapoint vs. a proper fit through the trend]
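
The distinction is easy to make concrete, assuming NumPy: sample a noisy straight line, fit it once with the right model (degree 1) and once with far too many parameters (degree 9), then ask both for a prediction just outside the observed data.

    # Overfitting in miniature: the flexible model chases the noise and
    # falls apart the moment it has to extrapolate.
    import numpy as np

    rng = np.random.default_rng(42)
    x = np.linspace(0, 10, 12)
    y = 3 * x + 2 + rng.normal(0, 2, size=x.size)   # true signal: y = 3x + 2

    fit = np.polyfit(x, y, 1)       # proper fit: coefficients near [3, 2]
    overfit = np.polyfit(x, y, 9)   # overfit (NumPy may even warn about it)

    print("degree 1 at x=12:", np.polyval(fit, 12.0))       # sane, near 38
    print("degree 9 at x=12:", np.polyval(overfit, 12.0))   # typically wildly off

The degree-9 curve threads every datapoint, which is exactly why its rules entirely miss the point a server or two outside the data it has seen.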

In a real distributed system, network issues tend to consume most of our interest. Single-server consistency is a solved problem, and most (worthwhile) distributed databases have some sense of built-in redundancy (usually replication, the root of all distributed evil).
Read more…