- bletchley (Google Code) — Bletchley is currently in the early stages of development and consists of tools which provide: Automated token encoding detection (36 encoding variants); Passive ciphertext block length and repetition analysis; Script generator for efficient automation of HTTP requests; A flexible, multithreaded padding oracle attack library with CBC-R support.
- Hackers of the Renaissance — Four centuries ago, information was as tightly guarded by intellectuals and their wealthy patrons as it is today. But a few episodes around 1600 confirm that the Hacker Ethic and its attendant emphasis on open-source information and a “hands-on imperative” was around long before computers hit the scene. (via BoingBoing)
- Maker Camp 2013: A Look Back (YouTube) — This summer, over 1 million campers made 30 cool projects, took 6 epic field trips, and met a bunch of awesome makers.
- huxley (Github) — Watches you browse, takes screenshots, tells you when they change. Huxley is a test-like system for catching visual regressions in Web applications. (via Alex Dong)
ENTRIES TAGGED "testing"
Cryptanalysis Tools, Renaissance Hackers, MakerCamp Review, and Visual Regressions
Retreading old topics can be a powerful source of epiphany, sometimes more so than simple extra-box thinking. I was a computer science student, of course I knew statistics. But my recent years as a NoSQL (or better stated: distributed systems) junkie have irreparably colored my worldview, filtering every metaphor with a tinge of information management.
Lounging on a half-world plane ride has its benefits, namely, the opportunity to read. Most of my Delta flight from Tel Aviv back home to Portland lacked both wifi and (in my case) a workable laptop power source. So instead, I devoured Nate Silver’s book, The Signal and the Noise. When Nate reintroduced me to the concept of statistical overfit, and relatedly underfit, I could not help but consider these cases in light of the modern problem of distributed data management, namely, operators (you may call these operators DBAs, but please, not to their faces).
When collecting information, be it for a psychological profile of chimp mating rituals, or plotting datapoints in search of the Higgs Boson, the ultimate goal is to find some sort of usable signal, some trend in the data. Not every point is useful, and in fact, any individual could be downright abnormal. This is why we need several points to spot a trend. The world rarely gives us anything clearer than a jumble of anecdotes. But plotted together, occasionally a pattern emerges. This pattern, if repeatable and useful for prediction, becomes a working theory. This is science, and is generally considered a good method for making decisions.
On the other hand, when lacking experience, we tend to over value the experience of others when we assume they have more. This works in straightforward cases, like learning to cook a burger (watch someone make one, copy their process). This isn’t so useful as similarities diverge. Watching someone make a cake won’t tell you much about the process of crafting a burger. Folks like to call this cargo cult behavior.
How Fit are You, Bro?
You need to extract useful information from experience (which I’ll use the math-y sounding word datapoints). Having a collection of datapoints to choose from is useful, but that’s only one part of the process of decision-making. I’m not speaking of a necessarily formal process here, but in the case of database operators, merely a collection of experience. Reality tends to be fairly biased toward facts (despite the desire of many people for this to not be the case). Given enough experience, especially if that experience is factual, we tend to make better and better decisions more inline with reality. That’s pretty much the essence of prediction. Our mushy human brains are more-or-less good at that, at least, better than other animals. It’s why we have computers and Everybody Loves Raymond, and my cat pees in a box.
Imagine you have a sufficient amount of relevant datapoints that you can plot on a chart. Assuming the axes have any relation to each other, and the data is sound, a trend may emerge, such as a line, or some other bounding shape. A signal is relevant data that corresponds to the rules we discover by best fit. Noise is everything else. It’s somewhat circular sounding logic, and it’s really hard to know what is really a signal. This is why science is hard, and so is choosing a proper database. We’re always checking our assumptions, and one solid counter signal can really be disastrous for a model. We may have been wrong all along, missing only enough data. As Einstein famously said in response to the book 100 Authors Against Einstein: “If I were wrong, then one would have been enough!”
Database operators (and programmers forced to play this role) must make predictions all the time, against a seemingly endless series of questions. How much data can I handle? What kind of latency can I expect? How many servers will I need, and how much work to manage them?
So, like all decision making processes, we refer to experience. The problem is, as our industry demands increasing scale, very few people actually have much experience managing giant scale systems. We tend to draw our assumptions from our limited, or biased smaller scale experience, and extrapolate outward. The theories we then tend to concoct are not the optimal fit that we desire, but instead tend to be overfit.
Overfit is when we have a limited amount of data, and overstate its general implications. If we imagine a plot of likely failure scenarios against a limited number of servers, we may be tempted to believe our biggest odds of failure are insufficient RAM, or disk failure. After all, my network has never given me problems, but I sure have lost a hard drive or two. We take these assumptions, which are only somewhat relevant to the realities of scalable systems and divine some rules for ourselves that entirely miss the point.
In a real distributed system, network issues tend to consume most of our interest. Single-server consistency is a solved problem, and most (worthwhile) distributed databases have some sense of built in redundancy (usually replication, the root of all distributed evil).
Velocity 2013 Speaker Series
If you’re a System Administrator, you’re likely all too familiar with the 2:35am PagerDuty alert. “When you roll out testing on your infrastructure,” says Seth Vargo, “the number of alerts drastically decreases because you can build tests right into your Chef cookbooks.” We sat down to discuss his upcoming talk at Velocity, which promises to deliver many more restful nights for SysAdmins.
Key highlights from our discussion include:
- There are not currently any standards regarding testing with Chef. [Discussed at 1:09]
- A recommended workflow that starts with unit testing [Discussed at 2:11]
- Moving cookbooks through a “pipeline” of testing with Test Kitchen [Discussed at 3:11]
- In the event that something bad does make it into production, you can roll back actual infrastructure changes. [Discussed at 4:54]
- Automating testing and cookbook uploads with Jenkins [Discussed at 5:40]
You can watch the full interview here:
User research you can do now
There’s a lot of advice about how to do great user research. I have some pretty strong opinions about it myself.
But, as with exercise, the best kind of research is the kind that you actually DO.
So, in the interests of getting some good feedback from your users right now, I have some suggestions for Tiny Tests. These are types of research that you could do right this second with very little preparation on your part.
What is a Tiny Test?
Tiny Tests do not take a lot of time. They don’t take a lot of money. All they take is a commitment to learning something from your users today.
Pick a Tiny Test that applies to your product and get out and run one right now. Oh, ok. You can wait until you finish the post.
Dozens of companies now exist that allow you to run an unmoderated test in a few minutes. I’ve used UserTesting.com many times and gotten some great results really quickly. I’ve also heard good things about Loop11 and several others, so feel free to pick the one that you like best.
Regular Expressions, Mobile Diversions, UX Pitfalls, and DIY Keyboarding
- RE2: A Principled Approach to Regular Expressions — a regular expression engine without backtracking, so without the potential for exponential pathological runtimes.
- Mobile is Entertainment (Luke Wroblewski) — 79% of mobile app time is spent on fun, even as desktop web use is declining.
- Five UX Research Pitfalls (Elaine Wherry) — I live this every day: Sometimes someone will propose an idea that doesn’t seem to make sense. While your initial reaction may be to be defensive or to point out the flaws in the proposed A/B study, you should consider that your buddy is responding to something outside your view and that you don’t have all of the data.
- Building a Keyboard: Part 1 (Jesse Vincent) — and Part 2 and general musings on the topic of keyboards. Jesse built his own. Yeah, he’s that badass.
Two core Scala libraries support features for mocking and data generation.
Scala, a language designed for well-structured and readable programs, is richly provisioned with testing frameworks. The community has adopted test-driven development (TDD) and behavior-driven development (BDD) with zeal. These represent the baseline for trustworthy code development today.
TDD and BDD expand beyond the traditional model of incorporating a test phase into the development process. Most programmers know that ad hoc debugging is not sufficient and that they need to run tests on isolated functions (unit testing) to make sure that a change doesn’t break anything (regression testing). But testing libraries available for Scala, in supporting TDD and BDD, encourage developers to write tests before they even write the code being tested.
Tests can be expressed in human-readable text reminiscent of natural language (although you can’t stretch the comparison too far) so that you are documenting what you want your code to do while expressing the test that ensures that code ultimately will meet your requirements.
Highlights from our discussion include: Read more…
An interview with Shipping Greatness author Chris Vander Mey.
Chris Vander Mey, CEO of Scaled Recognition, and author of a new O’Reilly book, Shipping Greatness, lays out in this video some of the deep lessons he learned during his years working on some very high-impact and high-priority projects at Google and Amazon.
Chris takes a very expansive view of project management, stressing the crucial decisions and attitudes that leaders need to take at every stage from the team’s initial mission statement through the design, coding, and testing to the ultimate launch. By merging technical, organizational, and cultural issues, he unravels some of the magic that makes projects successful.
Smart data scientists can make big problems small.
Having worked in academia, government and industry, I’ve had a unique opportunity to build products in each sector. Much of this product development has been around building data products. Just as methods for general product development have steadily improved, so have the ideas for developing data products. Thanks to large investments in the general area of data science, many major innovations (e.g., Hadoop, Voldemort, Cassandra, HBase, Pig, Hive, etc.) have made data products easier to build. Nonetheless, data products are unique in that they are often extremely difficult, and seemingly intractable for small teams with limited funds. Yet, they get solved every day.
How? Are the people who solve them superhuman data scientists who can come up with better ideas in five minutes than most people can in a lifetime? Are they magicians of applied math who can cobble together millions of lines of code for high-performance machine learning in a few hours? No. Many of them are incredibly smart, but meeting big problems head-on usually isn’t the winning approach. There’s a method to solving data problems that avoids the big, heavyweight solution, and instead, concentrates building something quickly and iterating. Smart data scientists don’t just solve big, hard problems; they also have an instinct for making big problems small.
We call this Data Jujitsu: the art of using multiple data elements in clever ways to solve iterative problems that, when combined, solve a data problem that might otherwise be intractable. It’s related to Wikipedia’s definition of the ancient martial art of jujitsu: “the art or technique of manipulating the opponent’s force against himself rather than confronting it with one’s own force.”
How do we apply this idea to data? What is a data problem’s “weight,” and how do we use that weight against itself? These are the questions that we’ll work through in the subsequent sections.
Our children will improve upon the things we're building in ways we can't conceive.
Before you scoff at the pointlessness of yet another social network, web app, or project, remember that we don't always do the research or build the company that is immediately useful or profitable.
How a game-playing robot could help shape the future of mobile testing.
If you try to talk to Jason Huggins about Selenium, he'll probably do to you what he did to us. He'll bring his Arduino-based Angry Birds-playing testing robot to your interview and then he'll relate his invention to the larger problems of mobile application testing and cloud-based testing infrastructure.