Striking parallels between mathematics and software engineering

Becoming more familiar with mathematics helps cross-pollinate ideas between mathematics and software engineering.


Editor’s note: Alice Zheng will be part of the team teaching Large-scale Machine Learning Day at Strata + Hadoop World in San Jose. Visit the Strata + Hadoop World website for more information on the program.

During my first year in graduate school, I had an epiphany about mathematics that changed my whole perspective about the field. I had chosen to study machine learning, a cross-disciplinary research area that combines elements of computer science, statistics, and numerous subfields of mathematics, such as optimization and linear algebra. It was a lot to take in, and all of us first-year students were struggling to absorb the deluge of new concepts.

One night, I was sitting in the office trying to grok linear algebra. A wonderfully lucid textbook served as my guide: Introduction to Linear Algebra, written by Gilbert Strang. But I just wasn’t getting it. I was looking at various definitions — eigendecomposition, Jordan canonical forms, matrix inversions, etc. — and I thought, “Why?” Why does everything look so weird? Why is the inverse defined this way? Come to think of it, why are any of the matrix operations defined the way they are?

While staring at a hopeless wall of symbols, a flash of lightning went off in my mind. I had an insight: math is a design. Prior to that moment, I had approached mathematics as if it were universal truth: transcendent in its perfection, almost unknowable by mere mortals. But on that night, I realized that mathematics is a human-constructed tool. Math is designed, just like software programs are designed, and using many of the same design principles. These principles may not be apparent, but they are comprehensible. In that moment, mathematics went from being unknowable to reasonable.
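To make the design point concrete, consider the eigendecomposition A = QΛQ⁻¹. Treated as an interface, it makes other operations almost mechanical, including the inverse whose definition puzzled me: A⁻¹ = QΛ⁻¹Q⁻¹, so inverting a matrix reduces to inverting its eigenvalues. The snippet below is a minimal sketch of that idea in NumPy; the matrix is an arbitrary symmetric example (chosen so its eigenvalues are real), not anything from Strang’s textbook.

```python
import numpy as np

# An arbitrary symmetric matrix, so its eigenvalues are real.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

# Eigendecomposition: A = Q @ diag(vals) @ inv(Q)
vals, Q = np.linalg.eig(A)
A_rebuilt = Q @ np.diag(vals) @ np.linalg.inv(Q)

# The design payoff: the inverse falls out of the same interface,
# A^{-1} = Q @ diag(1/vals) @ inv(Q).
A_inv = Q @ np.diag(1.0 / vals) @ np.linalg.inv(Q)

assert np.allclose(A_rebuilt, A)
assert np.allclose(A_inv @ A, np.eye(2))
```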


Beyond lab folklore and mythology

What the future of science will look like if we’re bold enough to look beyond centuries-old models.


Editor’s note: this post is part of our ongoing investigation into synthetic biology and bioengineering. For more on these areas, download the latest free edition of BioCoder.

Over the last six months, I’ve had a number of conversations about lab practice. In one, Tim Gardner of Riffyn told me about a gene transformation experiment he did in grad school. As he was new to the lab, he asked two more experienced scientists for their protocol: one said it must be done exactly at 42°C for 45 seconds, the other said exactly 37°C for 90 seconds. When he ran the experiment, Tim discovered that the temperature actually didn’t matter much. A broad range of temperatures and times would work.

In an unrelated conversation, DJ Kleinbaum of Emerald Cloud Lab told me about students who would only use their “lucky machine” in their work. Why, given a choice of lab equipment, did one of two apparently identical machines give “good” results for some experiments, while the other one didn’t? Nobody knew. Perhaps it was the tubing that connects the machine to the rest of the experiment; perhaps it was some valve somewhere; perhaps it was some quirk of the machine’s calibration.

The more people I talked to, the more stories I heard: labs where the experimental protocols weren’t written down, but were handed down from mentor to student. Labs where there was a shared common knowledge of how to do things, but where that shared culture never made it outside, not even to the lab down the hall. There’s no need to write down or publish stuff that’s “obvious” or that “everyone knows.” To someone more familiar with literature than with biology labs, this behavior was immediately recognizable: we’re in the land of mythology, not science. Each lab has its own ritualized behavior that “works.” Whether it’s protocols, lucky machines, or common knowledge that’s picked up by every student in the lab (but which might not be the same from lab to lab), the process of doing science is an odd mixture of rigor and folklore. Everybody knows that you use 42°C for 45 seconds, but nobody really knows why. It’s just what you do.

Despite all of this, we’ve gotten fairly good at doing science. But to get even better, we have to go beyond mythology and folklore. And getting beyond folklore requires change: changes in how we record data, changes in how we describe experiments, and perhaps most importantly, changes in how we publish results.
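As a concrete example of what “changes in how we record data” might look like, consider a protocol captured as structured, machine-readable data rather than as oral tradition. The sketch below is hypothetical (the field names are invented for illustration, not taken from any real lab’s schema), but it shows how even the two competing heat-shock protocols from Tim’s story could be recorded side by side, published, and compared.

```python
from dataclasses import dataclass, asdict
import json

@dataclass
class HeatShockStep:
    """One step of a transformation protocol, recorded as data
    rather than handed down as folklore. Field names are
    illustrative, not a real lab schema."""
    temperature_c: float
    duration_s: float
    notes: str = ""

# The two "exact" protocols from the anecdote, side by side.
protocols = [
    HeatShockStep(temperature_c=42.0, duration_s=45.0, notes="mentor A"),
    HeatShockStep(temperature_c=37.0, duration_s=90.0, notes="mentor B"),
]

# Once protocols are data, they can be published, diffed,
# and compared across labs instead of memorized.
print(json.dumps([asdict(p) for p in protocols], indent=2))
```

Nothing about this is technically hard; the change is cultural: writing down what “everyone knows” in a form that can leave the lab.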


Understanding the blockchain

We must be prepared for the blockchain’s promise to become a new development environment.

Editor’s note: this post was originally published on the author’s website in three pieces: “The Blockchain is the New Database, Get Ready to Rewrite Everything,” “Blockchain Apps: Moving from the Jungle to the Zoo,” and “It’s Too Early to Judge Network Effects in Bitcoin and the Blockchain.” He has revised and adapted those pieces for this post.

There is no doubt that we are moving from a single cryptocurrency focus (bitcoin) to a variety of cryptocurrency-based applications built on top of the blockchain.

This article examines the impact of the blockchain on developers, the segmentation of blockchain applications, and the network effects factors affecting bitcoin and blockchains.

The blockchain is the new database — get ready to rewrite everything

The core concept behind the blockchain is similar to that of a database; what changes is how you interact with that database.

For developers, the blockchain represents a paradigm shift in how software engineers will write software applications in the future, and it is one of the key concepts that need to be well understood. We need to really understand five key concepts, and how they relate to one another, in the context of this new computing paradigm that is unfolding in front of us: the blockchain, decentralized consensus, trusted computing, smart contracts, and proof of work/stake. This computing paradigm is important because it is a catalyst for the creation of decentralized applications, a next-step evolution from distributed computing architectural constructs.
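To ground the database comparison, here is a toy sketch of the core data structure: an append-only chain of blocks, each committing to the hash of its predecessor. It deliberately leaves out decentralized consensus and proof of work, and the helper function is invented for illustration; it is not any real client’s API.

```python
import hashlib
import json
import time

def make_block(data, prev_hash):
    """Build a block whose hash commits to its contents and to
    the hash of the previous block. Illustrative only."""
    block = {
        "timestamp": time.time(),
        "data": data,
        "prev_hash": prev_hash,
    }
    payload = json.dumps(block, sort_keys=True).encode()
    block["hash"] = hashlib.sha256(payload).hexdigest()
    return block

# A tiny chain: each block points at its predecessor's hash.
genesis = make_block({"note": "genesis"}, prev_hash="0" * 64)
block1 = make_block({"from": "alice", "to": "bob", "amount": 5}, genesis["hash"])
block2 = make_block({"from": "bob", "to": "carol", "amount": 2}, block1["hash"])

assert block2["prev_hash"] == block1["hash"]
```

Because every hash covers the previous block’s hash, rewriting any historical record changes its hash and breaks every link after it, which is the tamper-evidence that makes the chain usable as a shared database without a central owner.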


DevOps keeps it cool with ICE

How inclusivity, complexity, and empathy are shaping DevOps.


Over the next five years, three ideas will be central to DevOps: the need for the DevOps community to become more Inclusive; the realization that increasing Complexity of systems is the underlying reason for DevOps; and the critical role of Empathy in the growth and adoption of DevOps. Channeling John Willis, I’ll coin my own DevOps acronym, ICE, which is shorthand for Inclusivity, Complexity, Empathy.

Inclusivity

There is a major expansion of the DevOps community underway, and it’s taking DevOps far beyond its roots in agile systems administration at “unicorn” companies (e.g., Etsy or Netflix). For instance, a significant majority (80–90%) of participants at the Ghent conference were first-time attendees, and this was also the case for many of the devopsdays in 2014 (NYC, Chicago, Minneapolis, Pittsburgh, and others). Moreover, although areas outside development and operations were still underrepresented, there was a more even split between developers and operations folks than at previous events. It’s also not an accident that the DevOps Enterprise conference took place the week prior to the fifth-anniversary devopsdays and included talks about the DevOps journeys at large “traditional” organizations like Blackboard, Disney, GE, Macy’s, Nordstrom, Raytheon, Target, UK.gov, US DHS, and many others.

The DevOps community has always been open and inclusive, and that’s one of the reasons why in the five years since the word “DevOps” was coined, no single, widely accepted definition or practice has emerged. The lack of definition is more of a blessing than a curse, as DevOps continues to be an open conversation about ways of making our organizations better. Within the DevOps community, old-time practitioners and “newbies” have much to learn from each other.



Announcing BioCoder issue 6

BioCoder 6: iGEM's first Giant Jamboree, an update from the #ScienceHack Hack-a-thon, the Open qPCR project, and more.

Today, we’ve released the 6th issue of BioCoder. There’s a lot of great content, including a report from iGEM’s first Giant Jamboree, and an update from the #ScienceHack Hack-a-thon. We’ve also got a report on the Open qPCR project, which reduces the cost of real-time PCR by a factor of 10, and an article about bringing microfluidics into the DIY lab. There’s nothing more disruptive than taking exotic and expensive techniques and putting them in the hands of experimenters.

Once again, we’re interested in your ideas and in new content, so if you have an article or a proposal for an article, send it in to BioCoder@oreilly.com. We’re very interested in what you’re doing. There are many, many fascinating projects that aren’t getting media attention. We’d like to shine some light on those. If you’re running one of them — or if you know of one, and would like to hear more about it — let us know. We’d also like to hear more about exciting start-ups. Who do you know that’s doing something amazing? And if it’s you, don’t be shy: tell us.

Above all, don’t hesitate to spread the word. BioCoder was meant to be shared. Our goal with BioCoder is to be the nervous system for a large and diverse but poorly connected community. We’re making progress, but we need you to help make the connections.


A brief look at data science’s past and future

In this O'Reilly Data Show Podcast: DJ Patil weighs in on a wide range of topics in data science and big data.

Back in 2008, when we were working on what became one of the first papers on big data technologies, one of our first visits was to LinkedIn’s new “data” team. Many of the members of that team went on to build interesting tools and products, and team manager DJ Patil emerged as one of the best-known data scientists. I recently sat down with Patil to talk about his new ebook (written with Hilary Mason) and other topics in data science and big data.

Subscribe to the O’Reilly Data Show Podcast

iTunes, SoundCloud, RSS

Here are a few of the topics we touched on:

Proliferation of programs for training and certifying data scientists

Patil and I are both ex-academics who learned “data science” in industry. In fact, until a few years ago, one acquired data science skills via “on-the-job training.” But a new job title that catches on usually leads to an explosion of programs (I was around when master’s programs in financial engineering took off). Are these programs the right way to acquire the necessary skills?
