Editor’s note: Alice Zheng will be part of the team teaching Large-scale Machine Learning Day at Strata + Hadoop World NYC 2015. Visit the Strata + Hadoop World website for more information on the program.
During my first year in graduate school, I had an epiphany about mathematics that changed my whole perspective about the field. I had chosen to study machine learning, a cross-disciplinary research area that combines elements of computer science, statistics, and numerous subfields of mathematics, such as optimization and linear algebra. It was a lot to take in, and all of us first-year students were struggling to absorb the deluge of new concepts.
One night, I was sitting in the office trying to grok linear algebra. A wonderfully lucid textbook served as my guide: Introduction to Linear Algebra, written by Gilbert Strang. But I just wasn’t getting it. I was looking at various definitions (eigendecomposition, Jordan canonical forms, matrix inversions, and so on) and I thought, “Why?” Why does everything look so weird? Why is the inverse defined this way? Come to think of it, why are any of the matrix operations defined the way they are?
While I was staring at a hopeless wall of symbols, a flash of lightning went off in my mind. I had an insight: math is a design. Prior to that moment, I had approached mathematics as if it were universal truth: transcendent in its perfection, almost unknowable by mere mortals. But on that night, I realized that mathematics is a human-constructed tool. Math is designed, just as software programs are designed, and with many of the same design principles. These principles may not be apparent, but they are comprehensible. In that moment, mathematics went from being unknowable to reasonable.
Mathematics is a system of objects, operations, and shorthand representations. It is designed to model real-world phenomena. As with all designs, there are certain degrees of freedom. The system could have been constructed in one way, or another. A matrix could have been designed as a round ball, in polar coordinates. It doesn’t matter, as long as the operations are consistent; it’s just a shorthand. At some point, someone made those design decisions. They picked the objects and the operations, and laid down rules of organization. From these fundamental decisions, if they are made well, a number of other useful, provable properties follow, and the whole thing can be used to model the things that we experience in the real world: the way that a tossed ball travels through space, the way sound waves dash across the ether, the rise and fall of stock prices. Physical reality contains layer upon layer of complexity. Well-designed mathematical systems offer clean and concise tools to represent physical reality at every layer.
Linear algebra is designed to represent systems of linear equations. Linear equations are designed to represent linear relationships, where one entity is written as a sum of multiples of other entities. In the shorthand of linear algebra, a linear relationship is represented as a linear operator: a matrix. Linear operators are made to be simple, so that their effects can be completely analyzed. Geometrically, they do essentially two things: rotation and scaling. Here, at the border of algebra and geometry, a bit of magic happens. The algebraic operations of multiplication and addition translate to rotation and scaling of vectors in vector space. This allows us to analyze the geometric effects of a linear operator using algebra, breaking down the matrix into its constituent parts: how much rotation, how much stretching or compression, and in which directions. The construction of linear algebra serves as an example of a pattern that is common throughout mathematics: out of a few objects and a set of constrained operations, there arise powerful properties that enable better understanding of the structure of the problem as well as efficient solutions.
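To make the link between algebra and geometry concrete, here is a minimal sketch in plain Python. The helper names (`matvec`, `rotation`, `scaling`) and the 2x2 examples are illustrative assumptions, not anything from a particular library. It shows that nothing but multiplication and addition of numbers is needed to rotate or stretch a vector.

```python
import math

def matvec(m, v):
    """Multiply a 2x2 matrix (a list of two rows) by a 2-vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

def rotation(theta):
    """Matrix that rotates vectors counterclockwise by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def scaling(sx, sy):
    """Matrix that stretches the x-axis by sx and the y-axis by sy."""
    return [[sx, 0.0], [0.0, sy]]

# A quarter turn sends the x unit vector to (approximately) the y unit vector.
rotated = matvec(rotation(math.pi / 2), [1.0, 0.0])

# Scaling stretches each coordinate independently.
scaled = matvec(scaling(2.0, 3.0), [1.0, 1.0])
```

Because of floating-point rounding, the first coordinate of `rotated` is a tiny number rather than exactly zero, so comparisons should use a tolerance.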
The design of mathematics encompasses a number of other principles that are also present in software engineering. Take abstract algebra, for instance. Abstract algebra is essentially an exercise in object hierarchy design, where the goal is to use as few ingredients as possible, adding one more ingredient at a time, to see what kinds of interesting and useful constructs we can get. A group is defined as a set of elements together with an operation (which has to satisfy a few conditions to ensure that its behavior isn’t too weird). A ring is a special type of group endowed with a second operation; its two operations can be considered generalizations of addition and multiplication. A field is a specialized ring with four operations (generalizations of addition, subtraction, multiplication, and division). This should immediately start to sound familiar to a software engineer: it’s a hierarchy of objects, where a Field inherits from a Ring which inherits from a Group!
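The inheritance analogy can be sketched directly in code. The following is a loose illustration rather than a rigorous algebra library: the class and method names are hypothetical, the elements are integers modulo `modulus`, and the `Field` class assumes the modulus is prime (otherwise some nonzero elements would have no inverse).

```python
class Group:
    """Integers mod `modulus` with one operation: addition."""

    def __init__(self, modulus):
        self.modulus = modulus

    def add(self, a, b):
        return (a + b) % self.modulus

    def neg(self, a):
        # Additive inverse: add(a, neg(a)) is 0.
        return (-a) % self.modulus


class Ring(Group):
    """A group that gains a second operation: multiplication."""

    def mul(self, a, b):
        return (a * b) % self.modulus


class Field(Ring):
    """A ring where subtraction and division also work.

    Assumes `modulus` is prime, so every nonzero element is invertible.
    """

    def sub(self, a, b):
        return self.add(a, self.neg(b))

    def inv(self, a):
        if a % self.modulus == 0:
            raise ZeroDivisionError("zero has no multiplicative inverse")
        # Fermat's little theorem: a^(p-2) is the inverse of a mod a prime p.
        return pow(a, self.modulus - 2, self.modulus)

    def div(self, a, b):
        return self.mul(a, self.inv(b))


F7 = Field(7)
quotient = F7.div(3, 5)  # the x for which 5 * x = 3 (mod 7)
```

Each subclass adds exactly one new ingredient on top of what it inherits, which mirrors the "one more ingredient at a time" approach described above.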
Let’s now take a look at the real number system. This is another example of hierarchical object design, but with some interesting twists in the details. We start with the natural numbers, which are a natural extension of our fingers. Next, we add their mirror image across the zero-divide: the negative numbers. This gives us the integers, which, when combined with the operations of addition and multiplication, form a ring as defined above. Throw in the operation of division, and we get the rational numbers, which now form a field. The story might end here, and we’d all be happy: we have a bunch of numbers and a bunch of operations, and we can apply those operations to those numbers and still end up with the same set of numbers. Hooray! Unfortunately, at this point our geometer neighbor knocks on our door and asks, “What about the area of a circle or the hypotenuse of a right triangle? Those don’t seem to be a ratio between any two integers.”
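The idea of applying operations and staying inside the same set of numbers can be checked with Python's standard `fractions` module. This is a small sketch with arbitrarily chosen values: dividing two integers can leave the integers, but the result is still a rational number.

```python
from fractions import Fraction

a, b = Fraction(3), Fraction(4)

# 3 / 4 is not an integer, but it is still a ratio of two integers.
quotient = a / b

# All four operations on rationals produce rationals again.
results = [a + b, a - b, a * b, a / b]
all_rational = all(isinstance(r, Fraction) for r in results)
```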
This discovery opens the floodgates of irrational numbers and throws a monkey wrench into our neat and orderly design. Both the rationals and irrationals are useful numbers, and it would be nice to have a representation that can handle both. But they are such different beasts, all due to a pesky notion called infinity: the decimal expansion of an irrational number never ends and never repeats, which makes the irrationals much trickier to pin down. The rationals are orderly and countable, whereas there are uncountably many irrationals. It took us a few thousand years to figure out a solution. Right now, our best proposal is to represent everything as a limit: every real number can be thought of as an equivalence class of sequences of rationals that have the same destination at infinity. This structure is hierarchical, and most of its elements are in fact not computable by any Turing machine. In the end, one might say that there is nothing real about the real numbers: it’s all a construction!
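As a rough illustration of "a real number as a limit," the sketch below builds a sequence of rationals heading toward the square root of 2, the hypotenuse of a right triangle with unit legs. The function name is hypothetical, and Newton's iteration is just one convenient way to generate such a sequence; it is not part of the formal construction of the reals.

```python
from fractions import Fraction

def sqrt2_approximations(steps):
    """Yield rational approximations of sqrt(2) using x -> (x + 2/x) / 2."""
    x = Fraction(1)
    for _ in range(steps):
        x = (x + 2 / x) / 2
        yield x

approximations = list(sqrt2_approximations(5))

# How far each approximation's square is from 2. Every term is an exact
# rational, so the error shrinks quickly but never reaches zero.
errors = [abs(x * x - 2) for x in approximations]
```

The first few terms are 3/2, 17/12, and 577/408. No single term is the square root of 2; in the limit-based construction, it is the sequence as a whole (together with every other rational sequence heading to the same place) that stands for the number.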
These are but a few examples of mathematical design at work. Our culture instills the strange notion that “math is hard.” Math is seen as too abstract, too impenetrable, too difficult to digest and impossible to know. But from an alternative perspective, mathematics contains striking parallels with software engineering. Both disciplines are heavy on jargon and notation. But once we parse through the jargon, we can begin to see the flesh and bones of mathematics. Understanding the design principles within mathematics provides us with an inlet into this strange land of hierarchical objects and changing representations. By becoming more familiar with the landscape of mathematics, we can help with the cross-pollination of ideas between mathematics and software engineering. Maybe we can even begin to make modifications and come up with new designs of mathematics. Hey, that real number system is getting pretty old and cumbersome. Ready for something new?