Pick your auto-scaling metrics wisely
A while back, I wrote up some notes on the use of feedback control in auto-scaling server instances in a data center. Afterwards, a reader contacted me to ask whether the article didn’t “boil down to ‘pick your auto-scaling metrics wisely’?”
That’s exactly right!
Feedback control indeed “boils down” to picking the appropriate metric when making decisions. The whole idea behind feedback control is to base actions on the actual behavior (the “output”) of the system, and not merely on its operating conditions (the “input”). Conceptually, that’s all there is to it.
This is nevertheless a big deal, for two reasons:
Favoring behavior over environment
In a series of posts (Part 1, Part 2, Part 3, Part 4, Part 5, and Part 6), we have introduced the idea of feedback control as a way to keep complex systems on track, even when subject to uncertainty and change.
It is easy to be confused at this point, and to think that feedback is nothing more than an “adaptive system” that can modify its behavior in response to changes in its environment. But that would not be right. It depends on what quantity you are monitoring! A feedback system does not respond to changes in the environment—a feedback system changes specifically in response to changes in its own behavior. That’s a big difference.
Choosing appropriate values
In the last post, we introduced the PID controller for use in feedback loops:
\[ u(t) = k_p\, e(t) + k_i \int_0^t e(\tau)\, d\tau + k_d\, \frac{de(t)}{dt} \]
or, in a discrete-time software implementation:
sum += error    # accumulate the error for the integral term
output = kp * error + DT * ki * sum + kd * (error - prev) / DT
prev = error    # remember the error for the next derivative estimate
We also mentioned that the controller “gains” kp, ki, and kd are used to adapt the controller to the specifics of its operating environment. As an example we used a cache: its output (the metric that we want to control) is a value in the range of 0.0 to 1.0, but its input (the quantity that the controller needs to calculate) is a possibly large integer. We must choose values for the controller gains that translate between these numerical ranges.
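To make the role of the gains concrete, here is a minimal sketch that wraps the update rule in a small Python class. The class name `PidController`, the method name `update`, and the gain values are assumptions chosen for illustration; they are not taken from the original series.

```python
class PidController:
    """Discrete-time PID controller using the update rule from the text."""

    def __init__(self, kp, ki, kd, dt=1.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.dt = dt
        self.total = 0.0  # running sum of errors (integral term)
        self.prev = 0.0   # error seen in the previous step

    def update(self, error):
        """Return the controller output for the current tracking error."""
        self.total += error
        output = (self.kp * error
                  + self.dt * self.ki * self.total
                  + self.kd * (error - self.prev) / self.dt)
        self.prev = error
        return output


# Illustrative gains only: they map a hit-rate error (a fraction between
# 0.0 and 1.0) onto a change measured in cache entries.
controller = PidController(kp=100.0, ki=50.0, kd=0.0, dt=1.0)
first = controller.update(0.2)   # proportional 20 + integral 10
second = controller.update(0.1)  # proportional 10 + integral 15
```

With these made-up gains, an error of 0.2 in the hit rate becomes an output of about 30. The gains carry the scale conversion between the two numerical ranges.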
Exploring the PID controller
In the previous parts of this series (Part 1, Part 2, Part 3, and Part 4), we introduced feedback as a design principle or paradigm that can help to keep systems “on track”, even in the presence of uncertainty and change. In this post, we will begin to explore more closely what this all means in practice.
Consider the feedback loop shown in the Figure. The controlled system is a cache, and we have a controller that adjusts the size of the cache in order to maintain a desired cache hit rate. (Making the cache larger will result in a greater number of hits and hence will drive the hit rate up.) We also have a desired value for the hit rate as reference value or “setpoint” (supplied on the left). The tracking error is calculated as the difference between setpoint and actual hit rate and is provided as input to the controller.
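The loop described here can be sketched in a few lines of Python. Everything specific in this sketch is an assumption made for illustration: the toy model `toy_hit_rate` (a real cache's hit rate depends on its workload), the integral-only controller, and the gain `ki=200.0`.

```python
def toy_hit_rate(size):
    """Hypothetical cache model: hit rate grows with size, saturating below 1.0."""
    return size / (size + 100.0)


def run_loop(setpoint, ki, steps):
    """Close the loop: measure the hit rate, compute the error, resize the cache."""
    total = 0.0  # cumulative tracking error
    size = 0     # cache size, the controlled system's input
    history = []
    for _ in range(steps):
        hit_rate = toy_hit_rate(size)     # actual behavior (the output)
        error = setpoint - hit_rate       # tracking error
        total += error
        size = max(0, round(ki * total))  # integral-only controller output
        history.append((size, hit_rate))
    return history


history = run_loop(setpoint=0.8, ki=200.0, steps=200)
final_size, final_hit_rate = history[-1]
```

In this toy model a hit rate of 0.8 corresponds to a cache size of 400, and the loop settles close to that value without ever being told the number. It only sees the tracking error.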
Doing away with heuristic constants
In our last post, we pointed out that feedback is different from common algorithmic thinking. In the current post, we will discuss these differences in more detail.
Typical algorithms tend to be deterministic, and are grounded in the assumption that all possible outcomes can be enumerated: “Take the middle element in the array. If it is less than or equal to the pivot, do this; otherwise, do that.” Control systems built with this mindset tend to be rule-based and heuristic: “Every day at 10am, spin up 15 more servers, then take them down again at 4pm.”
Such systems tend to suffer from two problems, both of which stem from the same underlying cause, namely the fact that the controller is fixed (deterministic) and does not take the actual state of the system into account.
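A small sketch can show one way a fixed rule goes wrong when reality departs from the plan. The capacity figure, the load spike, the utilization target, and the gain below are all made-up values for illustration, and the function names are hypothetical.

```python
CAPACITY = 100.0  # requests per second that one server can handle (assumed)


def scheduled_servers(hour, base=10, extra=15):
    """Rule-based: the fixed schedule from the text, blind to the actual load."""
    return base + extra if 10 <= hour < 16 else base


def utilization(load, servers):
    """Fraction of total capacity in use; values above 1.0 mean overload."""
    return load / (servers * CAPACITY)


def feedback_step(servers, measured, target=0.6, gain=20.0):
    """Feedback: nudge the server count by the measured utilization error."""
    return max(1, round(servers + gain * (measured - target)))


load, hour = 1800.0, 20          # a spike at 8pm, outside the scheduled window
fixed = scheduled_servers(hour)  # the schedule stays at the base count
servers = fixed
for _ in range(10):
    servers = feedback_step(servers, utilization(load, servers))

fixed_util = utilization(load, fixed)       # far above 1.0: overloaded
feedback_util = utilization(load, servers)  # settles near the target
```

The schedule is correct only as long as the load follows the schedule. The feedback version never needs to know the time of day, because it reacts to the measured utilization.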
Maintaining a desired behavior
In two previous posts (Part 1 and Part 2) we introduced the idea of feedback control. The basic idea is that we can keep a system (any system!) on track, by constantly monitoring its actual behavior, so that we can apply corrective actions to the system’s input, to “nudge” it back on target, if it ever begins to go astray.
This raises the question: why should we, as programmers, software engineers, and system administrators, care? What’s in it for us?
Gracefully maintain a desired value in the presence of uncertainty and change
In a previous post, we introduced the basic feedback concept. Now it is time to take a closer look at this idea.
Feedback is a method to keep systems on track. In other words, feedback is a way to make sure a system behaves in the desired fashion. If we have some quality-of-service metric in mind, then feedback is a reliable method to ensure that our system will achieve and maintain the desired value of this metric, even in the presence of uncertainty and change.
Getting this balance just right
Feedback is the very simple idea that you can control a complex system through the constant application of small corrections, which are applied to “nudge” the system towards its ideal operating point.
This idea is at the same time obvious and strangely at odds with common practice—and for good reason. On the one hand, it is obvious: monitor what the system is doing, and then keep nudging it back if it is beginning to go astray. That’s what human operators do: if you are driving your car, you constantly apply small steering corrections to keep the car on the road, and you keep adjusting the speed, in relation to the vehicle ahead of you and the speed limit.