In the following interview, PalominoDB owner and CEO Laine Campbell discusses advantages and disadvantages of using Amazon Web Services (AWS) as a platform for running MySQL. The solution provides a functional environment for young startups who can’t afford a database administrator (DBA), Campbell says, but there are drawbacks to be aware of, such as a lack of access to your database’s file system, and troubleshooting “can get quite hairy.” This interview is a sneak preview to Campbell’s upcoming Velocity session, “Using Amazon Web Services for MySQL at Scale.”
Why is AWS a good platform for scaling MySQL?
Laine Campbell: The elasticity of Amazon’s cloud service is key to scaling on most tiers in an application’s infrastructure, and this is true with MySQL as well. Concurrency is a recurring pattern with MySQL’s scaling capabilities, and as traffic and concurrent queries grow, one has to introduce some fairly traditional scaling patterns. One such pattern is adding replicas to distribute read I/O and reduce contention and concurrency, which is easy to do with rapid deployment of new instances and Elastic Block Storage (EBS) snapshots.
Additionally, sharding can be done with less impact via EBS snapshots being used to recreate the dataset, and then data that is not part of the new shard is removed. Amazon’s relational database service for MySQL—RDS—is also a new, rather compelling scaling pattern for the early stages of a company’s life, when resources are scarce and administrators have not been hired. RDS is a great pattern for people to emulate in terms of rapid deployment of replicas, ease of master failovers, and the ability to easily redeploy hosts when errors occur, rather than spending extensive time trying to repair or clean up data.
What is the advantage of a hosted environment, of opting to use a database-as-a-service (DBaaS) option?
Laine Campbell: As I mentioned, one of the key advantages of database-as-a-service is elimination of the commoditized components of database administration. This tends to be provisioning, backup/recovery and basic infrastructure setups around failovers. In early stages, startups may be cash strapped and and unable to afford the required compensation for an experienced DBA. There still are non-commoditized components of database architecture to consider: SQL tuning, data modeling, architectural considerations around scale, data management, and more. These still must be taken into account, but the DBaaS can allow companies to use senior consultants for that, without having to distract them with the more commoditized components.
There are very interesting financial maneuvers that can be done by using hosted and service-based approaches as well. An AWS service comes out of operational expenses, not capital, which can increase financial options tremendously early on. It’s all a matter of scale. When a startup is young, buying power is not leveraged by huge purchases. A company can find the costs of sysadmins, hardware, facilities, network, and DBAs to be out of reach, or at least daunting. As they grow, those same companies may find that these same options prove too expensive, but even that is a generalization, and many large companies with tremendous spending power prefer the elasticity of cloud-based services. This is highly dependent on the business model, of course. Companies whose products or services rely on cycles (such as education, retail, and marketing) may need huge amounts of resources for short periods of time. In a traditional hardware model, that hardware will sit idle most of the year. In a cloud model, one can simply decommission unneeded resources and prepare for the next big surge.
What are the options for running MySQL on AWS, and their advantages and disadvantages?
Laine Campbell: MySQL can be run on Elastic Compute Cloud (EC2) and EBS for persistent storage or as a service with RDS. RDS eliminates a lot of DBA tasks and provides a consistent, stable environment and ease of use for many regular operations. This can be quite nice in small shops without DBAs. There are disadvantages, however. For example, you are not given the ability to do advanced topologies that might be used for scaling purposes or for rolling migrations and creative downtime elimination measures. Examples of this are tiered replication and selective replication. You also do not have access to your database’s file system, which means if you need to get data in and out of RDS, you are limited to mysqldump or custom, parallelized extract, transform and load (ETL) jobs that can be very costly in terms of time for extraction and loading, and custom development time to minimize downtime during a migration.
Troubleshooting issues can get quite hairy as well. You don’t have access to the operating system; to network cards; or to utilities such as top, sar, vmstat, and much more. Your access to logs through APIs eliminates the use of command line file manipulation utilities such as sed, awk and grep, as well as more custom tools such as pt-query-digest or Palomino’s SQLCanon for SQL logs. Additionally, RDS is not cross region, meaning you must stick to one AWS region, and thus your DB tier is limited, practically, to a 99.5% SLA, which means your overall system must be architected very well to keep a higher SLA or you can expect lower aggregate SLAs for your entire application’s uptime.
EC2 gives you much more flexibility. For instance, you can do rolling upgrades via tiered replication. You have access to the OS; to disk utilities; to SAR; and to network data via tee, tcpdump, and more. This means your DBAs are able to identify problems much faster and be more confident in outages and incidents. You can set up cross-region replication, and you don’t have to worry about all of the “gotchas” that come from not having privileged access to MySQL or the OS. Sounds great right? The disadvantage is that you have to code your own provisioning, backups, and failover. This is not rocket science, of course, and can be done in Chef, Puppet, Ansible, or your own scripting. But it does take resources, practice, time and focus away from feature development and more strategic work. That being said, it’s a large initial effort, but then you have an environment you control. That can be quite compelling.
How big of an issue is managing costs in AWS, and how would you advise those considering this option?
Laine Campbell: It’s a tremendous issue. People are spinning up instances for testing, and then they forget to remove them. Over-allocating provisioned IOPs can be a huge waste of cost. When people are working on their own iron, the money has been spent. They have to think about capacity, but they don’t need to be as aware of it as someone using monthly cost, operationally expensed infrastructures. Not only do you have to worry about not enough capacity, you have to make sure you’re not so over capacity that you are wasting money, and you have to have the ability and process to scale down as well as up.
One ridiculously nice feature in an elastic cloud is overallocation. Let’s say you have a new release. Marketing can’t give you info on traffic or numbers; devs can’t give you SQL to review because it’s in an ORM and you’re only being told about the release the day before because you have to do some table ALTERs. What do you do? Well, the nice thing is you can launch five more replicas, up your PIOPS, and pray it’s enough. You’d never be able to do that with a hosted provider who was giving you real iron, or your own datacenter. But, once the release is done, you’ve tuned the bad SQL, the marketing people admit they overestimated the number of hits, you’ve had a nap, and you need to remember to scale that stuff back down.
You also need to plan things out over a period of years. Reserved instances save you huge amounts of money, but they can lock you into having instances longer than you need them. This is particularly true in RDS. In EC2, you can at least use a reserved instance for something other than MySQL if you find yourself needing fewer database instances after major tuning and re-architecture. With an understanding of the marketplace with AWS, you can sell reserved instances. And finally, you need to consider spot instances and the savings involved there for non-critical systems such as development, QA, and stage. It’s a lot!
What superpower will people have after attending your session?
Laine Campbell: The ability to end any conversation via one awkward sentence. OK, really, that’s my super power, but it may very well be transmitted via osmosis. I’d say the superpower will be focused on clairvoyance—knowing many of the risks and choices that will lie ahead of them in their wonderful, magical journeys running MySQL in AWS. And the awkward thing; that’s really not a joke.
This interview was edited and condensed.