Amazon announced a new SLA for EC2, similar to the one for S3. This is a notable step for Amazon and cloud computing as a whole, as it establishes a new bar for utility computing services.
Amazon is committing to 99.95% availability for the EC2 service on a yearly basis, which corresponds to approximately four hours and twenty-three minutes of downtime per year. It’s important to remember that an SLA is just a contract: it commits the provider to a certain level of performance and specifies some form of compensation when the provider fails to meet it.
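The downtime figure follows directly from the percentage. Here is a minimal sketch of the arithmetic in Python; the function name `downtime_budget` is made up for illustration, and a 365-day year is assumed.

```python
from datetime import timedelta


def downtime_budget(availability_pct: float, days: int = 365) -> timedelta:
    """Downtime allowed over `days` days at the given availability percentage."""
    period = timedelta(days=days)
    # The allowed downtime is the fraction of the period NOT covered by the commitment.
    return period * (1 - availability_pct / 100)


budget = downtime_budget(99.95)
print(budget)  # a little under 4 hours 23 minutes
```

Plugging in other percentages shows how quickly the budget shrinks or grows: each additional "nine" cuts the allowed downtime by a factor of ten.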
Here’s the summary of the EC2 SLA (emphasis added):
AWS will use commercially reasonable efforts to make Amazon EC2 available with an Annual Uptime Percentage (defined below) of at least 99.95% during the Service Year. In the event Amazon EC2 does not meet the Annual Uptime Percentage commitment, you will be eligible to receive a Service Credit as described below. [...]
- “Annual Uptime Percentage” is calculated by subtracting from 100% the percentage of 5 minute periods during the Service Year in which Amazon EC2 was in the state of “Region Unavailable.” If you have been using Amazon EC2 for less than 365 days, your Service Year is still the preceding 365 days but any days prior to your use of the service will be deemed to have had 100% Region Availability [...]
- “Unavailable” means that all of your running instances have no external connectivity during a five minute period and you are unable to launch replacement instances. [...]
To receive a Service Credit, you must submit a request by sending an e-mail message to aws-sla-request @ amazon.com. To be eligible, the credit request must [...] include your server request logs that document the errors and corroborate your claimed outage (any confidential or sensitive information in these logs should be removed or replaced with asterisks)
This new SLA does not appear to address the reliability of server instances individually or in aggregate. For example, if half of a customer’s EC2 instances lose their connections or die every 6 minutes, EC2 would still be considered “available” even if it is essentially unusable.
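To make the gap concrete, here is a rough sketch of how the quoted definitions could be read. The `Period` record, its fields, and both function names are hypothetical. The logic is a simplified interpretation of the excerpt above, not Amazon's actual measurement method, and five-minute periods with no record are assumed to count as available.

```python
from dataclasses import dataclass

# Number of 5-minute periods in a 365-day Service Year.
PERIODS_PER_YEAR = 365 * 24 * 12


@dataclass
class Period:
    running: int        # instances the customer had running in this period
    disconnected: int   # how many of those had no external connectivity
    can_launch: bool    # whether replacement instances could be launched


def is_unavailable(p: Period) -> bool:
    # Simplified reading of the quoted definition: ALL running instances lack
    # external connectivity AND replacements cannot be launched.
    return p.running > 0 and p.disconnected == p.running and not p.can_launch


def annual_uptime_pct(periods, total_periods: int = PERIODS_PER_YEAR) -> float:
    # Subtract from 100% the percentage of periods that count as unavailable.
    bad = sum(1 for p in periods if is_unavailable(p))
    return 100.0 * (1 - bad / total_periods)


# Half of ten instances are down in every period of the year.
flaky_year = [Period(running=10, disconnected=5, can_launch=True)] * PERIODS_PER_YEAR
print(annual_uptime_pct(flaky_year) >= 99.95)   # still meets the commitment

# A total outage lasting 53 five-minute periods (4 hours 25 minutes).
outage = [Period(running=10, disconnected=10, can_launch=False)] * 53
print(annual_uptime_pct(outage) >= 99.95)       # falls just short
```

Under this reading, the year of constant partial failures scores a perfect 100%, while only a complete outage counts against the commitment.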
If the entire EC2 service is down for more than a cumulative four hours and twenty-three minutes in a year, customers must furnish proof of the outage to Amazon to be eligible for the 10% credit. This seems like an onerous process for very little compensation, and it isn’t in line with Amazon’s famous “Relentless Customer Obsession”. Amazon takes monitoring very seriously and should take the lead by tracking, reporting, and proactively compensating customers when it lets them down.