Amazon's S3 Gets An SLA

Amazon Web Services now has an SLA for S3. S3 (storage) and EC2 (virtual servers) have been adopted by many web companies to save money, but until now those companies have just had to trust that the services would stay up (Radar post). Jeff Barr made the SLA announcement on the AWS blog:

I am very happy to announce that, effective October 1, 2007, The Amazon S3 Service Level Agreement is in effect.

This SLA has been in the works for a while and we take the commitments made in this document quite seriously. We knew that S3 had to meet the very high performance and reliability goals set by our internal clients. We strongly believed that meeting this level of operational excellence would be good enough for our external users as well. Before we published our SLA, we wanted to get a better sense of how our external developers were making use of S3. With well over 5 billion objects under management, we now understand the usage patterns and properties needed to make an informed commitment.

You can read the entire document to see how this will work. Basically, we commit to 99.9% uptime, measured on a monthly basis. If an S3 call fails (by returning a ServiceUnavailable or InternalError result) this counts against the uptime. If the resulting uptime is less than 99%, you can apply for a service credit of 25% of your total S3 charges for the month. If the uptime is 99% but less than 99.9%, you can apply for a service credit of 10% of your S3 charges.
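To make the tiers concrete, here is a minimal sketch of the credit rule as the announcement describes it. The function names, the 30-day (720-hour) month, and the $10,000 bill are assumptions for illustration only. The time-based uptime helper is also a simplification: the announcement says uptime is charged against failed S3 calls, and the SLA document itself defines the exact measurement.

```python
def service_credit_percent(uptime_percent: float) -> int:
    """Credit tier from the announcement, as a percent of the month's S3 charges.

    Below 99% uptime -> 25%; from 99% up to (but not including) 99.9% -> 10%;
    99.9% or better -> no credit.
    """
    if uptime_percent < 99.0:
        return 25
    if uptime_percent < 99.9:
        return 10
    return 0


def uptime_from_downtime(downtime_hours: float, hours_in_month: float = 720.0) -> float:
    """Simplified, time-based uptime percentage (an assumption, not Amazon's method).

    Assumes a 30-day month by default and treats downtime as wall-clock hours.
    """
    return 100.0 * (1.0 - downtime_hours / hours_in_month)


if __name__ == "__main__":
    monthly_bill = 10_000.00  # hypothetical S3 bill in dollars
    uptime = uptime_from_downtime(7.0)  # 7 hours of downtime in the month
    credit = service_credit_percent(uptime)
    print(f"uptime {uptime:.2f}% -> credit {credit}% = ${monthly_bill * credit / 100:,.2f}")
```

Under these assumptions the 99% threshold sits at 7.2 hours of downtime, so 7 hours of downtime still lands in the 10% tier while 8 hours falls into the 25% tier. The credit is also something you apply for, so the function only tells you what you would be eligible to request.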

The addition of an SLA for S3 is good news for companies that no longer wish to deal with their own physical infrastructure. EC2 is still in Beta, so it is not surprising that they have not added an SLA for that service (though hopefully it will come). However, S3 is still missing some features that Artur requested previously:

Change the T&C to at least promise paying customers a certain number of days’ notice if they choose to shut the service down.

Publish their current uptime and availability to their customers.

Show you how many copies of a file exist, and how quickly a file uploaded to them becomes redundant.

Missing features or not, this was a great move for Amazon and one that is sure to increase their sales.

Update: I missed the section in Amazon’s SLA where they talk about terminations. Amazon pointed this out and included the appropriate clause:

I saw your post this morning and noticed what I see as an inaccuracy.
You say at the bottom that AWS hasn’t changed its T&Cs to give
customers notice. That is incorrect–the updated agreement changed
this. Here’s a link to the agreement:
http://www.amazon.com/gp/browse.html?node=3440661, and the relevant text
pasted here:

“3.3.2. Paid Services (other than Amazon FPS). We may suspend your right
and license to use any or all Paid Services (and any associated Amazon
Properties) other than Amazon FPS, or terminate this Agreement in its
entirety (and, accordingly, cease providing all Services to you), for
any reason or for no reason, at our discretion at any time by providing
you sixty (60) days’ advance notice in accordance with the notice
provisions set forth in Section 15 below.”


  • http://www.uptill3.com Adam

    This is great news – we have been investigating the use of S3/EC2 for a new web application, but were very hesitant to depend on anything without even a modest SLA.

    I wish they would add some terms to the SLA about notification of plans to shut down the service for any reason (e.g. provide 3 months’ notice, to assist in migrating away should they have to shut down). This was the other major complaint against the service set, although I guess it applies to almost any service provider.

  • http://swardley.blogspot.com Simon Wardley

    It’s a start, but overdue. Still, it is a good move, and at least they are catching up with something that xcalibre.co.uk has offered with flexiscale.

    Hopefully Amazon will eventually get behind OVF (open virtual machine format) or some equivalent, leading to a future with multiple providers of the same standard, so that we can switch between services without lock-in (whether direct, or indirect because of a lack of alternatives) or exit costs.

    Though they are unlikely to shut down the service, the risks still remain until there are multiple providers of the same standard.

  • http://www.panttaja.com/jim Jim Panttaja

    This is progress – but the SLA does reference section 7.1 of the Amazon Web Services™ Customer Agreement. And this includes the following clause: “your access to and use of the Services may be suspended for the duration of any unanticipated or unscheduled downtime or unavailability of any portion or all of the Services for any reason, including as a result of power outages, system failures or other interruptions”.

    http://www.amazon.com/gp/browse.html?node=3440661#7

    That clause makes 99.9% uptime easy to achieve, and not very meaningful.

  • Nick

    Nirvanix has an SLA, now Amazon has an SLA. I smell an obvious move by Amazon to win back some customers from Nirvanix…

  • http://blogs.smugmug.com/don/ Don MacAskill

    @Brady:

    Artur has some good points, but the last one is a little bogus. No writes are confirmed at S3 until the data has been written redundantly. In other words, “how quickly a file uploaded to them becomes redundant” is zero [insert favorite time measurement unit here]. :)

    @Nick:

    I have a hard time believing anyone really uses Nirvanix anyway. Their company has a terrible track record, they’re more expensive than Amazon, and besides… what good is an SLA without a reputation and finances behind it? Amazon has those, but Nirvanix doesn’t.

  • Erik Paulson

    The whole point of S3 is that you’re not paying much for storage, so refunding a little bit of that price isn’t much comfort when S3 goes down.

    Let’s pretend you’ve got a $10,000 a month S3 bill, which would be a very large S3 user. If S3 goes down for about 7 hours in a month, that’s still just a bit above 99% uptime, so Amazon refunds you $1000.
    If your site was down for nearly 8 hours, does $1000 come close to making up for the lost revenue from your site?

    What S3 really needs is a serious competitor that people trust, and sites should use both of them. The storage cost is so cheap that paying for the storage twice is no big deal. If you’ve got a mostly read workload, your monthly bandwidth costs are the same, and probably what you’re spending the most on anyway. If S3 goes down, no big deal, just shift all the load over to Google/Microsoft/Sun/Whoever. Ideally each provider would have basically the same interface to make it easier to use both at the same time.

    Maybe Nirvanix is that competitor, maybe we have to wait for someone big to jump in, but two options are worth way more than one option with an SLA.

  • Nick

    Don

    Nirvanix is well funded and was the first to have an SLA, unlike Amazon, who now feels the pressure to offer one. Amazon has had documented outages and does not offer any customer support whatsoever; they talk to their customers through a message board. This is a move for Amazon to save face against Nirvanix.

  • Nick

    http://www.datacenterknowledge.com/archives/2007/Oct/02/amazon_ec2_outage_wipes_out_data.html

    Just came across this story, just some facts to back up my claims…

  • http://www.3tera.com/hotcluster.html Bert Armijo

    Erik,

    You’re correct that a credit of 10%, or even 25%, won’t compensate most production users for 7 hours of downtime, but IMHO it isn’t supposed to. Commodity services don’t have the markup to reimburse for consequential damages. In considering an SLA for 3tera’s AppLogic service, our objective is to align our financial objectives with our users’. Simply put, we shouldn’t profit if our users aren’t.

    I believe Amazon’s SLA meets that objective.

  • http://blogs.smugmug.com/don/ Don MacAskill

    @Nick: Webvan was well-funded too.

  • Anonymous

    Don:

    You know that, I know that, I know how many copies they store. That is still not public information.

    Or wasn’t last time I checked.

    SLAs aren’t there to compensate you fully; the compensation part is irrelevant, really. The important part is the commitment and the measurement around it.

  • http://www.webmetrics.com/ecosystemmonitorAWS.html Arthur Meadows

    Webmetrics provides availability & performance monitoring of Amazon AWS.

    For performance over the last hour, you can embed this widget in any webpage / Vista sidebar: http://www.webmetrics.com/widgetdir_amazon.html.

    For more comprehensive performance info, please contact us.

  • http://www.nirvanix.com Patrick Harr

    Don,
    I congratulate Amazon, as I have done publicly multiple times, for their success with S3 and highlighting the value of storage as a service model.

    I do believe, however, that, as with any initial service in a market, there is an opportunity for others to deliver a better service that differentiates itself in many areas and offers choices that best fit the application needs and requirements. It is the reason why you created and started SmugMug when there were clearly many other services available to store and access photos.

    Nirvanix has created a storage delivery service from the ground up that utilizes the latest in clustered file system technology with our own patent-pending Internet Media File System to deliver the best performance and scalability for media-rich applications and services. In addition, we have launched new transcoding, image manipulation and streaming services that dramatically reduce time to market and make it much easier to build and offer new rich media applications and services. Finally, we are building our storage delivery network with nodes in Europe, Asia and the United States that will deliver the optimal user experience for users of services like SmugMug – not just from two locations in the US as S3 does.

    As far as financial strength and credibility, I’d like to address a few points. One, we are backed by leading venture capital firms with rich history in CDNs and storage. The firms have deep pockets for follow on investment in Nirvanix. They are certainly more than happy to talk with anyone that may have a question along those lines. And, just as important, there are several additional firms and strategic company investors, including much larger companies than Amazon, that have expressed more than just interest to invest in Nirvanix. Stay tuned.

    Finally, Nirvanix has much greater availability than Amazon as we use RAID 6 protection and complete backup of every system. Nirvanix has never been down nor have we ever lost any data, even during beta like EC2. We are designed for the highest levels of availability. Just as important to design, one needs to trust the operations management behind it. I just hired on my team Michael Landesman as head of my data center operations and network. Michael was previously responsible for building out and managing Exodus operations in Southern California before moving on to run operations at Savvis and Rackspace. Michael has overseen and managed the most demanding accounts and environments in the world and knows a thing or two about SLAs. In fact, Michael managed multi-million dollar accounts and their 67 page SLAs from the largest software company in the world up there in Seattle.

    Rather than attempt to tear down our company for simply innovating in an area that provides tremendous value for customers, I would suggest focusing on the value of the storage-as-a-service model and its benefits over build-it-yourself, Internet-scale storage infrastructures. I also think it is important to encourage competition in the market, as things like the limited SLA just introduced by S3 would not occur without companies like Nirvanix in the storage-as-a-service space. Think about how barren the photo space would be without innovative companies like SmugMug.

    Regards,
    Patrick Harr
    CEO
    Nirvanix