Mon

Apr 30
2007

Tim O'Reilly

We'd Love To Hear Your S3 Stories...And Numbers

SmugMug has told the world how much money they've saved using S3, and at the Web 2.0 Expo, Jeff Bezos told us how his aerospace company, Blue Origin, spent only $304 using S3 to deliver half a million copies of the video of their test launch. Jeff also mentioned that S3 is now hosting 5 billion data items for its customers. He also told us that on S3's peak day, there were 920 million requests for getting or putting objects. The peak second had 16,000 requests. But he didn't tell us how much storage space those 5 billion objects take up (which would also tell us how much money S3 is taking in at its 15 cents per gigabyte per month) or many other things that inquiring minds would like to know.
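To show what those figures do and don't pin down, here is a rough back-of-the-envelope sketch in Python. The request counts, the Blue Origin bill, and the 15-cent storage price come from the paragraph above; the average object sizes are pure guesses (Amazon hasn't published one), so the revenue figures are illustrations of the missing number, not estimates.

```python
# Back-of-the-envelope arithmetic on the S3 figures quoted above.
SECONDS_PER_DAY = 24 * 60 * 60

peak_day_requests = 920_000_000   # from the post
peak_second_requests = 16_000     # from the post
objects_stored = 5_000_000_000    # from the post
price_per_gb_month = 0.15         # from the post, in USD

blue_origin_bill = 304.0          # from the post, in USD
blue_origin_copies = 500_000      # "half a million copies"


def average_requests_per_second(requests_per_day: int) -> float:
    """Spread a day's requests evenly over 86,400 seconds."""
    return requests_per_day / SECONDS_PER_DAY


def monthly_storage_revenue(objects: int, avg_object_kb: float, price: float) -> float:
    """Storage revenue in USD per month for an ASSUMED average object size.

    avg_object_kb is hypothetical; it is exactly the number we don't have.
    Uses decimal units: 1 GB = 1,000,000 KB.
    """
    gigabytes = objects * avg_object_kb / 1_000_000
    return gigabytes * price


avg_rps = average_requests_per_second(peak_day_requests)
peak_to_average = peak_second_requests / avg_rps
cost_per_copy = blue_origin_bill / blue_origin_copies

print(f"Average requests/second on the peak day: {avg_rps:,.0f}")
print(f"Peak second vs. that average: {peak_to_average:.2f}x")
print(f"Blue Origin cost per video copy: ${cost_per_copy:.6f}")

# Hypothetical average object sizes, in KB.
for avg_kb in (10, 100, 1000):
    revenue = monthly_storage_revenue(objects_stored, avg_kb, price_per_gb_month)
    print(f"If objects average {avg_kb} KB: ${revenue:,.0f} per month in storage fees")
```

The spread in that last loop is the point: without the average object size, the storage revenue could plausibly differ by a factor of a hundred.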

We believe that Amazon's S3 and EC2 are shaping up to be really important. We're hearing anecdotally from a lot of people who are using these services, but we'd love to gather more data. In particular, we have a feature in our Release 2.0 Newsletter called "The Number", in which we do a quantitative drill down on some aspect of new technology. In this month's issue, we explore the controversy about just how much activity there is in Second Life; in the next, we'd love to explore the uptake of S3, and just what it means for both Amazon and the rest of us.

And to do that, we need your help. We'd love to hear from anyone who is using S3 and EC2, but we're especially interested in hearing from folks who are willing to share their numbers and to work with us on analyzing them. Let us know in the comments or send mail to jimmy at guterman.com. (Jimmy is the editor of the Release 2.0 newsletter.)


tags: web 2.0

Comments: 29

  Reuven Cohen, CTO, Enomaly Inc [04.30.07 07:10 AM]

We've moved roughly ten servers from Ev1 & Rackspace to EC2, saving more than $600 a month.

  Jon Henshaw [04.30.07 08:01 AM]

We use it for BaseJumpr, a Basecamp exporter and activeCollab hosting service.

We use EC2 to run all of the Basecamp exports. With EC2 we can run an unlimited number of free and fee-based exports, saving us virtually thousands of dollars a month in what would typically be managed hosting costs. We've also automated the process of turning on and off EC2 servers so they're only running when we need them to.

We use S3 to provide unlimited file storage for activeCollab projects. With S3 we're able to give clients gigs and gigs of space for their project files while still maintaining a profit. It provides secure and reliable storage for our clients, while saving us a ton of money in hosting and storage costs.

Through making these tools, we were able to create PHP-AWS (PHP Amazon Web Services), which is open source and hosted on Google Code. The PHP library offers access to S3, EC2, SQS, and MTurk. It makes AWS possibilities almost endless with PHP.

  Josh Williams [04.30.07 09:29 AM]

We use S3 to host all our file downloads for our crazy social-network icon trading site, IconBuffet. We still host the application on our own servers, but offloading the files has saved us an enormous amount of bandwidth and expense.

  Famousr [04.30.07 09:41 AM]

Within the first week of launching Famousr.com, the site got a lot of attention and produced over 2 million pageviews in one day. That amount of bandwidth was crippling my little server, so I set up an S3 account and moved all the images over in about an hour, allowing my site to withstand the flood. Definitely a lifesaver for my site during a critical traffic surge.

  Nathan of Cruxy [04.30.07 10:10 AM]

Aside from pure cost savings per megabyte, S3 really shines in negating the traditional upfront capital investment in server/storage hardware and CDN/network costs. Let's not even take into account worries and staffing costs around backup, uptime, outages, and so on that have (for the most part) stopped being a concern with S3.

When we were implementing Cruxy last summer, cheap bandwidth was available, but setting up a CDN (through Limelight or Akamai for instance) was very expensive, and hosted storage was even more so. We didn't need anything complex or over the top, but just making sure we could scale reliably and affordably was critical to our media-focused site. Just in the nick of time, Amazon S3 provided a solution for all of these problems in one - unlimited, linear-cost storage with no upfront costs, cheap bandwidth, and a lightweight CDN, all wrapped up in a brand we could trust. We've been online for six months now, and have had almost zero downtime and none of us wears a pager - hallelujah!

I know you are looking for detailed numbers and not just anecdotes, so we'll have those for you soon. I just wanted to make sure your analysis was considering a larger view of "cost" than someone like SmugMug may have considered in the past.

Nathan of Cruxy http://cruxy.com

  Don MacAskill [04.30.07 11:25 AM]

Thanks for the link, Tim. My bad for not having that post linked to the updated numbers that I talked about at ETech, but I just fixed that. :)

  Chris Ritke [04.30.07 12:27 PM]

We're using S3 for storing images and files from our social project collaboration tool at 49sparks.com. It is so cool that we basically only have to worry about text and metadata stored on our own servers. I have seen a few very short outages over the past months that have been a bit worrisome, but it seems to be available most of the time, and response time is always really snappy. I love it.

  Don MacAskill [04.30.07 01:20 PM]

Oh, and in case you're not into flipping through the ETech stuff, we've saved $1M now and have more than 200TB stored on S3. Still in love.

  Jonathan Yapp [04.30.07 01:27 PM]

We make extensive use of S3 for www.everystockphoto.com, and some of our client projects, mostly for photo and video media. In total we have over 5 million objects being stored and served. Other than dealing with occasional timeouts, we have been extremely happy with S3's speed, availability and pricing. It's fabulous not having to worry about scaling file storage. Next step: making good use of EC2.

  Ask Bjørn Hansen [04.30.07 02:10 PM]

It seems to me that there's too much focus just on the "save money on disks and bandwidth" benefit.


Over on YellowBot - http://www.yellowbot.com - we needed some "always available" file storage for a feature. We had a few options:


1) Build it ourselves = maximum control & cost of maintaining code

2) Use MogileFS = good control & cost of operating code

3) Use S3 = enough control & "just use a client library"


A service like "take this file and give it back whenever I need it" really should be (and is now, I suppose!) a commodity service.


I was giving a "Scalability tutorial" at the MySQL conference - http://develooper.com/talks/ - last Monday and seeing it sold out reminded me how much work we still have to do to make this stuff common throughout the application stack.


For shared storage of small files it's definitely S3 for the "80% case". It'll be interesting to see what comes next. The rest of the "scalability toolkit" isn't as easily commoditized (yet)...

- ask

  Avi Flombaum [04.30.07 02:11 PM]

We use S3 and the AWS-Ruby library to store all images and files related to products listed on our social architecture site. While we grow, we know that server storage and bandwidth is one area we don't need to worry about scaling. Long live S3 and Ruby on Rails!

  Sam Penrose [04.30.07 04:29 PM]

Panic software used it to handle a download surge:

http://www.cabel.name/2007/04/coda-one-week-later.html. Summary: "Amazon pretty much saved our e-asses."

(via http://daringfireball.net/)

  Ronen Mizrahi [04.30.07 04:48 PM]

We use S3 to distribute our software (the TVersity Media Server). We use EC2 to host the TVersity web site. We compared it to Rackspace and the savings are mind-boggling. We save roughly $500 a month for every machine we use on EC2 compared to Rackspace, and we save thousands a month on storage, since Rackspace insisted on a SAN solution when we asked them to propose an alternative.

  Domas Mituzas [04.30.07 04:56 PM]

I posted my cost analysis for Wikipedia at http://dammit.lt/2007/05/01/wikipedia-s3-costs/ - it appears the monthly costs are much lower for us if we run our own media system - and we save most on bandwidth.

  Nic Wolff [04.30.07 06:53 PM]

Presently in beta, Foneshow is storing 500 GB or so of "long tail" audio content at S3, to be pulled back to our telephony server on demand when a user wants to hear it on their phone. We can pull a 5MB WAV back from S3 in a few seconds, while we play a "Here's the next show in that series" prompt for the user.

This has let us build out the Foneshow application with minimal initial investment, while offering a full and diverse catalogue of content. Without S3, we'd have had to put a networked storage solution in place; since we're in managed hosting, this would have been a major out-of-pocket expense. Instead, we're spending in beta just $100 per month or so.

We're experimenting with moving the transcoding process to EC2 - so that content would come from publishers straight to EC2 virtual servers to be processed, then be stored in S3 until requested. This would give us the same instant scalability for handling greater volumes of new content that S3 gives us for archiving old content.

  Christian Beaumont [04.30.07 08:51 PM]

To us here at Digini, S3 is a lot more than saving money on storage and bandwidth; it has also saved us countless hours of custom development work to build a fault-tolerant, globally replicated solution. Even if we could afford the IT infrastructure (which we can't), there is no way we would be able to build anything remotely as robust as S3.

We are currently using S3 to enable our community pipeline in our product. Our application, Blade3D, is an XNA-based 3D development tool that is currently in beta. However, just because we are developing a somewhat traditional client application, we didn't want to miss out on the benefits the Internet can offer. We wanted to give our customers the ability to easily exchange multi-megabyte assets such as textures, models and other 3D content from within the product. Additionally, we wanted the help system built into the product to be dynamically updatable and include tutorial videos and other multimedia presentations, since nobody wants to read a manual these days.

Approximately forty percent of our users are using the product from outside the US so it is also important that our content distribution solution works well worldwide. What we found is interesting in that when we make new product releases every couple of weeks, our users in Eastern and Central Europe are able to download just as fast, if not faster than we can here in Seattle. This is fantastic since our internal tests to the same drop sites have terrible bandwidth, especially to Eastern Europe. With this in mind we set up our product update system to use S3 as well.

Finally, we use S3 to host all the images and videos on our website. This is pretty obvious really but it is surprising how much a small change like this helped reduce bandwidth to our rather limited server capacity. We haven’t gone as far as moving the web server itself to EC2 but it is certainly an option we are considering for the future.

Talking of EC2, we would very much like to use it as a way to give added value to the customers using our application. Certain 3D operations (lighting calculations as an example) can take a fair while on a single compute node. We are hoping to provide a way to take some of this wait time away from people using our app. Imagine the user right-clicks a 3D model and hits the "Compute Lighting" menu. Internally we'll package up the model and queue it to a bunch of nodes in EC2. When it's done, Blade3D will pop up a nice message saying the job has completed.

As you can probably tell we’re pretty excited, hope I haven’t wasted too much page space.

  Shiraz Kanga [04.30.07 11:01 PM]

Thanks for providing this information folks.

I am the author of podLoadr (www.podloadr.com) which loads RSS feeds, Web Pages, etc onto your iPod and I've been considering a move to S3 for the downloads. This data will definitely help the decision making process.

  Jim Runsford [05.01.07 02:43 AM]

We are using s3 to offload our hosting of liquid storage media between the point when they are collected from our clients and the subsequent point when they are passed along the relationship chain for redemption. The Java and php/ruby apps we have written to interface with s3 have performed robustly in handling the variably configured media we deal with.

We estimate we have saved approximately $1.23 a month; but the smiles we receive from the local homeless when we are better able to serve them and their motley collections of recyclable cans and bottles, is worth more than gold.

  Ben Widhelm [05.01.07 03:25 PM]

ElephantDrive uses S3 amongst a pool of storage repositories to mirror and occasionally directly deliver our users' backed up data.

While we can't share specific information, we are storing millions of objects at S3, the average size of which is just over half a MB.

At this point we believe the change in pricing announced today will net positively for us, although it is receiving mixed reviews on the message boards for S3. It seems like the incremental costs for PUTs and lists are going to impact anyone passing directly through to S3 or using it as a webserver.

  Benoit Rigaut [05.02.07 05:45 PM]

It was a sort of a no brainer: put all our .flv videos on S3 and just pay the bills!

http://www.trivop.com is intimately tied to the video delivery - we're the first video guide for hotels (a mashup between google maps, pro videos, etc.) - but with S3 we can focus on our true mission: everything that takes place front and back office around these videos.

  Scotty D [05.04.07 09:43 AM]

At FillZ.com, we store all MySQL database backups on S3 for diversity: about 100G of data, compressed and encrypted. We transfer about 120G/month (a duplicity implementation syncs with S3 nightly), making the cost around $40/month. I figure this is about 1/3rd as much as a dedicated hosting/storage solution with zero redundancy. Reliability has been more than we could have hoped for, basically issue free for the year that we've been running it. Occasionally transfers will hang for 20-120 minutes... really pretty insignificant for a backup situation. The one thing I wish S3 would do, and probably never will, is allow file updates. If a small portion of a large file changes, it will cost you the bandwidth and the time to re-upload the entire thing. This also makes a reverse-differential solution like rdiff-backup impractical.
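The "around $40/month" figure can be roughly reproduced with a small sketch like the one below. The storage price is the 15 cents per gigabyte per month quoted in the post; the 20 cents per gigabyte transfer price is an assumption about S3's flat transfer rate at the time, not something stated in this thread, and the 10 GB file is a made-up example.

```python
# Rough reconstruction of the monthly backup bill described above.
STORAGE_PRICE_PER_GB_MONTH = 0.15  # USD, quoted in the post
TRANSFER_PRICE_PER_GB = 0.20       # USD, ASSUMED flat transfer rate (not from the thread)


def monthly_backup_cost(stored_gb: float, transferred_gb: float) -> float:
    """Storage plus transfer charges for one month, in USD."""
    storage = stored_gb * STORAGE_PRICE_PER_GB_MONTH
    transfer = transferred_gb * TRANSFER_PRICE_PER_GB
    return storage + transfer


def reupload_cost(file_gb: float) -> float:
    """Transfer cost of replacing a whole object, in USD.

    With no in-place updates, the size of the change doesn't matter:
    the full file is sent again.
    """
    return file_gb * TRANSFER_PRICE_PER_GB


total = monthly_backup_cost(stored_gb=100, transferred_gb=120)
print(f"Estimated monthly cost: ${total:.2f}")

# Hypothetical 10 GB backup file in which only a few MB changed.
print(f"Re-uploading a 10 GB file: ${reupload_cost(10):.2f}")
```

Under those assumptions the transfer charge, not the storage charge, is the larger share of the bill, which is why the lack of partial file updates matters for a nightly sync.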

  Ben [05.05.07 01:40 PM]

This really sounds like the future direction of all web hosting. There is really no reason to run on individual web servers when there is a low cost solution that is easily scalable and reliable.

  Shuki Lehavi [05.17.07 03:46 PM]

At Gumiyo.com (http://www.gumiyo.com), our users can sell items directly from their cell phones, email or web. We store the media files entirely on S3. We also use EC2 and back up our entire EC2 environment (including our MySQL databases) on S3.

  Anonymous [05.22.07 08:18 AM]

At myDataBus.com we moved our entire storage over to S3 in January and February (yes, it took a while!) and it has turned out to be a really amazing move for us. I doubt we would have survived March trying to scale bandwidth and storage at our own data center. The savings (both cost and headaches) of outsourcing that part of the business have freed us up to spend more time building out features and less time feeding bandwidth and storage to 'the beast!'

This month we started using SQS and EC2 instances for more of our file processing; as that expands I'm starting to see better site performance and smoother scaling. The fact that file transfers inside of the Amazon 'cloud' are free more than pays for the additional expense of running EC2 instances to process and move files around.

  Brian Davis [05.22.07 08:20 AM]

Bah-- somehow I lost track of my name on that last post: myDataBus is at www.mydatabus.com.

  Travis Reeder, Middlepost Corp. [06.06.07 04:27 PM]

We are using 3 of the Amazon Web Services:



1) EC2 for the application software.

2) S3 for long term data storage and document storage.

3) SQS for messaging between our main application servers and our document conversion servers.



EC2 immediately saves money because it's cheaper than hosting pretty much anywhere else. On the flip side, it's much less reliable than other solutions and you don't get any extra features so I guess you get what you pay for.



As for S3, it's hard to say how much it saves in terms of actual costs for just storage, but the big advantage to us is that we never have to worry about scaling out the storage space or backing it up, etc. So to get an actual cost, you'd really have to calculate the maintenance costs of maintaining your own storage. That being said, we know almost exactly how much it will cost us to store millions of documents on S3 and it's so negligible that we hardly need to consider it a cost.



SQS is just one less maintenance problem for us and is much cheaper than running another server just for messaging.



Get rid of your fax machine, use Middlepost Docs

  Terry Jones [06.18.07 07:28 PM]

What do people think of the AWS Terms & Conditions? I posted a few thoughts at

http://www.fluidinfo.com/terry/index.php/2007/06/19/pondering-the-tc-of-amazons-s3-and-ec2/

Terry

  scott [02.04.08 12:39 AM]

Has anyone tried www.ziddu.com??? I find it really good for free file hosting services..

Best regards
Scott

  Sierra [02.18.08 12:29 PM]

Scott, not everyone believes in free hosting. I don't. If I want to share big files, I just pay for a 3G account on Dreamhost or Bluehost; it works perfectly, and I know that no one will ever delete it.
