Apr 24

Tim O'Reilly

Web 2.0 and Databases Part 1: Second Life

As part of the prep for my keynote on Wednesday at the MySQL User Conference, I decided to ask some of my Web 2.0 friends just how they were using databases in their applications. Over the next couple of days, I'm going to post what I heard back. I'm not going to draw any conclusions till the end of the series, but just let people speak for themselves.

In this first installment, a few thoughts from Cory Ondrejka and Ian Wilkes of Linden Lab, creators of Second Life. Cory wrote:

Your timing is, of course, perfect because we're a) in the midst of converting much of our backend architecture away from custom C++/messaging and into web services and b) we spent yesterday afternoon fighting some database cliff that we just hit.
Since I'm about to get on a redeye, let me introduce you to Ian Wilkes, our Director of Operations and architect of Second Life's database and asset backends. Ian can give you the 10 cent tour and certainly has some keynote worthy war stories from 4 years of work on Second Life. Probably starting with the time he had to give Philip Rosedale and me the "flat files are not going to cut it"-talk :-)

From my end, the worst MySQL moment was when, in the midst of a colo move we decided that we could bring the system back up before we had moved our slave database. After all, what are the odds of the primary going down in the 2 hours it would take to schlep the slave over and bring it up? Apparently the odds were 100%.

Separately -- in your no doubt copious free time -- you might enjoy getting a brain dump on our move to web services. I don't think anyone really groks what's going to happen when we fully connect to the web this way . . .

(I definitely want that brain dump, and will pass it along when I get it!) Meanwhile, over to Ian:

Like everybody else, we started with One Database All Hail The Central Database, and have subsequently been forced into clustering. However, we've eschewed any of the general purpose cluster technologies (MySQL Cluster, various replication schemes) in favor of explicit data partitioning. So, we still have a central db that keeps track of where to find what data (per-user, for instance), and N additional dbs that do the heavy lifting. Our feeling is that this is ultimately far more scalable than black-box clustering. Right now we're still in the transition process, so we remain vulnerable to overload. As Cory mentioned, we're moving to an HTTP-based internal communication model in order to improve our flexibility.

I think the biggest lesson we learned is that databases need to be treated as a commodity. Standardized, interchangeable parts are far better in the long run than highly-optimized, special-purpose gear. Web 2.0 applications will require more horsepower with less money than One Database or his big brother One Cluster All Hail The Central Cluster will offer. (After all, a 64-way MySQL Cluster installation is just the budget-friendly version of a Sun E-10000.) Unfortunately, this seems to be the minority view, at least if the dearth of automated db provisioning tools is any indication.

Our most interesting war stories don't generally involve the database - yes, once we lost data and had to roll back the world a few hours, but who else can claim downtime due to grey goo? Perhaps the best illustration of the lesson above is a story of success. Lots of people have memories and/or fears of racing to the colo to fix the one machine that's bringing down the system; we can bring spare dbs online from the comfort of our own homes, and worry about repairs at our leisure. "I can add database capacity in my underwear!"
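
For readers who want to picture the "central db that keeps track of where to find what data" arrangement Ian describes, here is a minimal Python sketch of directory-based partitioning. It is only an illustration under stated assumptions, not Linden Lab's actual design: the class name ShardDirectory, the shard names, and the "fewest users wins" placement rule are all hypothetical, and an in-memory dict stands in for the central database.

```python
from dataclasses import dataclass, field


@dataclass
class ShardDirectory:
    """Stands in for the central db: it only records which shard owns which user.

    Hypothetical illustration; a real directory would live in a database table.
    """
    shards: list[str]
    placement: dict[int, str] = field(default_factory=dict)

    def assign(self, user_id: int) -> str:
        """Return the user's shard, placing new users on the least-loaded shard."""
        if user_id in self.placement:
            return self.placement[user_id]
        counts = {name: 0 for name in self.shards}
        for name in self.placement.values():
            counts[name] += 1
        # Assumed placement rule: fewest users wins; ties go to the earliest shard.
        target = min(self.shards, key=lambda name: counts[name])
        self.placement[user_id] = target
        return target

    def lookup(self, user_id: int) -> str:
        """Find the shard holding a user's data; raises KeyError if unassigned."""
        return self.placement[user_id]

    def add_shard(self, name: str) -> None:
        """Bring a spare database into rotation; existing users stay where they are."""
        self.shards.append(name)


directory = ShardDirectory(["db1", "db2"])
for uid in (1, 2, 3):
    directory.assign(uid)
directory.add_shard("db3")
print(directory.assign(4))  # the new, empty shard takes the next user
```

The point of the explicit directory is that adding capacity is just a new row in the lookup table: existing users stay put, and only new (or deliberately migrated) users land on the new machine.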

More entries in the database war stories series: Bloglines and Memeorandum, Flickr, NASA World Wind, Craigslist, O'Reilly Research, Google File System and BigTable, Findory and Amazon, Brian Aker of MySQL Responds.

Comments: 13

  Curious [04.25.06 06:45 AM]

When Ian says "Unfortunately, this seems to be the minority view, at least if the dearth of automated db provisioning tools is any indication."

What does he mean by "automated db provisioning tools"?

  Farhan Mashraqi [04.25.06 11:21 AM]

Hi Tim,
I am a speaker at the MySQL UC (talking about Applied Ruby on Rails) and looking out for you. I am a developer of a Web 2.0 application and would like to forward you my thoughts as to how databases are being used in Web 2.0. Where should I send my input? Thanks

  Tim O'Reilly [04.26.06 11:02 PM]

Curious -- I asked Ian to expand on his comments. Here's what he wrote:

I'm talking about tools for setting up new mysql instances - it gets pretty error-prone when replication is involved. As a first pass, I want to be able to do this sort of thing trivially:

Machine X, which has a blank mysql installation, becomes a slave of master A.
Machine Y, which was a slave of master A, becomes master B
Machine Z, which was master C, becomes a slave of master A

It's difficult now because there are multiple manual steps, which often have long waits for data dumps/imports between them, and which can nuke you if you get them wrong. Some people have solved this already but generally through a series of scripts which they haven't released.

If this stuff were a modular toolset, a higher-order routine could use it to say "These 12 machines are database spares. Take 8 of them and make empty inventory instances."
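
To make Ian's wish list concrete, here is a rough Python sketch of what such a modular toolset might look like. It only builds the list of SQL statements for each role change and does not run anything. The function names, host names, and the "repl" replication user are hypothetical; the statements themselves (STOP SLAVE, CHANGE MASTER TO, START SLAVE, RESET MASTER) are standard MySQL replication commands. The slow, error-prone part Ian mentions, dumping and importing data, is assumed to have happened already, and MASTER_PASSWORD is deliberately left out.

```python
def make_slave_of(master_host, log_file, log_pos, repl_user="repl"):
    """Statements to run on a machine that should replicate from master_host.

    Assumes a data snapshot matching (log_file, log_pos) is already loaded.
    """
    return [
        "STOP SLAVE",
        (
            f"CHANGE MASTER TO MASTER_HOST='{master_host}', "
            f"MASTER_USER='{repl_user}', "
            f"MASTER_LOG_FILE='{log_file}', MASTER_LOG_POS={log_pos}"
        ),
        "START SLAVE",
    ]


def promote_to_master():
    """Statements to run on a slave that should become a master in its own right."""
    # STOP SLAVE detaches it from the old master; RESET MASTER starts a
    # fresh binary log for the new master's own slaves to follow.
    return ["STOP SLAVE", "RESET MASTER"]


def take_spares(spares, count):
    """Split a pool of spare hosts into (hosts to provision, hosts left spare)."""
    if count > len(spares):
        raise ValueError(f"asked for {count} spares, only {len(spares)} available")
    return spares[:count], spares[count:]


# "These 12 machines are database spares. Take 8 of them..."
pool = [f"spare{n:02d}" for n in range(1, 13)]
chosen, remaining = take_spares(pool, 8)
for host in chosen:
    plan = make_slave_of("master-a", "binlog.000001", 4)
    print(host, plan)
```

Machine Z in the list above (an old master becoming a slave of A) would use the same make_slave_of plan, but only after its data had been replaced with a snapshot from A.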

  Brad Greenlee [04.30.06 09:01 AM]

I had a similar thought when reading this, although you could take it a step further. What about a tool that analyzed your database usage and, given a number of slave boxes to configure as it sees fit, automatically configures masters and slaves and distributes your data across those boxes as it sees fit. This would not be a one-time-only process either; it would continue to monitor usage and performance and adjust accordingly.

  RelationalRules [10.12.06 09:46 PM]

I can not believe people are still using MySQL and not a relational database, especially the larger companies that are supposed to have smart people running them. Ruby is a step backwards. I suppose it sells books though.

  Mike [01.15.07 10:17 AM]

There is no such thing as a Relational Database built to date that complies with ALL of the requirements of the relational database definition. For example, Third normal form is often the furthest people do their normalization to, however there are what, like 12 normal forms or something crazy like that? No one database meets all of these requirements to date.

MySQL has its advantages, Oracle has its advantages, but both support only a portion of the SQL standards that are out there, they tend to pick and choose standards, then add their own.

Currently in MySQL 5, it has progressed by leaps and bounds in order to respond to such criticisms, for example, triggers, store procedures, and foreign key relationships are all not supported.

There are people out there who use MS Access for their company data, and people who use Notepad, and people who use file cabinets. Why can you not believe people still use MySQL when there are people out there who still use filing cabinets? File cabinets are very effective in some situations, as is MS Access, as is MySQL, as is Oracle.

Maybe people use it because it's free, it's fast, and it supports almost every feature that MOST people need? My dad's filing cabinet has all the features he requires.

  ryan christensen [01.23.07 10:24 PM]

They are probably using MySQL due to the cost. As they mention, "budget-friendly version of a Sun E-10000". MySQL is the best free database. MSSQL, Oracle, how many licenses would you need? Lots of them...

  ryan christensen [01.23.07 10:26 PM]

"Ruby is a step backwards"

I wouldn't say that, but it's pretty basic ActiveRecord that has been done for some time. Marketed well, that is for sure. The new thing is functional programming to sell more books. Yes, it's very cool and useful, but it's largely to sell books and conferences.

  Ric Johnson [04.11.07 09:36 AM]


2 things:

1. I think you meant "now supported" instead of "not supported" in paragraph 3. :)

2. There are situations where MS Access is effective? :P

  Jorge [10.01.07 02:27 PM]

Hi Mike:

I have one question: can I get flat files from SL?
Is it possible?

Can I create flat files from SL?

Jorge Moreno

  mary.8383 [11.20.07 05:39 PM]

We are cheaper linden dollars

Our ebay ID is mary.8383, we have a large ammount of linden dollars, we will trade with u

within 24 hours, and best service for you.

Welcome you make a purchase.We will do our best to help u here.Have a wonderful day!

  sb [12.18.07 02:18 PM]

Hey, spam! Radar crew, how about purging Mary's shilling for second-life schlock?
