gatesvp1's comments

gatesvp1 · on April 16, 2012

This is a long-outstanding bug, over 2 years old now:

https://jira.mongodb.org/browse/SERVER-863

Obviously you can vote for them to fix the bug. But it's been two years, so I'm not sure it's really high on the priority list.

gatesvp1 · on April 16, 2012

Not a personal attack, as Catch is a neat product, but these numbers are basically irrelevant.

This type of load can easily be handled by a simple SQL box. We did these types of #s with a single SQL Server box 4 years ago, except that your "total documents" was our daily write load.

gatesvp1 · on April 16, 2012

The big thing here is cost.

> If you put the above rules together, you can see that the minimum MySQL deployment is four servers: two in each of two colos...

The ideal scenario is to have 4 "fully equipped" nodes, 2 in each data center. That means having 3 pieces of expensive "by the hour" hardware sitting around doing basically nothing. (and paying 4-5k / computer for MongoDB licenses)

In that scenario you can have everything on instance store and live with 4 copies on volatile storage.

Of course, no start-up wants to commit that many resources to a project. It's far cheaper just to use EBS and assume that the data there is "safe". Is it bad practice? Would I avoid EBS like the plague? You bet!

But it's definitely cheaper and that's hard to beat.

gatesvp1 · on April 16, 2012

> A more "Mongo" approach is to migrate data as you need to; e.g. you pull in an older document and at that time add any missing fields.

I think this works when you're talking about adding fields. But it really breaks down if you need to make some change regarding "sub-objects" or arrays of "sub-objects".

If you have made a modeling mistake and you need to pull out a sub-object, you generally have to simply stop the system and migrate.
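
To make the "migrate as you need to" idea concrete, here is a minimal sketch of the easy case (adding a missing field on read). It uses a plain dict as a stand-in for a collection, and every name in it (`schema_version`, `tags`, `upgrade`, `load`) is a hypothetical chosen for illustration, not anything from MongoDB or from the quoted post.

```python
CURRENT_VERSION = 2  # hypothetical schema version number


def upgrade(doc):
    """Return a copy of doc brought up to CURRENT_VERSION."""
    doc = dict(doc)
    version = doc.get("schema_version", 1)  # unversioned docs count as v1
    if version < 2:
        # Easy case: a missing top-level field just gets a default value.
        doc.setdefault("tags", [])
        doc["schema_version"] = 2
    return doc


def load(store, key):
    """Read a document, upgrading it and writing it back if it is stale."""
    doc = store[key]
    if doc.get("schema_version", 1) < CURRENT_VERSION:
        doc = upgrade(doc)
        store[key] = doc  # persist the migrated form for the next reader
    return doc


# A dict standing in for a collection: one old document, one current one.
store = {
    "a": {"name": "old doc"},
    "b": {"name": "new doc", "tags": ["x"], "schema_version": 2},
}

migrated = load(store, "a")
untouched = load(store, "b")
```

This stays simple because the change is a single top-level field with an obvious default. Pulling a sub-object out into its own documents would touch several records per read, which is where the stop-and-migrate approach tends to win.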

bitops · on April 18, 2012

That's very true. Sometimes you can't do a live migration and you have to bite the bullet.

jeltz · on April 20, 2012

Which means it is not too different from a quality RDBMS like PostgreSQL. For adding (remember not to set a default value on large tables) and removing fields in PostgreSQL, there is no requirement to lock the table for more than an instant. But for complicated schema changes you may have to either force a long lock or use some messy plan involving either replication or a multi-step migration.

gatesvp1 · on April 16, 2012

So that's not actually "safe". If you issue an insert in the default "fire and forget" mode and that insert causes an error (say a duplicate key violation), no exception will be thrown.

Even with journaling on, your code does not get an exception.

Journaling is a method for doing "fast recovery" and flushing to disk on a regular basis. "Write Safety" is a method for controlling how / where the data has been written. So these are really two different things.
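
A toy model may help show why "fire and forget" hides errors. This is not the MongoDB driver API: `ToyCollection`, its `safe` flag, and `DuplicateKeyError` are hypothetical names in an in-memory stand-in. The only thing it illustrates is that the server can reject a write without the caller ever hearing about it, unless the caller asks.

```python
class DuplicateKeyError(Exception):
    """Raised by the toy collection when an acknowledged insert fails."""


class ToyCollection:
    """In-memory stand-in for a collection with a unique _id (an assumption)."""

    def __init__(self):
        self._docs = {}
        self._last_error = None

    def _apply(self, doc):
        # "Server side": a failed write is recorded, not reported back.
        if doc["_id"] in self._docs:
            self._last_error = "duplicate key: %r" % (doc["_id"],)
        else:
            self._docs[doc["_id"]] = doc
            self._last_error = None

    def insert(self, doc, safe=False):
        self._apply(doc)
        # Only a "safe" write checks the outcome and surfaces the error.
        if safe and self._last_error is not None:
            raise DuplicateKeyError(self._last_error)

    def count(self):
        return len(self._docs)


coll = ToyCollection()
coll.insert({"_id": 1, "v": "first"})
coll.insert({"_id": 1, "v": "second"})  # rejected, but no exception is raised

try:
    coll.insert({"_id": 1, "v": "third"}, safe=True)
    raised = False
except DuplicateKeyError:
    raised = True  # the same failure is now visible to the caller
```

Journaling does not appear anywhere in this sketch, which is the point: whether the server flushes to disk quickly is a separate question from whether the client waits to learn if the write succeeded.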