Tuesday, September 14, 2010

MongoDB "good enough practices"

I have been using MongoDB for about a year, for both customer jobs and my own work, and I have a few practices that are worth sharing:

I use two levels of backup, and I vary the details according to how important or replaceable the data is. The first level is periodic rolling backups to S3. This is easy enough to do with cron, by putting something like this in crontab (this entry runs every Tuesday at 16:05):
5 16 * * 2 (cd /mnt/temp && rm -rf *.dump* && /usr/local/mongodb/bin/mongodump -o myproject_tuesday.dump > /mnt/temp/mongodump.log && /usr/bin/zip -9 -r myproject_tuesday.dump.zip myproject_tuesday.dump > /mnt/temp/zip.log && /usr/bin/s3cmd put myproject_tuesday.dump.zip s3://mymongodbbackups)
The second level of backup is to always run at least one master and one read-only slave. By design, MongoDB's preferred method for robustness is replicating mongod processes across multiple physical servers. Choose either a master/slave or a replica set installation, but don't run just a single mongod.
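
The rolling S3 backup above can also be sketched as a small script that derives the weekday name itself, so one cron entry covers every day and at most seven archives exist at a time. This is only a sketch under assumptions: the paths, project name, and bucket are the placeholders from the crontab entry, the function names dump_name and run_backup are made up for illustration, and `date +%A` is assumed to print English weekday names (it depends on the locale).

```shell
#!/bin/sh
# Rolling backup sketch: one dump per weekday, overwritten a week later.
# Paths and bucket are placeholders copied from the crontab entry above.
WORK_DIR="${WORK_DIR:-/mnt/temp}"
MONGODUMP="${MONGODUMP:-/usr/local/mongodb/bin/mongodump}"
BUCKET="${BUCKET:-s3://mymongodbbackups}"

# dump_name PROJECT [WEEKDAY] prints e.g. "myproject_tuesday.dump".
# WEEKDAY defaults to today's name (locale-dependent).
dump_name() {
    project="$1"
    weekday="${2:-$(date +%A)}"
    # Lowercase the weekday so names match the crontab example.
    weekday=$(printf '%s' "$weekday" | tr '[:upper:]' '[:lower:]')
    printf '%s_%s.dump\n' "$project" "$weekday"
}

# run_backup PROJECT: dump, compress, upload; stop at the first failure
# so a broken dump is never uploaded over last week's good archive.
run_backup() {
    name=$(dump_name "$1") || return 1
    cd "$WORK_DIR" || return 1
    rm -rf -- "$name" "$name.zip"
    "$MONGODUMP" -o "$name" > mongodump.log 2>&1 || return 1
    zip -9 -r "$name.zip" "$name" > zip.log 2>&1 || return 1
    s3cmd put "$name.zip" "$BUCKET"
}

# Only do real work when invoked as: <script> run PROJECT
if [ "${1:-}" = "run" ]; then
    run_backup "${2:?project name required}"
fi
```

Saved under a name of your choosing and made executable, the script would be called from a daily cron entry with the arguments `run myproject`. Because each weekday's archive uses the same S3 key every week, old backups are replaced without any separate cleanup step.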

I often need to do a lot of read operations, either for analytics or simply to serve up processed data. Always read from a read-only slave unless your application cannot tolerate the small consistency hit (it takes a very short amount of time to replicate master writes to slaves). For applications that need to both read and write, either keep two connections open (one to the master for writes and one to a slave for reads) or use a MongoDB object mapper like Mongoid that supports separate read and write mongods.
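
As a rough illustration of the "two connections" idea, here is how the split might look with a replica set and a current driver, where the read target is chosen through the standard `readPreference` connection string option rather than by connecting to a slave directly. This is an assumption layered on top of the post, which describes a master/slave setup; the host names, database name, replica set name, and the mongo_uri helper are all hypothetical.

```shell
#!/bin/sh
# Hypothetical deployment details; replace with your own.
HOSTS="db1.example.com:27017,db2.example.com:27017"
DB="myproject"
REPLICA_SET="rs0"

# mongo_uri READ_PREFERENCE prints a connection string for that preference.
mongo_uri() {
    printf 'mongodb://%s/%s?replicaSet=%s&readPreference=%s\n' \
        "$HOSTS" "$DB" "$REPLICA_SET" "$1"
}

# Writes (and reads that must be fully consistent) go to the primary.
WRITE_URI=$(mongo_uri primary)

# Analytics and bulk reads prefer a secondary and accept slightly stale data.
READ_URI=$(mongo_uri secondaryPreferred)

echo "write connection: $WRITE_URI"
echo "read connection:  $READ_URI"
```

The application then opens one connection per URI and sends each operation to the appropriate one. `secondaryPreferred` falls back to the primary when no secondary is available, which keeps reads working during maintenance at the cost of extra load on the primary.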

Another thing I try to do is to place applications that perform high-volume reads on the same server that runs a MongoDB slave. Because those reads never leave the machine, this eliminates network bandwidth issues for high-volume "mostly read" applications.
