I have a 3-member replica set with one primary and two secondaries. My current backup strategy is: I have set up a cron job that uses mongodump to back up one of the secondaries and uploads the dump to Amazon S3.
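For reference, that setup fits in a single crontab entry. The hostname, local path and bucket name below are placeholders, and note that `%` must be escaped as `\%` inside a crontab:

```shell
# Nightly dump of a secondary at 02:00, then upload to S3.
# Host, paths and bucket name are hypothetical placeholders.
0 2 * * * mongodump --host secondary1.example.com --out /backups/mongo-$(date +\%F) && aws s3 sync /backups/mongo-$(date +\%F) s3://my-backup-bucket/mongo/$(date +\%F)/
```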
What strategy do you use? Would love to know suggestions/best practices to follow for backups.
mongodump is not a recommended approach if your data size is huge, say >100 GB. It takes a long time to back up, and even longer to restore. The best way is to take a file system snapshot. Refer to this guide - docs.mongodb.org/v3.0/tutorial/backup-with-filesy…
For a sharded cluster, refer to this guide - docs.mongodb.org/v3.0/tutorial/backup-sharded-clu…
We use mongodb.com/cloud, a service from MongoDB Inc. that provides backup, automation and monitoring.
I personally would do something similar. Next week I will start writing a service for my home automation, part of which will be backing up the databases. Maybe then I will have another solution, but currently I would also use a cron job.
Something I've seen used on PostgreSQL databases, which could also be used on MongoDB with a little bit of work: RabbitMQ is added in front of the database, and any insert/update/delete done on the primary DB is thrown onto a fanout exchange, which fans it out to other databases where the same operation is then applied.
So if you have 3 queues to fan out to, any operation done on the primary database is replicated via RabbitMQ to 3 other databases without a performance penalty. If you need to do maintenance on a database, simply switch to a backup copy of the DB; any updates you're missing will be queued on RabbitMQ and applied as soon as the DB becomes available.
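The key property described above is that an offline consumer's messages accumulate in its queue and are replayed on reconnect. Here is a minimal in-memory sketch of that idea in Python (it stands in for RabbitMQ rather than using it, and all names are illustrative):

```python
from collections import deque

class FanoutExchange:
    """Minimal in-memory stand-in for a RabbitMQ fanout exchange."""
    def __init__(self):
        self.queues = {}            # replica name -> pending operations

    def bind(self, name):
        self.queues[name] = deque()

    def publish(self, op):
        # A fanout exchange copies every message to all bound queues.
        for q in self.queues.values():
            q.append(op)

class Replica:
    def __init__(self, name, exchange):
        self.data = {}
        self.queue = exchange.queues[name]
        self.online = True

    def drain(self):
        # Apply queued operations; while offline they simply accumulate.
        while self.online and self.queue:
            op, key, value = self.queue.popleft()
            if op == "set":
                self.data[key] = value
            elif op == "delete":
                self.data.pop(key, None)

exchange = FanoutExchange()
exchange.bind("replica_a")
exchange.bind("replica_b")
a = Replica("replica_a", exchange)
b = Replica("replica_b", exchange)

exchange.publish(("set", "user:1", "alice"))
a.drain(); b.drain()

b.online = False                    # replica_b is down for maintenance
exchange.publish(("set", "user:2", "bob"))
a.drain(); b.drain()                # b applies nothing while offline

b.online = True
b.drain()                           # backlog is replayed once b is back
print(a.data == b.data)             # True
```

The deques play the role of durable RabbitMQ queues: the maintenance window on `replica_b` costs nothing but a backlog that is drained on reconnect.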
So this is effectively backup via replication.
I use 3 approaches for backup:
Vasan Subramanian
I use periodic mongodump, as well as a copy of the transaction logs so that the DB can be re-created from scratch if required. But that's my special use case, it may not be good for you. In production workloads such as Hashnode itself, you'll need to think of the following:
In my experience, I have had to use backups just a couple of times due to hardware failure, but many a time due to human error, including application bugs.
Availability
MongoDB replication takes care of this. If the primary goes down, MongoDB automatically switches over to one of the replicas. You will need monitoring mechanisms to alert you of failures so that you can quickly bring back another node as a replica.
But you can't rely on replication to mitigate disasters or human errors -- if someone drops a collection by mistake, the collection will disappear in the replicas as well!
Disaster
If you can't afford to lose any data, even on disasters, you should be thinking of having replicas across availability zones or even regions. Typically, availability zones within a region are isolated enough such that a disaster in one zone will not affect the other. But a tsunami can destroy an entire region.
The problem with cross-region replication is that it's going to be slow. If you think you really need it, give sufficient attention to the write concern that you use, and test this out properly before you put this in production.
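One knob worth knowing here: the write concern can be set directly in the connection string via the standard `w` and `wtimeoutMS` URI options. A hypothetical cross-region setup (hostnames and replica-set name are placeholders) might require majority acknowledgement but cap the wait, so a slow remote region cannot block writes indefinitely:

```python
# Hypothetical connection string for a cross-region replica set:
# require acknowledgement from a majority of members, but give up
# waiting after 5 seconds.
hosts = "mongodb://db-us.example.com,db-eu.example.com"
uri = hosts + "/?replicaSet=rs0&w=majority&wtimeoutMS=5000"
print(uri)
```

Testing what happens when `wtimeoutMS` actually expires is exactly the kind of scenario to rehearse before production.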
If you are OK with losing some data, say one day's worth, in the unlikely event of a disaster, then it's fine not to have special handling for this; instead, just rely on the periodic dump as described below.
Human Errors
To recover from human errors, you must have the ability to do Point-in-time-recovery (PITR). The simplest form is a periodic dump as you are doing, but keep the old dumps. This could be mongodump (smaller) or the file-system based snapshot (faster) if your data center provider supports it. And store it on a reliable storage system such as S3 (especially if you are using this mechanism to handle disasters as well).
I have, in the past, kept one dump for every day of the week (ie, the cron-jobs would overwrite the Monday backup every Monday), and one for the 1st of every month. If you have enough storage, you could easily keep every dump rather than do the rotation.
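That rotation scheme boils down to a naming convention: one slot per weekday that gets overwritten weekly, plus one slot per month for dumps taken on the 1st. A small sketch of how the backup name could be derived (the naming format is my own, not from the original setup):

```python
from datetime import date

def backup_name(d: date) -> str:
    """Slot under which the dump for day `d` is stored.

    Dumps taken on the 1st go to a monthly slot; all other dumps go to
    a weekday slot, so e.g. the Monday backup is overwritten every Monday.
    """
    if d.day == 1:
        return f"monthly-{d:%Y-%m}"
    return f"weekly-{d:%A}"             # e.g. "weekly-Monday"

print(backup_name(date(2023, 5, 1)))    # monthly-2023-05
print(backup_name(date(2023, 5, 8)))    # a Monday -> weekly-Monday
```

Uploading each dump to a key with this name on S3 gives you the rotation for free: writing to an existing key simply overwrites last week's copy.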
If your data is big and it takes a lot of time and/or resources to create that dump, you need to think of incremental backups. You could use the oplog yourself, or indirectly via tools such as Tayra. I have used a similar technique with PostgreSQL's Write-Ahead Log (WAL) to implement PITR, but not with MongoDB.
But remember that PITR using oplog / WAL is quite complex to set up. It gets even more complex if you have to use S3. The restore is not quite straightforward either. Remember also that you'll probably be using PITR the most often, so the simpler it is, the better.
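To make the PITR idea concrete: the invariant is that a full dump taken at some checkpoint, plus the oplog entries after that checkpoint, is enough to rebuild the state at any later instant. This is a purely conceptual sketch (in-memory data, illustrative op format, not a real MongoDB API):

```python
def restore(dump, dump_ts, oplog, target_ts):
    """Rebuild state at target_ts from a full dump plus oplog entries.

    dump    -- snapshot of the data taken at timestamp dump_ts
    oplog   -- list of (ts, (op, key, value)) entries, ordered by ts
    """
    state = dict(dump)
    for ts, (op, key, value) in oplog:
        if dump_ts < ts <= target_ts:   # replay only the incremental tail
            if op == "set":
                state[key] = value
            else:                        # "delete"
                state.pop(key, None)
    return state

oplog = [
    (1, ("set", "a", 1)),
    (2, ("set", "b", 2)),               # full dump taken here, ts=2
    (3, ("set", "a", 99)),
    (4, ("delete", "b", None)),         # the human error we want to undo
]
dump = {"a": 1, "b": 2}

print(restore(dump, 2, oplog, 3))       # {'a': 99, 'b': 2}
print(restore(dump, 2, oplog, 4))       # {'a': 99}
```

Restoring to ts=3 instead of ts=4 is the whole point of PITR: it recovers the state from just before the accidental delete. The real-world complexity mentioned above comes from capturing, shipping and ordering those oplog entries reliably, not from the replay logic itself.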
I suggest you start with mongodump or a file-system based snapshot on one of the secondaries, as you are doing, until it starts affecting performance. You could think of incremental backups at that point.
Take a look at Backup vs. Replication for an even more detailed discussion.