Choosing the best strategy to back up data from Firestore.

Written by Benjamin Ssempala

One of the most critical security practices in managing any database is setting up regular backups. Data loss may not be a common occurrence, but when it happens, its effects are unpleasant at best and disastrous at worst.

Firebase and Firestore provide a fully managed platform for mobile, web, and server development. Unfortunately, there’s no “big yellow button” to set up the backup process, at least not yet, so we have to set it up ourselves, often facing a plethora of tools, methods, and platforms to choose from.

CHOOSING A PLATFORM

There are various cloud service providers out there, but unless you’ve got a good reason, I’d advise sticking to the big three: Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP). Any of these should sufficiently handle your organization’s needs. Each provider has its own advantages over the others in the number of services available, ease of integration, reliability, and fit with what your company may already be using.

AWS and Azure have both been around longer than GCP and offer more services, with AWS having the most computing capacity and the largest catalog of the three. Azure is relatively cheap for most services and has an impressive set of AI, ML, and analytics offerings. However, GCP harmonizes well with other Google products, and Firebase is one of them. Firebase storage is powered by Google Cloud Storage, so stored files can be easily accessed by other projects running on GCP. Also, since Firebase and GCP share the same underlying account system, a Firebase project can easily be used with any GCP product.

Furthermore, there’s already sufficient documentation on integrating the two, so you don’t have to scour the internet for hours when you need guidance. For those reasons and more, GCP is best suited to store your data backups from Firebase Firestore.

CHOOSING A METHOD

Now the work begins. As with the cloud platforms above, there are plenty of ways to implement the backups, but to save yourself the headache, a Cloud Function triggered by a Cloud Scheduler job might be the easiest option. For context, let’s first go through a couple of other alternatives I’ve explored.

Using a cron job

You can use App Engine cron jobs to run scheduled backups of data in Cloud Firestore. It was my first option before I found out about Cloud Functions and Cloud Scheduler, which I opted for as you’ll find out below. Here’s a detailed walkthrough to implement this option with all the files you need. Find the link to the cron job GitHub repo.

Using Node.js scheduling libraries.

This is similar to using cron jobs. Here, you have a lot of scheduling libraries to choose from, listed on Open Base. The advantage is that most of these libraries have sufficient documentation, an active community, and up-to-date analytics and statistics.

Using GitHub Actions.

Another option is using GitHub Actions to schedule the backup. First, you need a GitHub repo with your project. If you are already using one, great! Follow this detailed walk-through to set up your project.

Using Cloud Workflows.

Our second-to-last option is creating a fully managed, serverless, automated workflow that triggers the Firestore export API and places the data into a Cloud Storage bucket. You can follow this detailed walk-through to set up your project.

Using Cloud Functions and Cloud Scheduler.

Finally, our last and recommended option is creating a Node.js Cloud Function that initiates a Cloud Firestore data export, plus a Cloud Scheduler job that calls that function.

It is the solution Firebase recommends in its official documentation. It’s seamless and not very complicated, and it has become my go-to option when setting up a Firestore backup. Here’s a link to the documentation.
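
To make the idea concrete, here is a minimal sketch of what such a function does. It is written in Python rather than Node.js, and it takes the Firestore admin client as a parameter so it can be run without Google credentials. It assumes the client exposes `export_documents(request=...)`, as `FirestoreAdminClient` in the `google-cloud-firestore` package does; the function names, project ID, and bucket name are hypothetical.

```python
from datetime import datetime, timezone


def build_export_request(project_id, bucket, collection_ids=None, now=None):
    """Build the request for a Firestore export into a timestamped bucket folder."""
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y-%m-%dT%H-%M-%S")
    return {
        # Resource name of the default database in the project.
        "name": f"projects/{project_id}/databases/(default)",
        # Each run writes to its own prefix so exports never overwrite each other.
        "output_uri_prefix": f"gs://{bucket}/{stamp}",
        # An empty list asks Firestore to export every collection.
        "collection_ids": list(collection_ids or []),
    }


def scheduled_backup(client, project_id, bucket, collection_ids=None, now=None):
    """Entry point a scheduler job would trigger; returns the export operation."""
    request = build_export_request(project_id, bucket, collection_ids, now)
    # Assumption: `client` behaves like firestore_admin_v1.FirestoreAdminClient.
    return client.export_documents(request=request)
```

In a real deployment the client would be created once with `firestore_admin_v1.FirestoreAdminClient()`, and the function's service account would need permission both to run exports and to write to the bucket.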

BEST PRACTICES FOR ANY BACKUP STRATEGY.

  • TEST THE BACKUP STRATEGY AND RECOVERY.
    Ensure the backup implementation is working correctly, and test the backup and recovery process every now and then. Lots of small issues can silently stop the backup process in the background, for example permission issues, storage setup, and payment issues.
  • HAVE A RETENTION SPAN FOR THE DATA BACKUPS.
    No amount of storage is infinite, so set up a bucket lifecycle to delete or archive data that’s old, duplicated, or no longer of use. Depending on the organization or type of data, this duration can vary from mere hours to months. Conveniently, GCP has a lifecycle feature that applies these rules automatically once you configure it.
  • REGULAR OR AUTOMATED BACKUPS.
    Humans naturally find repetitive tasks uninteresting and tend to neglect them. But you can never know when you’ll need to recover data; if we could predict data loss and back up just in time, we probably wouldn’t need this article. Automation simplifies the process and ensures data is backed up regularly.
  • PRIORITISE OFFSITE STORAGE.
    Storing backups on-site can be disastrous: if a disaster compromises your entire facility, the data backup will likely be compromised too. Cloud-based storage is becoming quite the norm.
  • ENCRYPT BACKUP DATA.
    The backup data needs to be encrypted to add a layer of security and guard against data theft. GCP can automatically encrypt the data for you with a Google-managed key, thereby saving you the hassle, unless, of course, you prefer to manage your own encryption keys.
  • HAVE A COST OPTIMIZATION PLAN.
    It’s important to have a plan to minimize costs while using cloud storage, so the backup process fits the budget constraints of your organization. This can include carefully choosing which data is backed up and checking that the data changes often enough to justify the chosen backup frequency.
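
As an illustration of the retention point above, the sketch below builds a Cloud Storage lifecycle policy that first moves old exports to a cheaper storage class and later deletes them. The function name and the 30-day and 180-day thresholds are hypothetical; choose values that match your own retention span.

```python
import json


def build_lifecycle_policy(archive_after_days, delete_after_days):
    """Return a Cloud Storage lifecycle policy: archive first, then delete."""
    if not 0 < archive_after_days < delete_after_days:
        raise ValueError("archive_after_days must be positive and below delete_after_days")
    return {
        "rule": [
            {
                # Move older exports to a cheaper storage class.
                "action": {"type": "SetStorageClass", "storageClass": "COLDLINE"},
                "condition": {"age": archive_after_days},
            },
            {
                # Remove exports that have outlived the retention span.
                "action": {"type": "Delete"},
                "condition": {"age": delete_after_days},
            },
        ]
    }


if __name__ == "__main__":
    # Print the policy as JSON so it can be saved to a file.
    print(json.dumps(build_lifecycle_policy(30, 180), indent=2))
```

The printed JSON can be saved to a file such as `lifecycle.json` and applied to the backup bucket with `gsutil lifecycle set lifecycle.json gs://<your-bucket>`.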

Afterthoughts

Implementing backups in your service is essential. I sure hope that at some point Firebase gets that “big yellow button” to automate the whole process. Until then, make sure you implement backups for your data to prevent scream-worthy disasters.

I hope this article has given you some insight into choosing your best-fit backup strategy.

