During discussions with customers I’ve sometimes heard incorrect expectations and assumptions when people define their backup and recovery strategy. As a database and, in general, data-centric person, I think it is quite important to understand what Point In Time Recovery (PITR) means and what Google Cloud SQL can and cannot do. The information here is accurate as of September 2022, when the post was written.
Let’s start with point-in-time recovery and how it works. At a high level, PITR should provide the ability to restore and recover your data to a desired point in time in the past, down to the second, so that the result is a consistent dataset as of that moment.
To achieve that goal, database recovery uses a combination of a backup and stored transaction logs. The transaction logs contain sequential records of all the changes applied to the database. The logs have different names, such as binary logs, Write-Ahead Log (WAL) or redo logs, but conceptually they are all designed to store and apply information for recovery purposes. To recover, the database instance is restored from the latest suitable backup that was completed before the PITR time, and then all the changes from the transaction logs are applied, starting from the time when the backup started and up to the PITR time.
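The recovery logic described above can be sketched in a few lines of Python. This is only a conceptual illustration and not how any particular database engine or Cloud SQL implements it: the `Backup` and `LogRecord` types and the `plan_recovery` function are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Backup:
    """Hypothetical backup descriptor used only for this illustration."""
    name: str
    started: datetime
    completed: datetime


@dataclass(frozen=True)
class LogRecord:
    """Hypothetical transaction log record: when a change happened and what it was."""
    timestamp: datetime
    change: str


def plan_recovery(
    backups: Sequence[Backup],
    log_records: Sequence[LogRecord],
    target: datetime,
) -> Tuple[Backup, List[LogRecord]]:
    """Pick the base backup and the log records to replay for a PITR to `target`."""
    # Only backups that completed before the requested point in time are usable.
    candidates = [b for b in backups if b.completed <= target]
    if not candidates:
        raise ValueError("no backup completed before the requested point in time")

    # The latest suitable backup minimizes the amount of log to replay.
    base = max(candidates, key=lambda b: b.completed)

    # Replay every change from the moment the backup started up to the target time.
    to_apply = sorted(
        (r for r in log_records if base.started <= r.timestamp <= target),
        key=lambda r: r.timestamp,
    )
    return base, to_apply
```

The point of the sketch is that both pieces are required: without a backup completed before the target time there is nothing to start from, and without an unbroken sequence of logs from the backup start to the target time the recovery cannot reach the requested moment.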
The post is about backup management for AlloyDB. It might be useful at the time of writing but will probably become obsolete very soon as the tools and API for the service mature. A couple of words about AlloyDB backups and how they are created. The backups are quite different from, for example, the default backups for Cloud SQL. As we know, in Cloud SQL all the backups are bound to the instance. This means that when the instance is deleted, all its backups disappear along with it. It makes sense if the backups behind the scenes are storage snapshots of the database. But in AlloyDB all the backups are decoupled from the cluster and exist on their own. If you delete a cluster, the backups stay. I think it is a much better approach because it protects you from mistakes such as deleting an instance before making a clone or exporting the data. For now, you can see all the backups for existing and deleted instances using the “backups” tab in the console, the gcloud utility, or the GCP REST API.
The GCP Config Connector (CC) is a Kubernetes add-on which allows you to create, change or delete cloud resources outside your cluster. It can deploy various GCP resources such as storage, databases, networks and others by representing them as a set of Kubernetes resources. This gives you a unified approach in which Kubernetes deploys the full stack for your application. The resources can be deployed by kubectl, helm or any CD (Continuous Deployment) tool you use in the organization.
A short disclaimer. I am writing this in the middle of March 2022, and it is possible that by the time you read the blog the information published here is no longer relevant. Cloud products evolve very fast.
I am writing this post to share some observations and potential issues you might have when deploying GCP Memorystore for Redis instances through the Anthos Config Connector (ACC) controller. If you are not familiar with ACC, I strongly recommend reading at least a high-level overview of the product. In essence, this is a Kubernetes add-on which allows you to automatically deploy and manage GCP services by applying a manifest file (YAML or Helm chart) to a Kubernetes cluster with the ACC controller. It allows you to use the Kubernetes cluster as a deployment tool for GCP resources in your organization. This is a really interesting approach and might transform your environment in the cloud. But it brings some security challenges, which I am going to discuss in the blog.
In one of my previous posts I’ve noted that GCP Cloud SQL for SQL Server doesn’t have point-in-time recovery as of March 2022. As a result, the default out-of-the-box backups can only provide an RPO of 24 hours or more. The exact RPO might vary from day to day since you can only specify a window for the backup but not an exact time. So far it seems like the only reasonable approach to reduce the RPO is to schedule on-demand backups, and in this post I am going to show how you can do that using a couple of different approaches.
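To illustrate why the RPO can exceed 24 hours and why more frequent on-demand backups reduce it, here is a small Python sketch that computes the worst-case gap between consecutive backups. The function name `worst_case_rpo`, the sample timestamps and the four-hour backup window are assumptions made up for this illustration; nothing here calls the Cloud SQL API.

```python
from datetime import datetime, timedelta
from typing import Sequence


def worst_case_rpo(backup_times: Sequence[datetime]) -> timedelta:
    """Return the largest gap between consecutive backups.

    A failure just before the later backup of that pair would lose
    everything written since the earlier one.
    """
    if len(backup_times) < 2:
        raise ValueError("need at least two backup timestamps")
    ordered = sorted(backup_times)
    return max(later - earlier for earlier, later in zip(ordered, ordered[1:]))


# Assumed scenario: a 4-hour backup window starting at 01:00.
# Day 1 backup runs at the start of the window, day 2 at the very end.
daily = [datetime(2022, 3, 1, 1, 0), datetime(2022, 3, 2, 5, 0)]
print(worst_case_rpo(daily))  # 1 day, 4:00:00

# Assumed scenario: on-demand backups scheduled every 6 hours.
on_demand = [datetime(2022, 3, 1, h, 0) for h in (0, 6, 12, 18)]
print(worst_case_rpo(on_demand))  # 6:00:00
```

Under these assumptions the daily backups alone leave a gap of up to 28 hours, while a six-hour on-demand schedule caps the potential data loss at six hours.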
Before starting the post, let me clarify that what I am going to describe as the state of readiness of Google Cloud SQL for SQL Server is accurate as of early February 2022. It is quite possible that some things will be different when you read the post.
For the last several months I have been helping some big enterprises adopt Google Cloud Platform (GCP) and, as part of the implementation, a significant number of SQL Server databases were being moved to the GCP Cloud SQL service. But when we started to build the environment in GCP, it became clear that the SQL Server option for Cloud SQL is far inferior not only to some other cloud offerings and on-prem installations but also to other database engines on the same Cloud SQL. In short, SQL Server on the GCP Cloud SQL service lacked some essential features. Here I will try to explain why I think SQL Server in GCP is not mature enough for the enterprise.
Lately I have been working primarily with Google Cloud Platform (GCP) and in particular with its Kubernetes service (GKE). As a result, my daily command line tools are gcloud, kubectl, nomos and others. And while the GCP Cloud Shell is a really amazing environment which doesn’t require any effort to fire up, sometimes it is not possible to use it. When it comes to working from your own laptop, you have different options. You can install tools like the Google Cloud SDK by following several simple steps from the Google website, or you can prepare a docker image and run it in a container. I personally prefer the second way. That way I can periodically update the entire environment without too much effort and can spin up a new environment on any laptop fairly quickly. Here I am sharing what I personally use for my day-to-day activity.