AlloyDB backups management

This post is about backup management for AlloyDB. It may be useful at the time of writing, but it will probably become obsolete fairly soon as the tools and API for the service mature.
First, a couple of words about AlloyDB backups and how they are created. They are quite different from the default backups in Cloud SQL, for example. In Cloud SQL all backups are bound to the instance, which means that when the instance is deleted, all of its backups disappear along with it. That makes sense if, behind the scenes, the backups are storage snapshots of the databases. In AlloyDB, however, the backups are decoupled from the cluster and exist on their own. If you delete a cluster, the backups stay. I think it is a much better approach because it protects against mistakes such as deleting an instance before making a clone or exporting the data. For now you can see all the backups for existing and deleted clusters using the “Backups” tab in the console, the gcloud utility, or a listing through the GCP REST API.

But how can we manage the backups? What do we do if we need to implement our own retention policy or simply delete some backups? Here are a couple of ways we can do that.
The first way is the gcloud utility, which provides commands to list, create, or delete backups for an AlloyDB cluster. For example, we can list the backups with the following command:

bastion$ gcloud beta alloydb backups list
NAME                                                                                                                STATUS  CLUSTER_NAME                                                          CREATE_TIME                     ENCRYPTION_TYPE
projects/sandbox/locations/us-central1/backups/automated-bkp-20220914-15-0b07c719-7397-408d-b4e9-779dbdb4c8cb  READY   projects/475708065648/locations/us-central1/clusters/gleb-alloydb-02  2022-09-14T14:44:24.087185163Z  GOOGLE_DEFAULT_ENCRYPTION
projects/sandbox/locations/us-central1/backups/automated-bkp-20220911-15-5420cf02-3588-45d7-89c6-7d836b210d21  READY   projects/475708065648/locations/us-central1/clusters/gleb-alloydb-02  2022-09-11T14:44:16.261711844Z  GOOGLE_DEFAULT_ENCRYPTION

Or we can delete a particular backup using the gcloud beta alloydb backups delete command.
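
For example, removing one of the backups from the listing above would look roughly like this (the exact set of flags may differ slightly between gcloud releases):

bastion$ gcloud beta alloydb backups delete automated-bkp-20220911-15-5420cf02-3588-45d7-89c6-7d836b210d21 --region=us-central1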

That’s great, but I am lazy and prefer an automated approach. Of course, you could write a bash script and use gcloud there to filter and manage the backups. That might work, but it is not a particularly scalable or reliable approach. Usually I create a function for a service using the service client’s API and trigger it by a Pub/Sub message carrying the call parameters. Cloud Scheduler is responsible for posting the message to the Pub/Sub topic. Here is a basic diagram (made using the GCP diagram tool):


I chose the same workflow for AlloyDB backup management. The only difference was that we didn’t have a client API for AlloyDB yet, so I had to use the REST API instead. I prepared a function written in Go which accepts a Pub/Sub message with a JSON payload where I supply the AlloyDB cluster name, operation type, retention policy, and location. Here is an example of the payload.

{ "project":"sandbox",  "location":"us-central1", "operation":"DELETE", "cluster":"ALL", "retention":105}

The function can either delete or create backups. You can specify a cluster name or put “ALL”, and it will delete all backups in the project for that location according to the retention policy in days. You can find the function source code in my repository on GitHub.
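
Since there was no client library for AlloyDB at the time, the function simply calls the AlloyDB REST endpoints. You can hit the same endpoints with curl to see what the function works with; the sketch below assumes the v1beta API path (which may change as the service matures) and uses BACKUP_ID as a placeholder for one of the backup names from the listing above:

bastion$ curl -s -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://alloydb.googleapis.com/v1beta/projects/sandbox/locations/us-central1/backups"
bastion$ curl -s -X DELETE -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://alloydb.googleapis.com/v1beta/projects/sandbox/locations/us-central1/backups/BACKUP_ID"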

It was also my first function using the 2nd generation of GCP Cloud Functions. In reality I probably didn’t need any of the new features provided by the updated engine, but I thought it was time to move to the new generation for all new projects.

The process was not too different from deploying a 1st gen function, but it asked me to enable some additional APIs.
I already had a Pub/Sub topic created for the AlloyDB operations, and I created the function with an “Eventarc trigger” and the event type “Cloud Pub/Sub”:

You’ve probably also noticed that I’ve modified the runtime configuration, reducing the memory to 128MB.

I set a timeout of 120s just in case and provided the source code.
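
For reference, a roughly equivalent deployment from the command line would look something like the following; the function name, entry point, topic name, and runtime version here are my assumptions rather than the exact values from the console setup:

bastion$ gcloud functions deploy alloydb-backup-mgmt --gen2 --region=us-central1 \
  --runtime=go116 --source=. --entry-point=AlloyDBBackups \
  --trigger-topic=alloydb-backup-topic --memory=128Mi --timeout=120s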

The function was built and published and I could see the new subscriber on my Pub/Sub topic:

The next step was to schedule a daily run in Cloud Scheduler to send the payload to the Pub/Sub topic. The procedure is simple and takes just a couple of minutes.
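
If you prefer the command line to the console, the same schedule can be created with something like the command below, reusing the example payload from above; the job and topic names are placeholders, and the cron expression runs the job daily at 3am:

bastion$ gcloud scheduler jobs create pubsub alloydb-backup-cleanup --location=us-central1 \
  --schedule="0 3 * * *" --topic=alloydb-backup-topic \
  --message-body='{"project":"sandbox","location":"us-central1","operation":"DELETE","cluster":"ALL","retention":105}'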

As a result, all the backups for gleb-alloydb-01 older than 106 days were scheduled to be automatically deleted from the project. In the same way you can schedule creation of on-demand backups by changing the operation to “CREATE”.

I hope this post helps with some basic management automation for the new AlloyDB service. Please let me know if you would like to hear more about AlloyDB or if you have any ideas for automation.

Google Config Connector – from GKE add-on to manual

The GCP Config Connector (CC) is a Kubernetes add-on which allows you to create, change, or delete cloud resources outside your cluster. It can deploy various GCP resources such as storage, databases, networking and others, representing them as a set of Kubernetes resources. This enables a unified deployment approach, using Kubernetes to deploy the full stack for your application. The resources can be deployed with kubectl, helm, or any CD (Continuous Deployment) tool you use in your organization.

Continue reading “Google Config Connector – from GKE add-on to manual”

Anthos Config Connector and Redis security

A short disclaimer: I am writing this in the middle of March 2022, and it is possible that by the time you read the blog the information published here will no longer be relevant. Cloud products are evolving very fast.

I am writing this post to share some observations and potential issues you might run into when deploying GCP Memorystore for Redis instances through the Anthos Config Connector (ACC) controller. If you are not familiar with ACC, I strongly recommend reading at least a high-level overview of the product. In essence, it is a Kubernetes add-on which allows you to automatically deploy and manage GCP services by applying a manifest (a YAML file or Helm chart) to a Kubernetes cluster running the ACC controller. It lets you use the Kubernetes cluster as a deployment tool for GCP resources in your organization. This is a really interesting approach and might transform your environment in the cloud. But it brings some challenges around security, which I am going to discuss in this blog.

Continue reading “Anthos Config Connector and Redis security”

Google Cloud SQL Custom Backups

In one of my previous posts I noted that GCP Cloud SQL for SQL Server doesn’t have point-in-time recovery as of March 2022. As a result, the default out-of-the-box backups can only provide an RPO of 24 hours or more. The exact RPO might vary from day to day, since you can only specify a window for the backup but not an exact time. So far it seems like the only reasonable way to reduce the RPO is to schedule on-demand backups, and in this post I am going to show how you can do that using a couple of different approaches.
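
For context, taking a single on-demand backup is just one command; a minimal sketch with a placeholder instance name looks like this, and the rest of the post is about scheduling it:

gcloud sql backups create --instance=my-sqlserver-instance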

Continue reading “Google Cloud SQL Custom Backups”

Is Google Cloud SQL Server enterprise ready?

Before starting the post, let me clarify that the state of readiness of Google Cloud SQL for SQL Server I am going to describe is accurate as of early February 2022. It is quite possible that some things will be different by the time you read the post.

For the last several months I have been helping some big enterprises adopt Google Cloud Platform (GCP), and as part of the implementation a significant number of SQL Server databases were moving to the GCP Cloud SQL service. But when we started to build the environment in GCP, it was clear that the SQL Server option for Cloud SQL is much inferior not only to some other cloud offerings and on-prem installations but also to the other database engines on the same Cloud SQL. In short, SQL Server on the GCP Cloud SQL service lacked some essential features. Here I will try to explain why I think SQL Server in GCP is not yet mature enough for the enterprise.

Continue reading “Is Google Cloud SQL Server enterprise ready?”

Docker image to work with GKE in GCP

Lately I have been working primarily with Google Cloud Platform (GCP) and in particular with its Kubernetes service (GKE). As a result, my daily command-line tools are gcloud, kubectl, nomos and others. And while the GCP Cloud Shell is a really amazing environment which requires no effort to fire up, sometimes it is not possible to use. When it comes to working from your own laptop, you have different options. You can install tools such as the Google Cloud SDK following several simple steps from the Google website, or you can prepare a docker image and run it in a container. I personally prefer the second way. In that case I can periodically update the entire environment without too much effort and can easily spin up a new environment on any laptop fairly quickly. Here I am sharing what I personally use for my day-to-day activity.
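
As a rough illustration (the gcp-tools image name is just a placeholder for whatever image you build), running such a container and reusing your local gcloud and kubectl configuration might look like this:

docker run -it --rm \
  -v $HOME/.config/gcloud:/root/.config/gcloud \
  -v $HOME/.kube:/root/.kube \
  gcp-tools:latest bash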

Continue reading “Docker image to work with GKE in GCP”

New kid in the block – Rocky linux.

If you’ve been following the recent changes in the Linux world, you probably remember how Red Hat and CentOS announced in December 2020 that the CentOS Project was shifting focus to CentOS Stream and that support for CentOS Linux 8 would be cut short at December 31, 2021. It created a wave of discussions in the community about the future of CentOS as an enterprise platform, and some people started to look for alternative Linux distributions. As a result we got a new, community-driven downstream build, the same as CentOS used to be: Rocky Linux.

A downstream build is based on the same code base as the vendor distribution and resembles most features of the “parent” vendor Linux, following all releases after they have been built by the vendor. In most of my tests I use Oracle Linux when I am in the Oracle cloud, but I use CentOS in Google Cloud and other public clouds like Azure or AWS. Now we have Rocky Linux available on those platforms, and I’ve had a quick look and done some testing with Rocky Linux 8.4 (Green Obsidian).

Continue reading “New kid in the block – Rocky linux.”

Google Bare Metal in numbers.

In the previous posts I shared my first impressions of the Google Bare Metal Service (BMS) and how to start using it. In this post I will try to show some numbers related to the performance of the solution so you can compare it with your existing environment.

Let me start with the box characteristics. For my tests I was using an “o2-standard-32-metal” box located in the us-west2 region (Los Angeles). The solution was configured with a 2Gbps interconnect and had a couple of storage resources attached to it. The first was represented by two 512GB disks based on HDD storage, where I placed my binaries and a recovery ASM disk group, and the second was a 2TB “all flash” volume I used for data. Here is a summary table:

Characteristic    Value
BMS Box type      o2-standard-32-metal
CPU               Intel(R) Xeon(R) Gold 6234 CPU @ 3.30GHz
CPU sockets       2
CPU cores         16
Memory            384 GB
Disk 1            512 GB – Standard disk
Disk 2            512 GB – Standard disk
Disk 3            2048 GB – All flash
Network           4 NICs, Speed: 25000Mb/s
OS                Oracle Linux 7.9
BMS box characteristics.

Before starting the tests I updated my Oracle Linux and installed a number of packages required for my Oracle database, as well as packages to test IO and network performance such as fio and iperf3. Here is a summary table of the software and tools used to test the performance; a sample fio run is shown right after the table.

Package       Testing scope
fio           IO performance
stress-ng     CPU, Memory
swingbench    Oracle database performance
SLOB          Oracle database IO
iperf3        Network
oratcptest    Network
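
For instance, a random-read fio run against the all-flash volume could look something like the line below; the file path, block size, and duration are illustrative rather than the exact parameters from my tests:

fio --name=randread --filename=/u02/fio_testfile --rw=randread --bs=8k --size=10G \
    --ioengine=libaio --direct=1 --iodepth=32 --numjobs=4 --runtime=120 --time_based --group_reporting
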
Continue reading “Google Bare Metal in numbers.”

Google Bare Metal – how to start.

In the previous post I shared some of my thoughts on why you would use the Google Bare Metal Service (BMS) and my first impressions of it. In this post I want to talk about the first steps and how you can start working with BMS.

To get your hands on BMS you need to contact your Google Cloud sales representative and order it. That means you need to know your requirements, at least to some extent, and prepare for that. The major preparation steps are described in the Google documentation, and here I will try to go through some of them.

The first main step is to outline your architecture and identify the region for BMS. The service is a regional extension, meaning it is connected to your regional Google Cloud infrastructure by a high-speed, low-latency network interconnect. It makes sense to place it where most of your applications and users are going to be. For example, in my case I chose us-west2 (Los Angeles), which was aligned with my main test app servers and provided the best response time. A 64-byte ping from an app server in the same region was 0.991 ms on average.

Continue reading “Google Bare Metal – how to start.”

Google Bare Metal for Oracle.

Since the first days of working in the Google public cloud there have been debates about the possibility of moving an Oracle workload to GCP. The major concerns were coming not from technical challenges but rather from Oracle’s licensing policies and guidelines. Oracle’s famous document about licensing Oracle software in the public cloud states: “This policy applies to cloud computing environments from the following vendors: Amazon Web Services – Amazon Elastic Compute Cloud (EC2), Amazon Relational Database Service (RDS) and Microsoft Azure Platform (collectively, the ‘Authorized Cloud Environments’)”. So Google Cloud was not listed as an ‘Authorized Cloud Environment’, and it was unclear how to apply Oracle licensing there. I believe it will be sorted out in time, but in the meanwhile Google presented the Bare Metal Service as a platform for Oracle workloads.

Continue reading “Google Bare Metal for Oracle.”