This is the fourth and final post in the Red Hat OpenShift on Nutanix HCI series. The previous three posts covered deploying an OpenShift cluster on Nutanix and installing the Nutanix CSI Operator on OpenShift to consume the rich set of Nutanix Unified Storage services. This post picks up where the third left off, so at this point you should have a fully functional OpenShift cluster backed by Nutanix data services. If you haven’t had a chance to read them yet, please review them here – Blog 1, Blog 2, Blog 3.
Overview
Although Kubernetes was originally designed to run stateless workloads, the technology has matured over time and enterprises are increasingly adopting the platform to run their stateful applications. In a survey conducted by the Data on Kubernetes community, 90% of respondents believe that Kubernetes is ready for stateful workloads, and 70% of them are already running stateful workloads in production, with databases taking the top spot. The ability to standardize different workloads on Kubernetes and to ensure consistency is seen as a key driver of business value.
Nutanix provides an industry-leading HCI platform that is ideal for running cloud-native workloads on Kubernetes at scale. The Nutanix architecture offers strong resilience for both Kubernetes platform components and application data. Each HCI node you add not only scales the Kubernetes compute capacity but also adds a storage controller, which improves storage performance for your stateful applications.
Nutanix Unified Storage is made available to cloud-native applications through the Nutanix CSI driver. Applications consume storage using standard Kubernetes objects such as PersistentVolumeClaims, PersistentVolumes, and StorageClasses. The CSI driver also enables users to take persistent volume snapshots using the VolumeSnapshot, VolumeSnapshotContent, and VolumeSnapshotClass API objects. A snapshot represents a point-in-time copy of a volume and can be used to provision a new volume or to restore an existing volume to the snapshotted state. OpenShift Container Platform deploys the snapshot controller and the related API objects as part of the Nutanix CSI Operator, as described in Blog 3.
In this blog, we will deploy a PostgreSQL database and see how data stored on the Nutanix platform can be recovered in the event of a disaster by leveraging the Nutanix CSI Operator.
Prerequisites
- Install Git
Please clone this GitHub repo before proceeding. It contains the YAML manifests and scripts used throughout this post.
git clone https://github.com/nutanixdev/stateful-app_ocp_nutanix.git && cd stateful-app_ocp_nutanix
- Install the OpenShift CLI (oc)
In this demo, we will use oc, the OpenShift counterpart of kubectl, to manage the cluster. Note that most of these operations can also be performed from the OpenShift web console.
- Install Helm
We will be using the Helm package manager to deploy PostgreSQL.
Note: The OpenShift Container Platform also provides OperatorHub in the web console interface, from where you can install numerous applications as Operators.
Verifying Nutanix CSI Operator storage
- Ensure that the CSI pods are in the Running state.
oc get pods -n ntnx-system
- Ensure that the CSI driver secret used to interact with the storage system is accurate.
oc get secret ntnx-secret -n ntnx-system -o jsonpath='{.data.key}' |base64 -d && echo
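The decoded key follows the format documented for the Nutanix CSI driver: prism-endpoint:port:username:password. As a hedged illustration of what to look for, here is a quick local check of that format using a placeholder value in place of the real cluster secret (the IP and credentials below are made up):

```shell
# Placeholder only -- in practice you would populate KEY from the cluster:
#   KEY=$(oc get secret ntnx-secret -n ntnx-system -o jsonpath='{.data.key}' | base64 -d)
KEY="10.0.0.10:9440:admin:s3cret"

# Split the key into its documented fields: endpoint, port, user, password.
PRISM_IP=$(echo "$KEY" | cut -d: -f1)
PRISM_PORT=$(echo "$KEY" | cut -d: -f2)
PRISM_USER=$(echo "$KEY" | cut -d: -f3)
echo "Prism endpoint: $PRISM_IP:$PRISM_PORT (user: $PRISM_USER)"
```

If any field looks wrong, recreate the secret before proceeding; an incorrect key is a common cause of PVC provisioning failures.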
- Create a StorageClass from the provided YAML manifest. We will use Nutanix Volumes to provide block storage. Even if you already created a StorageClass in the previous blog, create this one as well, because the rest of this post assumes a StorageClass named nutanix-volumes.
allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
  name: nutanix-volumes
parameters:
  csi.storage.k8s.io/provisioner-secret-name: ntnx-secret
  csi.storage.k8s.io/provisioner-secret-namespace: ntnx-system
  csi.storage.k8s.io/node-publish-secret-name: ntnx-secret
  csi.storage.k8s.io/node-publish-secret-namespace: ntnx-system
  csi.storage.k8s.io/controller-expand-secret-name: ntnx-secret
  csi.storage.k8s.io/controller-expand-secret-namespace: ntnx-system
  csi.storage.k8s.io/fstype: ext4
  #isSegmentedIscsiNetwork: is-segmented-iscsi-network
  flashMode: ENABLED
  storageContainer: SelfServiceContainer
  #chapAuth: ENABLED | DISABLED
  storageType: NutanixVolumes
  #whitelistIPMode: ENABLED/DISABLED
  #whitelistIPAddr: ip-address
provisioner: csi.nutanix.com
reclaimPolicy: Delete
oc create -f manifests/storageclass.yaml
Install PostgreSQL
Let’s go ahead and deploy the Helm chart for PostgreSQL.
helm repo add bitnami https://charts.bitnami.com/bitnami
helm install postgresql-prod bitnami/postgresql
Please refer to this guide if you wish to tweak the values and customize the installation.
Wait until the database pods are running. Also verify that a PV has been created and that the PVC is bound to it.
oc get statefulset
NAME READY AGE
postgresql-prod 1/1 94s
oc get pods
NAME READY STATUS RESTARTS AGE
postgresql-prod-0 1/1 Running 0 63s
oc get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
data-postgresql-prod-0 Bound pvc-3b93dbf4-05fe-4e57-bac5-f9c696863f6a 8Gi RWO nutanix-volumes 116s
Note: This creates a default user "postgres" for the database.
To get the password for the user, please run this.
$ oc get secret --namespace default postgresql-prod -o jsonpath="{.data.postgres-password}" | base64 --decode && echo
To login to the database, please run this.
$ oc exec -it postgresql-prod-0 -- psql -U postgres -d postgres -p 5432
Workflow
We will populate the PostgreSQL database with some sample data. The database will consist of multiple related tables; a table stores structured data in rows and columns.
We will create two tables, one for Nutanix and the other for Red Hat, and insert values into both. To automate this, a script is provided that inserts the data at different times.
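The repo's data.sh handles the data insertion for you. As a rough, hypothetical sketch of what such a script does (the table layout here is an assumption inferred from the query output shown later in this post, not the actual repo script):

```shell
#!/bin/bash
# Hypothetical sketch of scripts/data.sh -- not the actual repo script.
# Writes the SQL for the first table to a file; the real script would feed
# it to psql inside the pod, then repeat for the redhat table minutes later.
cat <<'EOF' > nutanix.sql
CREATE TABLE nutanix (
  id SERIAL PRIMARY KEY,
  name VARCHAR(50),
  location VARCHAR(50),
  address VARCHAR(100),
  created_on TIMESTAMP
);
INSERT INTO nutanix (name, location, address, created_on)
VALUES ('Karbon', 'USA', 'San Jose', '2009-09-01');
EOF
# Requires a live cluster and POSTGRES_PASSWORD exported:
# oc exec -i postgresql-prod-0 -- \
#   psql "postgresql://postgres:$POSTGRES_PASSWORD@127.0.0.1/postgres" -f - < nutanix.sql
```

The actual script in the repo is the source of truth; this sketch is only meant to make the workflow concrete.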
A second script is also provided that takes volume snapshots of the database at regular intervals. Similar to a StorageClass, the VolumeSnapshotClass object describes the class of storage used when provisioning a volume snapshot.
Note the driver csi.nutanix.com used in the YAML manifest volumesnapshotclass.yaml.
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshotClass
metadata:
  name: nutanix-volume-snapshot-class
driver: csi.nutanix.com
parameters:
  storageType: NutanixVolumes
  csi.storage.k8s.io/snapshotter-secret-name: ntnx-secret
  csi.storage.k8s.io/snapshotter-secret-namespace: ntnx-system
deletionPolicy: Delete
The VolumeSnapshot object is similar to a PVC – it denotes the request for a volume snapshot from a user. Here we dynamically provision a snapshot by specifying a PVC as the data source.
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshot
metadata:
  name: postgresql-snapshot-0
spec:
  volumeSnapshotClassName: nutanix-volume-snapshot-class
  source:
    persistentVolumeClaimName: data-postgresql-prod-0
The second script takes a snapshot of the database volume every minute, five times in total. During this window, the first script modifies the database contents.
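Again, the repo's snapshot.sh is the real implementation; a hypothetical sketch of the loop it runs might look like the following, templating the VolumeSnapshot manifest above once per iteration:

```shell
#!/bin/bash
# Hypothetical sketch of scripts/snapshot.sh -- not the actual repo script.
# Generates one VolumeSnapshot manifest per minute, five times, each pointing
# at the same PVC; the real script would apply each one against the cluster.
for i in 0 1 2 3 4; do
  cat <<EOF > "postgresql-snapshot-$i.yaml"
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshot
metadata:
  name: postgresql-snapshot-$i
spec:
  volumeSnapshotClassName: nutanix-volume-snapshot-class
  source:
    persistentVolumeClaimName: data-postgresql-prod-0
EOF
  # oc create -f "postgresql-snapshot-$i.yaml"  # requires a live cluster
  # sleep 60                                    # one-minute interval
done
```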
Minute | Data Insertion | Snapshot
0      | ✓              | ✓
1      | –              | ✓
2      | –              | ✓
3      | ✓              | ✓
4      | –              | ✓
The table above shows the data objects created after both scripts run successfully. Run the steps outlined below to ingest the data, then verify it in the Data verification section.
Let us ensure that there are no tables in the database currently.
export POSTGRES_PASSWORD=$(oc get secret --namespace default postgresql-prod -o jsonpath="{.data.postgres-password}" | base64 --decode)
oc exec -it postgresql-prod-0 -- psql "postgresql://postgres:$POSTGRES_PASSWORD@127.0.0.1/postgres" postgres -c "\d ;"
Did not find any relations.
- Create the VolumeSnapshotClass.
oc create -f manifests/volumesnapshotclass.yaml
- Run the data generator script.
nohup /bin/bash scripts/data.sh &>/dev/null &
- Run the snapshot generator script.
nohup /bin/bash scripts/snapshot.sh &>/dev/null &
Verify that both scripts are running in the background.
$ jobs
[1]- Running nohup /bin/bash scripts/data.sh &> /dev/null &
[2]+ Running nohup /bin/bash scripts/snapshot.sh &> /dev/null &
Data verification
You can see that the nutanix table was created right after the script started, and that a row has been inserted into it.
oc exec -it postgresql-prod-0 -- psql "postgresql://postgres:$POSTGRES_PASSWORD@127.0.0.1/postgres" postgres -c "\d ;"
List of relations
Schema | Name | Type | Owner
--------+----------------+----------+----------
public | nutanix | table | postgres
public | nutanix_id_seq | sequence | postgres
(2 rows)
oc exec -it postgresql-prod-0 -- psql "postgresql://postgres:$POSTGRES_PASSWORD@127.0.0.1/postgres" postgres -c "SELECT * FROM nutanix ;"
id | name | location | address | created_on
----+--------+----------+----------+---------------------
1 | Karbon | USA | San Jose | 2009-09-01 00:00:00
(1 row)
This is a good time for a coffee break of at least five minutes while we wait for the scripts to finish.
If you check the data again after a while, you should see that the redhat table has also been added.
oc exec -it postgresql-prod-0 -- psql "postgresql://postgres:$POSTGRES_PASSWORD@127.0.0.1/postgres" postgres -c "\d ;"
List of relations
Schema | Name | Type | Owner
--------+----------------+----------+----------
public | nutanix | table | postgres
public | nutanix_id_seq | sequence | postgres
public | redhat | table | postgres
public | redhat_id_seq | sequence | postgres
(4 rows)
oc exec -it postgresql-prod-0 -- psql "postgresql://postgres:$POSTGRES_PASSWORD@127.0.0.1/postgres" postgres -c "SELECT * FROM redhat ;"
id | name | location | address | created_on
----+-----------+----------+----------------+---------------------
2 | OpenShift | USA | North Carolina | 1993-03-26 00:00:00
(1 row)
Finally, verify that all five volume snapshots have been created. Note that they are one minute apart.
oc get volumesnapshot
NAME READYTOUSE SOURCEPVC SOURCESNAPSHOTCONTENT RESTORESIZE SNAPSHOTCLASS SNAPSHOTCONTENT CREATIONTIME AGE
postgresql-snapshot-0 true data-postgresql-prod-0 8Gi nutanix-volume-snapshot-class snapcontent-5e8b96c9-f391-4678-8020-f63fdf299309 8m27s 8m28s
postgresql-snapshot-1 true data-postgresql-prod-0 8Gi nutanix-volume-snapshot-class snapcontent-ffab2731-d502-45c7-8f60-73a185c6125b 7m27s 7m28s
postgresql-snapshot-2 true data-postgresql-prod-0 8Gi nutanix-volume-snapshot-class snapcontent-466db932-c631-4f7f-8c44-65d8e70bb16e 6m28s 6m28s
postgresql-snapshot-3 true data-postgresql-prod-0 8Gi nutanix-volume-snapshot-class snapcontent-e1dab8a1-68ae-4904-8ae8-fe914d2525cf 5m28s 5m28s
postgresql-snapshot-4 true data-postgresql-prod-0 8Gi nutanix-volume-snapshot-class snapcontent-12e32469-d684-4a48-bc3c-5520d08b9296 4m27s 4m28s
Simulating application failure
We can simulate a PostgreSQL database failure by deleting the StatefulSet and the associated PVCs.
helm uninstall postgresql-prod
oc delete pvc data-postgresql-prod-0
Data restoration
Once you verify that the database has been removed, we can restore the data from a snapshot and then deploy the database again.
Let’s say we want to restore the production database from the latest available snapshot. We create a PVC with the fifth snapshot (postgresql-snapshot-4) as its data source.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-postgresql-prod-0
spec:
  storageClassName: nutanix-volumes
  dataSource:
    name: postgresql-snapshot-4
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 8Gi
oc create -f manifests/pvc-prod.yaml
Once we verify the PVC has been bound to the PV, we can go ahead and deploy the database again.
helm install postgresql-prod bitnami/postgresql
Wait a couple of minutes for the database to be in a Running state and then verify the data. We see the redhat table with the expected column values.
oc exec -it postgresql-prod-0 -- psql "postgresql://postgres:$POSTGRES_PASSWORD@127.0.0.1/postgres" postgres -c "\d ;"
List of relations
Schema | Name | Type | Owner
--------+----------------+----------+----------
public | nutanix | table | postgres
public | nutanix_id_seq | sequence | postgres
public | redhat | table | postgres
public | redhat_id_seq | sequence | postgres
(4 rows)
oc exec -it postgresql-prod-0 -- psql "postgresql://postgres:$POSTGRES_PASSWORD@127.0.0.1/postgres" postgres -c "SELECT * FROM redhat ;"
id | name | location | address | created_on
----+-----------+----------+----------------+---------------------
2 | OpenShift | USA | North Carolina | 1993-03-26 00:00:00
(1 row)
Now suppose the dev team wants to roll back to an earlier snapshot to review some changes. We create another PVC, this time with the first snapshot (postgresql-snapshot-0) as the data source.
The YAML manifest is provided as pvc-dev.yaml.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-postgresql-dev-0
spec:
  storageClassName: nutanix-volumes
  dataSource:
    name: postgresql-snapshot-0
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 8Gi
oc create -f manifests/pvc-dev.yaml
Deploy another instance of the database as postgresql-dev.
helm install postgresql-dev bitnami/postgresql
Once the pods are running, we can verify that the nutanix table exists but the redhat table does not!
oc exec -it postgresql-dev-0 -- psql "postgresql://postgres:$POSTGRES_PASSWORD@127.0.0.1/postgres" postgres -c "\d ;"
List of relations
Schema | Name | Type | Owner
--------+----------------+----------+----------
public | nutanix | table | postgres
public | nutanix_id_seq | sequence | postgres
(2 rows)
Summary
The Nutanix cloud platform has consistently been recognized as a leader in unified storage solutions. We have now seen that it can also run containerized stateful workloads in production. Key enterprise use cases such as disaster recovery are addressed through snapshotting and restoring data. The Nutanix CSI Operator also delivers other key features, such as volume expansion and volume cloning, which were not explored in this blog.
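As a taste of volume cloning: a PVC can reference another PVC directly as its dataSource, with no snapshot involved. A minimal sketch, assuming the same StorageClass as above (the name clone-of-prod is our own illustration, not part of the repo):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: clone-of-prod            # hypothetical name, not from the repo
spec:
  storageClassName: nutanix-volumes
  dataSource:
    name: data-postgresql-prod-0   # the source PVC to clone
    kind: PersistentVolumeClaim
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 8Gi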
Nutanix and Red Hat work together to delight customers with a full-stack platform that can build and scale containerized and virtualized applications in a hybrid multi-cloud environment.
Note: If you wish to delete the PostgreSQL database and the associated objects and restore the OpenShift cluster back to the default state, then run the reset.sh script that’s provided.
$ nohup /bin/bash reset.sh &>/dev/null &
To read all parts in this series, please use the links below.