How to Deploy Apache Cassandra on Vultr Kubernetes Engine

Introduction

Apache Cassandra is an open-source distributed NoSQL database designed to handle large volumes of data across multiple commodity servers. Its distributed architecture avoids single points of failure and enables horizontal scalability. Cassandra offers high write and read throughput and excels at write-heavy workloads, making it ideal for data-intensive applications. It also provides tunable consistency to accommodate varying data consistency needs.

The K8ssandra project is a collection of components, including a Kubernetes operator, that automates the management of Apache Cassandra clusters running in Kubernetes. This article demonstrates how to set up a multi-node Apache Cassandra cluster in a Vultr Kubernetes Engine (VKE) cluster using the K8ssandra Operator.

Prerequisites

Before you begin:

  • Deploy a Vultr Kubernetes Engine (VKE) cluster with at least 4 nodes and 4 GB RAM per node
  • Deploy an Ubuntu server on Vultr to use as the development machine
  • Access the server using SSH as a non-root user with sudo privileges
  • Install and configure kubectl to access the cluster
  • Using the Python pip package manager, install the Cassandra cqlsh CLI tool:

        $ pip3 install -U cqlsh

  • Install the Helm package manager:

        $ snap install helm --classic
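
The remaining steps assume these tools are available on your PATH. As a quick sanity check, you can verify each installation before continuing:

    $ cqlsh --version
    $ helm version
    $ kubectl get nodes

The last command should list your VKE worker nodes in the Ready state.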

Install Cert-Manager

Cert-Manager is a Kubernetes operator that issues and manages TLS/SSL certificates within a cluster from trusted authorities such as Let's Encrypt. K8ssandra uses cert-manager to automate certificate management within Cassandra clusters, including creating the Java keystores and truststores that Cassandra requires from those certificates. Follow the steps in this section to install the cert-manager resources required by the K8ssandra Operator.

  1. Using Helm, add the Cert-Manager Helm repository to your local repositories.

         $ helm repo add jetstack https://charts.jetstack.io

  2. Update the local Helm charts index.

         $ helm repo update

  3. Install Cert-Manager to your VKE cluster.

         $ helm install cert-manager jetstack/cert-manager --namespace cert-manager --create-namespace --set installCRDs=true

  4. When successful, verify that all Cert-Manager resources are available in the cluster.

         $ kubectl get all -n cert-manager

     Output:

         NAME                                           READY   STATUS    RESTARTS   AGE
         pod/cert-manager-5f68c9c6dd-stmp6              1/1     Running   0          35h
         pod/cert-manager-cainjector-57d6fc9f7d-gwqr5   1/1     Running   0          35h
         pod/cert-manager-webhook-5b7ffbdc98-sq8kg      1/1     Running   0          35h

         NAME                           TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)    AGE
         service/cert-manager           ClusterIP   10.102.38.47   <none>        9402/TCP   35h
         service/cert-manager-webhook   ClusterIP   10.97.255.91   <none>        443/TCP    35h

         NAME                                      READY   UP-TO-DATE   AVAILABLE   AGE
         deployment.apps/cert-manager              1/1     1            1           35h
         deployment.apps/cert-manager-cainjector   1/1     1            1           35h
         deployment.apps/cert-manager-webhook      1/1     1            1           35h

         NAME                                                 DESIRED   CURRENT   READY   AGE
         replicaset.apps/cert-manager-5f68c9c6dd              1         1         1       35h
         replicaset.apps/cert-manager-cainjector-57d6fc9f7d   1         1         1       35h
         replicaset.apps/cert-manager-webhook-5b7ffbdc98      1         1         1       35h
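
Optionally, confirm that the cert-manager Custom Resource Definitions (CRDs) installed through the installCRDs=true flag are registered in the cluster:

    $ kubectl get crds | grep cert-manager

The list should include resources such as certificates.cert-manager.io and issuers.cert-manager.io.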

Install the K8ssandra Operator

  1. Add the K8ssandra operator repository to your Helm sources.

         $ helm repo add k8ssandra https://helm.k8ssandra.io/stable

  2. Install the K8ssandra operator in your cluster.

         $ helm install k8ssandra-operator k8ssandra/k8ssandra-operator -n k8ssandra-operator --create-namespace

  3. View the cluster deployments to verify that the K8ssandra operator is installed.

         $ kubectl -n k8ssandra-operator get deployment

     Your output should look like the one below. Immediately after installation, the deployments may report 0/1 READY; wait up to 3 minutes and re-run the command until both show 1/1.

         NAME                               READY   UP-TO-DATE   AVAILABLE   AGE
         k8ssandra-operator                 0/1     1            1           10s
         k8ssandra-operator-cass-operator   0/1     1            1           10s

  4. Verify that the K8ssandra operator pods are ready and running.

         $ kubectl get pods -n k8ssandra-operator

     Your output should look like the one below:

         NAME                                               READY   STATUS              RESTARTS   AGE
         k8ssandra-operator-765bcf99bf-7jfmj                0/1     ContainerCreating   0          11s
         k8ssandra-operator-cass-operator-b9cc84556-hb6jv   0/1     Running             0          11s
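
Instead of polling the deployments manually, you can block until each rollout completes. This is an optional convenience check:

    $ kubectl -n k8ssandra-operator rollout status deployment/k8ssandra-operator
    $ kubectl -n k8ssandra-operator rollout status deployment/k8ssandra-operator-cass-operator

Each command returns once the corresponding deployment reports all of its replicas as available.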

Set Up a Multi-node Apache Cassandra Cluster on VKE

  1. Using a text editor such as nano, create a new manifest file cluster.yaml.

         $ nano cluster.yaml

  2. Add the following contents to the file.

         apiVersion: k8ssandra.io/v1alpha1
         kind: K8ssandraCluster
         metadata:
           name: demo
         spec:
           cassandra:
             serverVersion: "4.0.1"
             datacenters:
               - metadata:
                   name: dc1
                 size: 3
                 storageConfig:
                   cassandraDataVolumeClaimSpec:
                     storageClassName: vultr-block-storage
                     accessModes:
                       - ReadWriteOnce
                     resources:
                       requests:
                         storage: 10Gi
                 config:
                   jvmOptions:
                     heapSize: 512M
                 stargate:
                   size: 1
                   heapSize: 256M

     Save and close the file.

     The above manifest defines the Cassandra cluster with the following values:
     • Cassandra version: 4.0.1
     • Three Cassandra nodes in the dc1 datacenter.
     • The vultr-block-storage storage class with a 10 GB volume size per PVC.
     • A 512 MB JVM heap per Cassandra node.
     • A 256 MB JVM heap for the Stargate node.
  3. Apply the deployment to your cluster.

         $ kubectl apply -n k8ssandra-operator -f cluster.yaml

  4. Wait for at least 15 minutes and view the cluster pods.

         $ kubectl get pods -n k8ssandra-operator --watch

     Verify that all pods are ready and running, similar to the output below:

         NAME                                               READY   STATUS    RESTARTS   AGE
         demo-dc1-default-sts-0                             0/2     Pending   0          3s
         demo-dc1-default-sts-1                             0/2     Pending   0          3s
         demo-dc1-default-sts-2                             0/2     Pending   0          3s
         k8ssandra-operator-765bcf99bf-7jfmj                1/1     Running   0          6m21s
         k8ssandra-operator-cass-operator-b9cc84556-hb6jv   1/1     Running   0          6m21s

     When all Cassandra database pods are ready, Stargate pod creation starts. Stargate provides a data gateway with REST, GraphQL, and Document APIs in front of the Cassandra database. The name of the Stargate pod should be similar to demo-dc1-default-stargate-deployment-597b876d8f-559pt.
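
You can also inspect the K8ssandraCluster custom resource to follow the deployment progress. The resource name demo matches the metadata.name value in cluster.yaml:

    $ kubectl -n k8ssandra-operator get k8ssandracluster demo
    $ kubectl -n k8ssandra-operator describe k8ssandracluster demo

The describe output should include status conditions for the dc1 datacenter that indicate when Cassandra and Stargate become ready.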

Verify the Linked Vultr Block Storage PVCs For Cassandra Cluster Persistence

During cluster deployment, the K8ssandra operator creates a StatefulSet, and the Kubernetes StatefulSet controller in turn creates the Cassandra cluster pods. The StatefulSet is the key to data persistence: each Cassandra pod receives its own PersistentVolumeClaim (PVC) that survives pod restarts and rescheduling. Verify the available cluster PVCs to confirm that the Cassandra cluster data is persistent.
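
Because the pods belong to a StatefulSet, each replica's PVC follows the standard <claim-template>-<pod-name> naming pattern. Assuming the server-data claim template name that cass-operator typically uses (an assumption; verify against your own cluster), the claims resemble:

    $ kubectl get pvc -n k8ssandra-operator
    # Expected claim names (assumed "server-data" template):
    #   server-data-demo-dc1-default-sts-0
    #   server-data-demo-dc1-default-sts-1
    #   server-data-demo-dc1-default-sts-2

Deleting a pod does not delete its claim; the replacement pod re-attaches the same volume, which is how each Cassandra node retains its data.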

  1. Verify that the StatefulSet is available and ready in your cluster.

         $ kubectl get statefulset -n k8ssandra-operator

     Output:

         NAME                   READY   AGE
         demo-dc1-default-sts   3/3     15m

  2. Verify the available storage class.

         $ kubectl get sc vultr-block-storage

     Output:

         NAME                  PROVISIONER           RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
         vultr-block-storage   block.csi.vultr.com   Delete          Immediate           true                   36m

     Vultr offers both HDD and NVMe block storage technologies. The Vultr Container Storage Interface (CSI) connects your VKE cluster to Vultr Block Storage and provisions the high-performance NVMe class. The CSI is deployed automatically by the managed control plane in the VKE cluster.

  3. To verify the deployed Vultr Block Storage volumes attached to your VKE cluster, view all cluster PVCs.

         $ kubectl get pvc --all-namespaces

     In addition, open the Vultr Customer Portal and navigate to the Linked Resources tab in your VKE cluster control panel to view the linked resources. You can also verify the linked volumes by navigating to your Vultr account's Block Storage page.
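
To confirm which CSI driver serves these volumes, you can also list the CSI drivers registered in the cluster; the Vultr driver appears as block.csi.vultr.com, matching the PROVISIONER column shown earlier:

    $ kubectl get csidrivers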

Create a Kubernetes Service to Access the Cassandra Cluster

  1. Create a new service resource file service.yaml.

         $ nano service.yaml

  2. Add the following contents to the file.

         apiVersion: v1
         kind: Service
         metadata:
           name: cassandra
           labels:
             app: cassandra
         spec:
           type: LoadBalancer
           externalTrafficPolicy: Local
           ports:
             - port: 9042
           selector:
             statefulset.kubernetes.io/pod-name: demo-dc1-default-sts-1

     Save and close the file.

     The above configuration defines a Kubernetes Service of the LoadBalancer type that exposes the Cassandra cluster on port 9042. The selector targets a single Cassandra pod by its StatefulSet pod name.

  3. Apply the service to your cluster.

         $ kubectl apply -n k8ssandra-operator -f service.yaml

  4. Wait for at least 5 minutes for the cluster load balancer resource to deploy, then view the Cassandra service.

         $ kubectl get svc/cassandra -n k8ssandra-operator

     Note the IP address in the EXTERNAL-IP column; you use it to access the cluster. Shortly after creation the column shows <pending>, as in the output below, until Vultr assigns a public IP address.

         NAME        TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
         cassandra   LoadBalancer   10.110.178.86   <pending>     9042:32313/TCP   3s
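
Because load balancer provisioning is asynchronous, you can watch the service until the public IP address is assigned instead of re-running the command manually:

    $ kubectl get svc cassandra -n k8ssandra-operator --watch

Press Ctrl+C to stop watching once the EXTERNAL-IP column shows an address.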

Test the Apache Cassandra Cluster

cqlsh is a command-line interface for connecting to a Cassandra cluster and executing CQL (Cassandra Query Language) statements. Follow the steps in this section to log in to the cluster and perform basic database operations such as creating, modifying, and querying data.

  1. Export your Cassandra service load balancer IP to the CASS_IP variable.

         $ CASS_IP=$(kubectl get svc cassandra -n k8ssandra-operator -o jsonpath="{.status.loadBalancer.ingress[*].ip}")

     View the variable value.

         $ echo $CASS_IP

  2. Export the cluster access username to the CASS_USERNAME variable.

         $ CASS_USERNAME=$(kubectl get secret demo-superuser -n k8ssandra-operator -o=jsonpath='{.data.username}' | base64 --decode)

     View the username value.

         $ echo $CASS_USERNAME

  3. Export the cluster password to the CASS_PASSWORD variable.

         $ CASS_PASSWORD=$(kubectl get secret demo-superuser -n k8ssandra-operator -o=jsonpath='{.data.password}' | base64 --decode)

     View the password value.

         $ echo $CASS_PASSWORD

  4. Using cqlsh, log in to the Cassandra cluster using your variable values.

         $ cqlsh -u $CASS_USERNAME -p $CASS_PASSWORD $CASS_IP 9042

  5. Create a new keyspace demo in the Cassandra database.

         > CREATE KEYSPACE demo WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};

  6. Create a new table users in the demo keyspace.

         > CREATE TABLE demo.users (id text PRIMARY KEY, name text, country text);

  7. Add data to the users table.

         > INSERT INTO demo.users (id, name, country) VALUES ('42', 'John Doe', 'UK');
         > INSERT INTO demo.users (id, name, country) VALUES ('43', 'Joe Smith', 'US');

  8. Query the table data to view the stored values.

         > SELECT * FROM demo.users;

     Output:

          id | country | name
         ----+---------+-----------
          43 | US      | Joe Smith
          42 | UK      | John Doe

         (2 rows)
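
The introduction mentioned Cassandra's tunable consistency; you can observe it from the same cqlsh session by raising the consistency level before a query. The statements below are a minimal illustration against the demo keyspace created above (CONSISTENCY is a cqlsh shell command rather than CQL):

    > CONSISTENCY QUORUM;
    > SELECT * FROM demo.users WHERE id = '42';

With a replication factor of 3, QUORUM requires two of the three replicas to acknowledge the read before it succeeds.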

Conclusion

You have deployed Apache Cassandra in a Vultr Kubernetes Engine (VKE) cluster using the open-source K8ssandra Operator. In addition, you set up Vultr Block Storage for data persistence and accessed the Cassandra cluster using the cqlsh CLI tool.
