How to add a new Vnode to an existing Datacenter in a Cassandra Cluster:

In this blog post, I explain the steps to add a new virtual node (Vnode) to an existing datacenter in a Cassandra cluster.

Follow the steps below to add a new Vnode to the existing datacenter:

1. Install Cassandra on the new host. Make sure the version is the same as the one installed on the other nodes in the cluster. Do not start the Cassandra service on the new host yet.

2. Copy the following config files from another node in the cluster, or configure them with the same settings as the other nodes in the cluster.

———————
cassandra-env.sh
cassandra.yaml
cassandra-rackdc.properties
———————

If you plan to reuse the config files of an existing node on the new node, modify the parameters below in “cassandra.yaml”:

a. seeds:

The seeds setting determines which nodes the new node contacts to learn about the cluster and establish the gossip process. It should include nodes from the live DC because the new node has to stream data from them. Make sure that the seeds list includes the address of at least one node in the existing cluster.

b. cluster_name: This value must be identical to the cluster name used by the nodes in the live DC.

c. auto_bootstrap:

This property is not listed in the default cassandra.yaml configuration file, but it might have been added and set to false by other operations. If it is not defined in cassandra.yaml, Cassandra uses true as a default value. For this operation, search for this property in the cassandra.yaml file. If it is present, set it to true or delete it.
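
To double-check this before starting the node, here is a minimal sketch in Python that reads the text of cassandra.yaml and reports the value Cassandra would end up using. The function name effective_auto_bootstrap is my own (hypothetical), and the simple line-based matching assumes that the property, if present, sits on a line of its own.

```python
import re


def effective_auto_bootstrap(yaml_text):
    """Return the auto_bootstrap value implied by the given cassandra.yaml text."""
    for line in yaml_text.splitlines():
        stripped = line.strip()
        # Ignore commented-out settings
        if stripped.startswith("#"):
            continue
        match = re.match(r"auto_bootstrap\s*:\s*(\S+)", stripped)
        if match:
            value = match.group(1).strip("'\"").lower()
            return value == "true"
    # Property not defined: Cassandra uses true as the default
    return True
```

Pass it the contents of the new node's cassandra.yaml. If it returns False, set the property to true or delete it before starting the node.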

Bootstrapping is the process of a new node joining the cluster. During bootstrapping, the following steps take place:

- The joining node contacts the seed node.
- The seed node sends cluster information, including token ranges, to the joining node.
- Cluster nodes prepare to stream the necessary SSTables.
- Cluster nodes stream SSTables to the joining node.
- Existing cluster nodes continue to satisfy writes, but also forward writes to the joining node.
- When streaming is complete, the joining node changes to the normal state and handles read/write requests.

Apart from the above parameters, we also need to configure the parameters below on the new node:

listen_address: IP address of the new node
rpc_address: IP address of the new node
data_file_directories: Data directories on the new node
saved_caches_directory: Saved caches directory on the new node
commitlog_directory: Commit log directory on the new node
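
The checks above can be sketched as a small Python helper. This is only an illustration under a few assumptions: the new node's cassandra.yaml has already been loaded into a dict (for example with a YAML parser), the seeds are listed as plain addresses without a port suffix, and the function names check_new_node_config and seed_addresses are hypothetical names of my own.

```python
def seed_addresses(config):
    """Extract the seed list from the seed_provider section of a parsed cassandra.yaml."""
    seeds = config["seed_provider"][0]["parameters"][0]["seeds"]
    return [s.strip() for s in seeds.split(",") if s.strip()]


def check_new_node_config(new_cfg, cluster_name, new_ip, existing_ips):
    """Return a list of problems found in the new node's config (empty if none)."""
    problems = []
    if new_cfg.get("cluster_name") != cluster_name:
        problems.append("cluster_name does not match the existing cluster")
    # auto_bootstrap defaults to true when it is not defined
    if new_cfg.get("auto_bootstrap", True) is not True:
        problems.append("auto_bootstrap is not true")
    for key in ("listen_address", "rpc_address"):
        if new_cfg.get(key) != new_ip:
            problems.append(key + " is not the IP address of the new node")
    # At least one seed must be a node of the existing cluster
    if not set(seed_addresses(new_cfg)) & set(existing_ips):
        problems.append("seeds list contains no node from the existing cluster")
    return problems
```

An empty list means these settings look consistent; it does not validate the directory settings, which depend on the layout of the new host.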

3. Once the above configuration is done, the next step is to start Cassandra on the new node and verify that it is fully bootstrapped and that all other nodes are up and normal (UN) and not in any other state. We can verify this using the “nodetool status” command.
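
If you want to automate this check, here is a rough Python sketch that scans the output of “nodetool status” for nodes that are not in the UN state. It assumes that each node line starts with the two-letter status/state code followed by the address, and that nodetool is on the PATH of the machine running the script. The function names are hypothetical.

```python
import re
import subprocess

# A node line starts with Up/Down + Normal/Leaving/Joining/Moving, then the address
NODE_LINE = re.compile(r"^([UD][NLJM])\s+(\S+)")


def nodes_not_un(status_output):
    """Return (state, address) pairs for every node whose state is not UN."""
    result = []
    for line in status_output.splitlines():
        match = NODE_LINE.match(line)
        if match and match.group(1) != "UN":
            result.append((match.group(1), match.group(2)))
    return result


def check_cluster():
    """Run nodetool status locally and return the nodes that are not UN."""
    completed = subprocess.run(
        ["nodetool", "status"], capture_output=True, text=True, check=True
    )
    return nodes_not_un(completed.stdout)
```

While the new node is still bootstrapping, it shows up as UJ (up, joining), so check_cluster() should return an empty list only after the bootstrap has finished.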

4. Check the Cassandra logs on the new node for any errors. When bootstrapping fails, there are two possible scenarios:

a. The bootstrapping node could not connect to the cluster:
- Examine the log file to understand what's going on.
- Change the config and try again.

b. The streaming portion fails:
- If restarting fails, try deleting the data directories and rebooting.
- In the worst case, remove the node from the cluster and try again.

5. After all new nodes are running, run “nodetool cleanup” on each of the previously existing nodes to remove the keys that no longer belong to those nodes. Wait for cleanup to complete on one node before running nodetool cleanup on the next node.

During “nodetool cleanup”, Cassandra reads every SSTable to make sure that no token falls outside the range owned by that particular node. If an SSTable has nothing out of range, the cleanup process just copies it.
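
The one-node-at-a-time rule can be sketched in Python as below. This is an illustration, not a tested procedure: it assumes nodetool can reach each node remotely with the -h option (remote JMX access enabled), otherwise you would run “nodetool cleanup” locally on each node instead. The function name cleanup_sequentially and the host list are hypothetical.

```python
import subprocess


def cleanup_sequentially(hosts, run=subprocess.run):
    """Run nodetool cleanup on each previously existing node, one after another."""
    for host in hosts:
        # The call blocks until cleanup on this host has finished.
        # check=True raises on failure, so later nodes are not touched.
        run(["nodetool", "-h", host, "cleanup"], check=True)
```

Pass only the addresses of the nodes that existed before the new node was added; the new node itself does not need a cleanup.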

I hope you find this post useful. 🙂
