Steps to add a new datacenter to a cluster in Cassandra

In this blog post, I will explain the steps to add a new datacenter to a cluster in Cassandra.

Before adding a new node or datacenter (DC) to a Cassandra cluster, we first need to understand the reason behind the addition.

You might want to consider adding a new node if you have:

a. Reached data capacity:
– Your data has outgrown the node’s hardware capacity.

b. Reached traffic capacity:
– Your application needs a more rapid response with less latency.

c. A need for more operational headroom:
– You need more resources for node repair, compaction, and other resource-intensive operations.

Adding nodes: Best Practices

a. Adding a single node at a time will:
– Result in more data movement overall
– Have a gradual impact on cluster performance.

Follow the steps below to add a new datacenter to a cluster in Cassandra.

1. Install Cassandra on the new host. Make sure the version is the same as the one installed on the other nodes in the cluster, but do not start the Cassandra service on the new host yet.

2. Copy the following config files from an existing node in the cluster, or configure the parameters to match the settings of the other nodes:

———————
cassandra-env.sh
cassandra.yaml
cassandra-rackdc.properties
———————
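As a sketch, the copy step might look like this (the host name and paths are assumptions; package installs usually keep these files under /etc/cassandra, tarball installs under $CASSANDRA_HOME/conf):

```shell
# Hypothetical host name and paths; adjust to your install layout.
scp existing-node:/etc/cassandra/cassandra.yaml             /etc/cassandra/
scp existing-node:/etc/cassandra/cassandra-env.sh           /etc/cassandra/
scp existing-node:/etc/cassandra/cassandra-rackdc.properties /etc/cassandra/
```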

Modify the parameters below in cassandra.yaml if you are planning to reuse the existing config files on the new node:

a. seeds:

This should include nodes from the live DC, because new nodes have to stream data from them. The seeds list determines which nodes the new node contacts to learn about the cluster and establish the gossip process. Make sure that the seeds list includes the address of at least one node in the existing cluster.

b. cluster_name: Keep this parameter identical to the value on the nodes in the live DC.

c. auto_bootstrap:

This property is not listed in the default cassandra.yaml configuration file, but it might have been added and set to false by earlier operations. If it is not defined in cassandra.yaml, Cassandra uses true as the default value. For this operation, search for this property in the cassandra.yaml file; if it is present, set it to true or delete it.

Bootstrapping is the process by which a new node joins the cluster. During bootstrapping, the following steps take place:

– The joining node contacts a seed node.
– The seed node sends cluster information, including token ranges, to the joining node.
– Cluster nodes prepare to stream the necessary SSTables.
– Cluster nodes stream SSTables to the joining node.
– Existing cluster nodes continue to satisfy writes, but also forward writes to the joining node.
– When streaming is complete, the joining node changes to the normal state and handles read/write requests.
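While a node is bootstrapping, its progress can be watched from the command line; a sketch using standard nodetool subcommands:

```shell
# On the joining node: show streaming progress (files and bytes received).
nodetool netstats

# From any node: the new node should show as UJ (Up/Joining) while
# bootstrapping, then UN (Up/Normal) once it has finished.
nodetool status
```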

d. endpoint_snitch: Keep it the same as on the nodes in the live DC.

Apart from setting the parameters above, we also need to configure the following parameters on the new node:

listen_address: IP address of the new node
rpc_address: IP address of the new node
data_file_directories: data directory paths on the new node
saved_caches_directory: saved-caches directory on the new node
commitlog_directory: commit log directory on the new node
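Putting these settings together, a minimal sketch of the relevant cassandra.yaml fragment could look like this (the cluster name, addresses, and paths are placeholder assumptions; note that seeds lives under seed_provider in the file):

```yaml
cluster_name: 'MyCluster'              # must match the existing cluster
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "10.0.1.10,10.0.1.11"   # at least one live node from the existing DC
listen_address: 10.0.2.20              # this new node's IP
rpc_address: 10.0.2.20
endpoint_snitch: GossipingPropertyFileSnitch   # same snitch as the live DC
data_file_directories:
    - /var/lib/cassandra/data
saved_caches_directory: /var/lib/cassandra/saved_caches
commitlog_directory: /var/lib/cassandra/commitlog
```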

Set the parameters below for the new datacenter and rack in the cassandra-rackdc.properties file:

dc=<datacenter name>
rack=<rack name>
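For example, if the new datacenter were named DC2 and the rack RAC1 (both names are placeholders), the file would contain:

```
dc=DC2
rack=RAC1
```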

3. Once the above configuration is done, the next step is to start the Cassandra service on the new node and verify that it is fully bootstrapped and that all nodes are Up/Normal (UN) and not in any other state. We can verify this using the “nodetool status” command.
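In healthy output, every node reports UN in the first column; a trimmed, illustrative example (the addresses, loads, and rack names are made up):

```shell
nodetool status
# Datacenter: DC2
# Status=Up/Down
# |/ State=Normal/Leaving/Joining/Moving
# --  Address    Load     Tokens  Owns  Rack
# UN  10.0.2.20  1.2 GiB  256     ?     RAC1
```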

4. Check the Cassandra logs for any errors on the new node. Whenever bootstrapping fails, there can be two scenarios:

a. The bootstrapping node could not connect to the cluster:
– Examine the log file to understand what is going on.
– Change the config and try again.

b. The streaming portion fails:
– If restarting fails, try deleting the data directories and bootstrapping again.
– In the worst case, remove the node from the cluster and try again.

5. After all new nodes are running, we need to run “nodetool cleanup” on each of the previously existing nodes to remove the keys that no longer belong to those nodes. Wait for cleanup to complete on one node before running nodetool cleanup on the next node.

During “nodetool cleanup”, Cassandra reads every SSTable to make sure there are no tokens outside the range owned by that particular node. If an SSTable has no out-of-range tokens, the cleanup process simply copies it.
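The serial cleanup procedure above can be sketched as a simple loop (the hostnames and passwordless SSH access are assumptions):

```shell
# Run cleanup serially on each pre-existing node; hostnames are hypothetical.
for host in node1 node2 node3; do
  echo "Running cleanup on $host ..."
  ssh "$host" nodetool cleanup   # blocks until cleanup finishes on that node
done
```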

I hope you will find this post useful. 🙂
