Rebalancing data across different shards

August 8, 2014
Tags: java, Rebalancing, JVM

This is a continuation of the earlier post on how to split keys evenly across different buckets/machines.

In this post we look at how to rebalance the buckets when we want to increase the bucket size (the number of buckets), since this operation is a little trickier than a simple array copy.

I have outlined the code below.

There are 4 steps involved in rebalancing:

1. Set the rebalancing-in-progress flag to true so the get operation can look for a key in the new bucket when it is not available in the old one.
2. Set the bucket size value to the new bucket size. We maintain a variable m_oldBucketSize for the get operation fallback logic.
3. Loop through the iterator and copy all the elements that need to go to a new bucket.
4. Loop through the iterator and remove all those elements from the old bucket.

Steps 3 and 4 are done separately to provide consistent responses to get and put operations.
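
As a minimal sketch of the four steps, assuming integer keys placed by key modulo the bucket size and plain in-memory maps as buckets: the class RebalancingStore, the flag m_rebalanceInProgress, and the helpers bucketFor and bucketHolding are hypothetical names, and only m_oldBucketSize comes from the description above. The sketch is single-threaded and leaves out any locking a real store would need.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; only m_oldBucketSize is a name taken from the post.
class RebalancingStore {
    final List<Map<Integer, String>> m_buckets = new ArrayList<>();
    boolean m_rebalanceInProgress = false;
    int m_bucketSize;
    int m_oldBucketSize;

    RebalancingStore(int bucketSize) {
        m_bucketSize = bucketSize;
        m_oldBucketSize = bucketSize;
        for (int i = 0; i < bucketSize; i++) {
            m_buckets.add(new HashMap<>());
        }
    }

    // Assumed placement rule: key modulo the number of buckets.
    static int bucketFor(int key, int bucketSize) {
        return Math.floorMod(key, bucketSize);
    }

    void put(int key, String value) {
        m_buckets.get(bucketFor(key, m_bucketSize)).put(key, value);
    }

    void rebalance(int newBucketSize) {
        if (newBucketSize <= m_bucketSize) {
            throw new IllegalArgumentException("this sketch only grows the bucket count");
        }
        // Step 1: flag the rebalance so readers know to check both locations.
        m_rebalanceInProgress = true;

        // Step 2: remember the old size, allocate the extra buckets, switch to the new size.
        m_oldBucketSize = m_bucketSize;
        for (int i = m_oldBucketSize; i < newBucketSize; i++) {
            m_buckets.add(new HashMap<>());
        }
        m_bucketSize = newBucketSize;

        // Step 3: copy every entry whose bucket changes under the new size.
        for (int old = 0; old < m_oldBucketSize; old++) {
            for (Map.Entry<Integer, String> e : m_buckets.get(old).entrySet()) {
                int target = bucketFor(e.getKey(), m_bucketSize);
                if (target != old) {
                    m_buckets.get(target).put(e.getKey(), e.getValue());
                }
            }
        }

        // Step 4: only after all copies exist, remove the moved entries from their old buckets.
        for (int old = 0; old < m_oldBucketSize; old++) {
            final int current = old;
            m_buckets.get(old).keySet().removeIf(k -> bucketFor(k, m_bucketSize) != current);
        }

        // Rebalance finished: old and new sizes agree again.
        m_oldBucketSize = m_bucketSize;
        m_rebalanceInProgress = false;
    }

    // Inspection helper: index of the first bucket that holds the key, or -1.
    int bucketHolding(int key) {
        for (int i = 0; i < m_buckets.size(); i++) {
            if (m_buckets.get(i).containsKey(key)) {
                return i;
            }
        }
        return -1;
    }
}
```

Because every copy in step 3 completes before any removal in step 4, each key is present in at least one of its two possible buckets at every point of the rebalance.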

There is a slight change in the get operation compared to the previous post.
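
A rough sketch of what such a get could look like, following the lookup order in step 1 (old bucket first, then the new one). The class FallbackGet and all parameter names are hypothetical, and the key modulo bucket size placement is an assumption; oldBucketSize plays the role of m_oldBucketSize.

```java
import java.util.List;
import java.util.Map;

// Hypothetical helper illustrating the fallback lookup; not the post's original code.
class FallbackGet {
    static String get(List<Map<Integer, String>> buckets, int key,
                      boolean rebalanceInProgress, int oldBucketSize, int bucketSize) {
        if (!rebalanceInProgress) {
            // Normal case: there is exactly one bucket the key can live in.
            return buckets.get(Math.floorMod(key, bucketSize)).get(key);
        }
        // During a rebalance, try the bucket the key had under the old size first.
        String value = buckets.get(Math.floorMod(key, oldBucketSize)).get(key);
        if (value == null) {
            // Not there (already moved, or never stored): try the bucket under the new size.
            value = buckets.get(Math.floorMod(key, bucketSize)).get(key);
        }
        return value;
    }
}
```

With copies made before removals, a key that is missing from its old bucket during a rebalance has either already been copied to its new bucket or was never stored at all.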

Sample code that shows the rebalancing at work.

Output:


4 - key Belongs to bucket0
1 - key Belongs to bucket1
5 - key Belongs to bucket1
2 - key Belongs to bucket2
3 - key Belongs to bucket3
result after rebalance
5 - key Belongs to Bucket0
1 - key Belongs to bucket1
2 - key Belongs to bucket2
3 - key Belongs to bucket3
4 - key Belongs to Bucket4
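
The assignments in this output are consistent with a simple modulo placement, where the bucket index is the key modulo the number of buckets. Assuming that rule (the class BucketDemo and the method bucketFor are hypothetical names), the short program below prints the bucket of keys 1 to 5 for 4 and then 5 buckets. Only keys 4 and 5 change bucket, which is why they are the only entries that rebalancing has to move.

```java
// Hypothetical demo of the assumed placement rule: bucket = key mod bucketCount.
class BucketDemo {
    static int bucketFor(int key, int bucketCount) {
        return Math.floorMod(key, bucketCount);
    }

    public static void main(String[] args) {
        for (int bucketCount : new int[] {4, 5}) {
            System.out.println("with " + bucketCount + " buckets");
            for (int key = 1; key <= 5; key++) {
                System.out.println(key + " -> bucket" + bucketFor(key, bucketCount));
            }
        }
    }
}
```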
