Wednesday, 17 November 2021

Elasticsearch: Primary Shards

In the DevOps world, Elasticsearch is one of the key components and is widely used. To strengthen that knowledge, here is a short tip on one basic but very important aspect that you must consider when designing an index.

Statement: Have you ever wondered why, in Elasticsearch, the number of primary shards cannot be increased once an index has been created, unless you recreate the index?

Answer: One of the reasons behind this is explained below:

shard = hash(routing) % number_of_primary_shards

The routing value is an arbitrary string that defaults to the document’s _id, although it can be set to a custom value. The routing string is passed through a hashing function that generates a number, which is then divided by the number of primary shards in the index to return the remainder. The remainder is always in the range 0 to number_of_primary_shards - 1, and it gives the number of the shard where a particular document lives.
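To make the formula concrete, here is a minimal Python sketch of it. The function name shard_for and the sample document IDs are made up for illustration, and zlib.crc32 is only a deterministic stand-in for the hash; it is not the hashing function Elasticsearch uses internally.

```python
import zlib


def shard_for(routing: str, number_of_primary_shards: int) -> int:
    """Return the shard number for a routing value (illustrative only)."""
    # crc32 is a stand-in hash chosen because it is deterministic;
    # it is NOT the hash function Elasticsearch itself uses.
    hashed = zlib.crc32(routing.encode("utf-8"))
    # The remainder is always in the range 0 .. number_of_primary_shards - 1.
    return hashed % number_of_primary_shards


if __name__ == "__main__":
    # Hypothetical document IDs used as routing values.
    for doc_id in ["user-1", "user-2", "user-3"]:
        print(doc_id, "-> shard", shard_for(doc_id, 5))
```

The same routing value always maps to the same shard as long as the shard count stays the same, which is what lets a lookup go straight to the right shard.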

This explains why the number of primary shards can be set only when an index is created and cannot be changed afterwards: if the number of primary shards were altered later, the same routing values would resolve to different shard numbers, so all previous routing values would be invalid and existing documents would never be found.
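The effect of changing the shard count can be sketched in a few lines of Python. This is an illustration under assumptions, not Elasticsearch code: shard_for and count_moved are hypothetical helpers, the document IDs are synthetic, and zlib.crc32 again stands in for the real hash.

```python
import zlib


def shard_for(routing: str, number_of_primary_shards: int) -> int:
    # Stand-in hash (crc32), not the one Elasticsearch uses.
    return zlib.crc32(routing.encode("utf-8")) % number_of_primary_shards


def count_moved(doc_ids, old_shards: int, new_shards: int) -> int:
    """Count documents whose computed shard differs between two shard counts."""
    return sum(
        1
        for doc_id in doc_ids
        if shard_for(doc_id, old_shards) != shard_for(doc_id, new_shards)
    )


# Synthetic document IDs, indexed when the index had 5 primary shards.
doc_ids = [f"doc-{i}" for i in range(1000)]
moved = count_moved(doc_ids, old_shards=5, new_shards=6)
print(f"{moved} of {len(doc_ids)} documents would be looked up on the wrong shard")
```

Every document counted here was stored on the shard computed with the old count, but a lookup using the new count would go to a different shard and miss it.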

So be very careful while designing the index.

Enjoy Learning!!!
