Delayed allocation causing partial allocation of shards on allocation awareness #14010
Labels: >bug, :Distributed/Allocation (all issues relating to the decision making around placing a shard, both master logic and on the nodes), help wanted, adoptme
A full repro is difficult to describe in words, so I recorded a video of it (linked below).
The test uses the latest 1.7.2 release.
In short: the cluster has 6 nodes and 1 index with 4 shards and 2 replicas (3 copies of each shard).
Each node has 2 awareness attributes set, updateDomain and faultDomain, both with forced awareness. 3 nodes are in one updateDomain and the other 3 are in the other updateDomain. These nodes are also in different faultDomains. The test sets delayed allocation to 10s so that allocation kicks in sooner.
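For readers without access to the linked node setup document, here is a rough sketch of what such a configuration could look like, written as Python dicts that mirror Elasticsearch 1.x setting keys. The domain names (`ud0`, `fd0`, and so on), the number of fault domains, and the helper function names are all assumptions made for illustration; the real values are in the node setup document linked below.

```python
# Hypothetical domain names; the actual values live in the linked node setup doc.
UPDATE_DOMAINS = ["ud0", "ud1"]
FAULT_DOMAINS = ["fd0", "fd1", "fd2"]  # assumed: one fault domain per node within an update domain


def node_settings(update_domain, fault_domain):
    """Per-node settings, equivalent to entries in elasticsearch.yml on ES 1.x."""
    return {
        # Custom node attributes used for awareness
        "node.updateDomain": update_domain,
        "node.faultDomain": fault_domain,
        # Make the allocator aware of both attributes
        "cluster.routing.allocation.awareness.attributes": "updateDomain,faultDomain",
        # Forced awareness: list every expected value of each attribute
        "cluster.routing.allocation.awareness.force.updateDomain.values": ",".join(UPDATE_DOMAINS),
        "cluster.routing.allocation.awareness.force.faultDomain.values": ",".join(FAULT_DOMAINS),
    }


def index_settings():
    """Index creation body: 4 shards, 2 replicas, 10s delayed allocation."""
    return {
        "settings": {
            "index.number_of_shards": 4,
            "index.number_of_replicas": 2,
            "index.unassigned.node_left.delayed_timeout": "10s",
        }
    }


def cluster_layout():
    """Six nodes: three per update domain, each in a different fault domain (assumed layout)."""
    return [node_settings(ud, fd) for ud in UPDATE_DOMAINS for fd in FAULT_DOMAINS]
```

Killing one updateDomain in this sketch corresponds to stopping the three nodes whose `node.updateDomain` is, say, `ud1`.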
When an updateDomain is killed (3 nodes gone), the cluster shows only partial allocation of shards. It stays that way until either a manual _cluster/reroute command is run (with no POST body) to prod it, or a command is issued that updates the cluster state (e.g. creating an index). Once either of these happens, the remaining shards are immediately allocated successfully according to the awareness settings, even though the manual reroute itself doesn't change anything.
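The manual prod described above can be sketched as follows. This only builds the empty-body `POST /_cluster/reroute` request; the base URL and the function names are assumptions for illustration, and `send` needs a reachable cluster to do anything useful.

```python
import urllib.request


def build_reroute_request(base_url="http://localhost:9200"):
    """Build a POST to _cluster/reroute with an empty body (no reroute commands)."""
    url = base_url.rstrip("/") + "/_cluster/reroute"
    return urllib.request.Request(url, data=b"", method="POST")


def send(request, timeout=10):
    """Send the request and return (HTTP status, response body). Requires a running cluster."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8")
```

Against a live cluster, `send(build_reroute_request())` would issue the no-op reroute that, in the repro, causes the remaining shards to be allocated.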
If delayed allocation is turned off entirely, then everything works fine and there is no need to manually prod it to complete the rest of the allocation.
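The report does not say exactly how delayed allocation was turned off. One documented way is to set `index.unassigned.node_left.delayed_timeout` to `0` through the index update settings API, which the sketch below assumes. The base URL, the `_all` target, and the function name are illustrative choices, not taken from the report.

```python
import json
import urllib.request


def build_disable_delay_request(index="_all", base_url="http://localhost:9200"):
    """Build a PUT to <index>/_settings that sets the delayed allocation timeout to 0."""
    body = json.dumps(
        {"settings": {"index.unassigned.node_left.delayed_timeout": "0"}}
    ).encode("utf-8")
    url = f"{base_url.rstrip('/')}/{index}/_settings"
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )
```

The request is only constructed here; sending it with `urllib.request.urlopen` requires a running cluster.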
Note that with delayed allocation on, it sometimes does the right thing, but if you retest a few times, stopping and restarting the 3 nodes, you will see that it doesn't do so consistently.
Repro video:
https://drive.google.com/file/d/0B1rxJ0dAZbQvRUE0SlVxT2pOZFE/view?usp=sharing
Node setup:
https://docs.google.com/document/d/1J5FPSvIA5U41Ou1BNpEN9P7q2L8e7KMxM69IG4dGMkk/edit?usp=sharing