Update settings: Allow to dynamically update thread pool settings #2509
I frequently find that the queue_size is not picked up when using a live update. In my particular case it is bulk.queue_size. Do all writes to the cluster need to stop completely to give ES time to switch over?
@matthuhiggins could you share a few more details here? Which version of elasticsearch are you using? How exactly do you try to change the bulk thread pool queue size? And how do you verify that your changes were not picked up?
I was using elasticsearch 1.4.0. I used the update settings API to increase the bulk queue size capacity to 100. For up to 30 minutes after, I continued to see rejection errors. I tried both transient and persistent settings. The reason I think there is a bug is that the errors persisted even though the update had been accepted.
@matthuhiggins when the queue size is updated, elasticsearch retires the old executor and replaces it with a new one. So this lingering error is only possible if some component is holding on to the old instance of the executor and continues to reuse it instead of asking for a new one. I checked all the places where we use the BULK thread pool and don't really see how anything could cling to the old instance for 30 minutes. Do you still have the log file with this error? I would love to see a complete stack trace to figure out what is hanging on to this thread pool for so long.
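The retire-and-replace behavior described above can be sketched with plain `java.util.concurrent` (a hypothetical illustration only, not Elasticsearch's actual code; its internal executor classes differ):

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class ThreadPoolSwap {

    // Hypothetical sketch: build a replacement executor with the new queue
    // capacity, then shut down the old one so in-flight tasks can drain.
    static ThreadPoolExecutor resize(ThreadPoolExecutor old, int newQueueSize) {
        ThreadPoolExecutor fresh = new ThreadPoolExecutor(
                old.getCorePoolSize(),
                old.getMaximumPoolSize(),
                old.getKeepAliveTime(TimeUnit.MILLISECONDS),
                TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>(newQueueSize));
        old.shutdown(); // queued and running tasks finish; new submissions are rejected
        return fresh;
    }

    public static void main(String[] args) throws Exception {
        ThreadPoolExecutor oldPool = new ThreadPoolExecutor(
                1, 1, 0, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>(50));
        ThreadPoolExecutor newPool = resize(oldPool, 100);

        // The new pool accepts work with the larger queue capacity...
        System.out.println(newPool.submit(() -> 42).get());

        // ...but any component still holding the old reference gets rejections,
        // which would explain lingering errors after a settings update.
        try {
            oldPool.submit(() -> 0);
        } catch (RejectedExecutionException e) {
            System.out.println("old pool rejects after shutdown");
        }
        newPool.shutdown();
    }
}
```

This mirrors the diagnosis in the comment above: once the swap happens, only a component that cached the old executor reference would keep seeing rejections.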
I do not, and I imagine it will be hard to retrace. The cluster returned the new settings immediately after the update, and my only guess is that something held a reference to the old executor. (In my case, we had a 5-node cluster with 80 processes writing to it.)
@matthuhiggins what did you use to bulk index data into elasticsearch?
Resque workers reading from a Postgres table and sending that raw data to elasticsearch. (Each worker was told to work on a range of the data.) Were you just curious, or will this help you debug?
@matthuhiggins I was just trying to rule out a remote possibility that you used a plugin that was hoarding old bulk executors.
Nope. It's just simple bulk requests over HTTP. I imagine the scenario could be reproduced with a multi-node cluster and sufficient load; if not, no worries.
I had to do this as well with the bulk queue_size. In my scenario, I was using elasticsearch via the Grails plugin. The plugin binds to the domain model and queues up index updates asynchronously. The application was performing an immense amount of database activity (importing a very large CSV file into the database) and quickly surpassed the default queue_size of 50 (I saw the same error as @matthuhiggins). Using the Java API, I was able to update the queue size like so:

```groovy
ThreadPool threadPool = elasticSearchAdminService.elasticSearchHelper.elasticSearchClient.admin().cluster().threadPool()
```

Note that `elasticSearchAdminService` is the injected Grails service provided by the plugin.
Allow to dynamically update thread pool settings. The settings can be updated using the Cluster Update Settings API. Both the pool type and the pool parameters can be changed dynamically. Minor changes, such as the number of threads or the queue size, are made to the existing thread pool executor. To apply major changes, such as a change of thread pool type or queue type, Elasticsearch replaces the old executor with a new one. When this happens, Elasticsearch creates the new thread pool first and starts executing all new tasks on it, while tasks already running in the old pool are allowed to finish before the old thread pool is stopped.
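For a minor change like the queue size discussed in this thread, the request would look roughly like this on a 1.x cluster (a sketch; the `threadpool.bulk.queue_size` key and localhost address are assumptions based on the 1.x setting names, so check them against your version's docs):

```shell
curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
  "transient": {
    "threadpool.bulk.queue_size": 100
  }
}'
```

Using `"persistent"` instead of `"transient"` would make the change survive a full cluster restart.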