
Update settings: Allow to dynamically update thread pool settings #2509

Closed
imotov opened this issue Dec 27, 2012 · 10 comments

Comments

@imotov
Contributor

imotov commented Dec 27, 2012

Allow thread pool settings to be updated dynamically. The settings can be updated using the Cluster Update Settings API; both the pool type and pool parameters can be changed dynamically. Minor changes, such as the number of threads or the queue size, are applied to the existing thread pool executor. For major changes, such as a change of thread pool type or queue type, Elasticsearch replaces the old executor with a new one: it creates the new thread pool first and starts executing all new tasks on it, while tasks already running in the old pool are allowed to finish before the old pool is stopped.
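The replacement mechanism described above can be sketched in Python; SwappablePool is a hypothetical illustration of the pattern, not Elasticsearch source:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sketch of the swap: stand up the new pool first, route all
# new tasks to it, and retire the old pool without cancelling its tasks.
class SwappablePool:
    def __init__(self, max_workers):
        self._executor = ThreadPoolExecutor(max_workers=max_workers)

    def submit(self, fn, *args):
        # New tasks always go to whichever executor is current.
        return self._executor.submit(fn, *args)

    def replace(self, max_workers):
        # Create the replacement before retiring the old executor.
        old, self._executor = self._executor, ThreadPoolExecutor(max_workers=max_workers)
        # Stop accepting new work on the old pool; in-flight tasks finish.
        old.shutdown(wait=False)

pool = SwappablePool(2)
f1 = pool.submit(lambda: "old-pool task")
pool.replace(8)                      # swap executors while f1 may still be running
f2 = pool.submit(lambda: "new-pool task")
print(f1.result(), "/", f2.result())
```

Here `shutdown(wait=False)` mirrors the description above: the old pool stops taking new work but already-submitted tasks are allowed to complete.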

@matthuhiggins

I frequently find that the queue_size is not picked up when using a live update. In my particular case it is for bulk.queue_size. Do all writes to the cluster need to completely stop to give ES time to switch over?

@imotov
Contributor Author

imotov commented Jan 25, 2015

@matthuhiggins could you share a few more details here? Which version of Elasticsearch are you using? How exactly are you trying to change the bulk thread pool queue size? How do you verify that your changes were not picked up?

@matthuhiggins

I was using Elasticsearch 1.4.0. I used the index update settings API to increase the bulk queue size capacity to 100. For up to 30 minutes afterwards, I continued to see "rejected execution (queue capacity 50) ..." errors for bulk requests. The change was eventually picked up. (I unfortunately then started seeing queue capacity 100 errors. I set the queue capacity to -1 and restarted the cluster because I continued to see errors with a capacity of 100.)

I tried both transient and persistent settings. The reason I think there is a bug is that I saw queue capacity 50 errors for up to 30 minutes after the change. Without any further changes on my part, the setting was finally picked up after all index writes were stopped.

@imotov
Contributor Author

imotov commented Jan 28, 2015

@matthuhiggins when the queue size is updated, Elasticsearch retires the old executor and replaces it with a new one. So this lingering error is only possible if some component is holding on to the old instance of the executor and continues to reuse it instead of asking for a new one. I checked all the places where we use the BULK thread pool and don't really see how anything could cling to the old instance for 30 minutes. Do you still have the log file with this error? I would love to see a complete stack trace to figure out what is hanging on to this thread pool for so long.
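The failure mode being hypothesized here, a component caching a direct reference to the retired executor, can be illustrated with a small Python sketch (BoundedPool and the registry are hypothetical, not Elasticsearch code):

```python
import queue

# Hypothetical sketch: a component that caches the old executor keeps
# hitting the old queue capacity even after the registry swaps in a
# larger pool.
class BoundedPool:
    def __init__(self, capacity):
        self.capacity = capacity
        self.q = queue.Queue(maxsize=capacity)

    def submit(self, task):
        try:
            self.q.put_nowait(task)
        except queue.Full:
            raise RuntimeError(f"rejected execution (queue capacity {self.capacity})")

registry = {"bulk": BoundedPool(capacity=1)}
stale = registry["bulk"]                     # component keeps a direct reference

registry["bulk"] = BoundedPool(capacity=100) # settings update retires the old pool

stale.submit("doc-1")
try:
    stale.submit("doc-2")                    # still rejected at the OLD capacity
except RuntimeError as e:
    print(e)                                 # rejected execution (queue capacity 1)
```

A component that instead looked up `registry["bulk"]` on every request would pick up the new capacity immediately, which is why the discussion turns to who might be holding the old instance.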

@matthuhiggins

I do not, and I imagine it will be hard to retrace. The cluster returns the new settings immediately after the update, and my only guess is that something held a reference to the older executor. (In my case, we had a 5-node cluster with 80 processes writing to it.)

@imotov
Contributor Author

imotov commented Jan 28, 2015

@matthuhiggins what did you use to bulk index data into elasticsearch?

@matthuhiggins

Resque workers reading from a Postgres table and sending that raw data to Elasticsearch. (Each worker was told to work on a range of the data.) Were you curious, or will this help you debug?

@imotov
Contributor Author

imotov commented Jan 28, 2015

@matthuhiggins I was just trying to rule out a remote possibility that you used a plugin that was hoarding old bulk executors.

@matthuhiggins

Nope. It's just simple bulk requests over http. I imagine that the scenario could be reproduced with a multi-node cluster and sufficient load - if not, no worries.

@djdenv

djdenv commented Feb 25, 2015

I had to do this as well with the bulk queue_size. In my scenario, I was using Elasticsearch via the Grails plugin. The plugin binds to the domain model and queues up index updates asynchronously. The application was performing an immense amount of database activity (importing a very large CSV file into the database) and quickly surpassed the default queue_size of 50 (I saw the same error as @matthuhiggins). Using the Java API, I was able to update the queue size like so:

    ThreadPool threadPool = elasticSearchAdminService.elasticSearchHelper.elasticSearchClient.admin().cluster().threadPool()
    threadPool.updateSettings(ImmutableSettings.settingsBuilder().put("threadpool.bulk.queue_size", "1000").build())

Note that "elasticSearchAdminService" is the injected Grails service provided by the plugin.
Hope this is helpful to someone...
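For anyone making this change over HTTP rather than through the Java API, the update goes through the Cluster Update Settings API; a minimal sketch of the request body follows (the 1.x-era setting name is the one used in this thread, and a default local node at http://localhost:9200 is assumed):

```python
import json

# Sketch of the body for PUT /_cluster/settings on a 1.x cluster;
# "transient" settings are lost on full cluster restart, "persistent"
# settings survive it.
body = {
    "transient": {
        "threadpool.bulk.queue_size": 1000
    }
}
print(json.dumps(body, indent=2))
```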
