Error in src/rdb_protocol/protocol.cc at line 737: Guarantee failed: [out->skey_version == resp->skey_version] #3976
@pilt sorry you had to run into this. Thank you for filing a bug report. A few questions:
We might ask for a copy of your data files later to track down the issue. If possible, it would be awesome if you could shut down the server(s) and copy the RethinkDB data directory for later inspection before applying the work-around below.
If the crash happens repeatedly, I suggest rebuilding all secondary indexes as a work-around. You can do that without downtime by following these steps:
r.db("d").table("t").indexCreate("i_NEW", r.db("d").table("t").indexStatus("i")(0)("function"))
r.db("d").table("t").indexRename("i_NEW", "i", {overwrite: true})
@mlucy Can you look into this please? |
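The rebuild steps suggested above can be sketched as a small helper that builds the queries for any database, table, and index name. This is only a sketch under assumptions: the function name `rebuildIndexQueries` is hypothetical, `r` is assumed to be the RethinkDB JavaScript driver module, and the intermediate `indexWait` step is an addition that does not appear in the snippet quoted in this thread.

```javascript
// Hypothetical helper: returns the ReQL queries for rebuilding one secondary
// index, in the order they should be run. Nothing is sent to the server here.
// `r` is assumed to be the RethinkDB JavaScript driver module.
function rebuildIndexQueries(r, dbName, tableName, indexName) {
  const table = r.db(dbName).table(tableName);
  const tmpName = indexName + "_NEW";
  return [
    // 1. Create a new index that reuses the definition of the existing one.
    table.indexCreate(tmpName, table.indexStatus(indexName)(0)("function")),
    // 2. Assumed extra step: block until the new index has finished building.
    table.indexWait(tmpName),
    // 3. Replace the old index with the freshly built one.
    table.indexRename(tmpName, indexName, { overwrite: true }),
  ];
}
```

With an open connection `conn`, each returned query would then be run in order, for example `for (const q of rebuildIndexQueries(r, "d", "t", "i")) await q.run(conn);`.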
Just ran into this myself, though it didn't happen as part of an upgrade. I was playing around creating and deleting the same secondary index via the Web UI. I had just deleted the secondary index, then re-created it with a slightly different definition. I went to query it (again via the Web UI) and it said the index hadn't finished construction yet. So I tried the query again and the server crashed. There were no other queries running on the server. Now whenever I try to access that index, the server crashes, so it is 100% reproducible with my data set when running a certain query. I assume the index is caught in some bad state? Maybe it wasn't fully deleted yet when I went to re-create it? Happy to send my data directory to someone if you like! Here's my backtrace if it is helpful:
|
Actually, maybe not 100% reproducible. Earlier I started the server back up (after the first crash), re-ran the query, and it blew up on me again. But after I made a sandwich, came back, and started the server for a second time, the secondary index appears to be working. Weird. I gzipped the directory in case you want it. |
Shortly before we got this error we added a new index (on 1.16) and tried to use it before it was finished creating.
Just to clarify, when I said earlier that we had rebuilt all indexes, I meant that we got warnings about indexes that needed to be rebuilt after we upgraded to 1.16. We had already done that when this error occurred. We have not had this error again since the RethinkDB daemon was restarted. It's a clustered setup running on AWS. |
Could just be a coincidence but my index also involved date ranges! |
Thanks @chrisvariety and @pilt ! |
Thanks for the report! It looks like this bug occurs when you try to read from a secondary index (sindex) that is ready on some shards but not on others. There's a fix in CR 2745 by @danielmewes. |
The fix is in next, 2.0.x, and 1.16.x. @danielmewes -- should we do a point release for this? |
Even though this should rarely happen during normal operation, I think we should do a 1.16 point release since it's still a nasty crash. @AtnNn can you start the necessary steps please? The tested work around until then is to make a sandwich to give secondary index construction time to complete ;-) |
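Until a fixed release is installed, one way to avoid querying a half-built index is to inspect the output of `indexStatus` before reading. A minimal sketch, assuming status objects with `index` and `ready` fields as returned by `indexStatus()`; the helper name and the sample data are hypothetical, and the thread does not say whether `ready` reflects per-shard readiness in the affected versions.

```javascript
// Hypothetical helper: given the array returned by
// r.table(...).indexStatus().run(conn), list the indexes still being built.
function indexesNotReady(statuses) {
  return statuses.filter((s) => !s.ready).map((s) => s.index);
}

// Illustrative sample data, not taken from this thread.
const sample = [
  { index: "created_at", ready: true },
  { index: "date_range", ready: false },
];
const pending = indexesNotReady(sample);
console.log(pending);
```

If the returned list is non-empty, the application could hold off on `getAll` or `between` queries against those indexes until they report ready.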
Fast work guys as usual! There's no sandwich emoji but here's a cookie: 🍪 |
Impressed by how quickly you handled this, many thanks from Narrative! |
@pilt @chrisvariety The newly released version, 1.16.3, contains a fix for this issue. |
you're the best ⭐ |
Awesome. :) ✨ 🍰 ✨ |
This happened after an upgrade from 1.15 to 1.16 (1.16.2+1~0trusty). We rebuilt indexes long before this happened if that's relevant.
Log file:
Backtrace with newlines:
Let me know if I can provide more information.