Guarantee failed: [pair.second] key for entry_t already exists #4968

Closed
mbroadst opened this issue Oct 19, 2015 · 12 comments

@mbroadst
Contributor

2015-10-19T22:08:15.034810106 430.876151s error: Error in ./src/concurrency/watchable_map.tcc at line 98:
2015-10-19T22:08:15.034938779 430.876279s error: Guarantee failed: [pair.second] key for entry_t already exists
2015-10-19T22:08:15.034965698 430.876305s error: Backtrace:
2015-10-19T22:08:15.242320003 431.083664s error: Mon Oct 19 22:08:15 2015
1 [0x9e8de0]: backtrace_t::backtrace_t() at ??:?
2 [0x9e9173]: format_backtrace(bool) at ??:?
3 [0x9c5f85]: report_fatal_error(char const*, int, char const*, ...) at ??:?
4 [0xa8452e]: watchable_map_var_t<raft_member_id_t, raft_business_card_t<table_raft_state_t> >::entry_t::entry_t(watchable_map_var_t<raft_member_id_t, raft_business_card_t<table_raft_state_t> >*, raft_member_id_t const&, raft_business_card_t<table_raft_state_t> const&) at ??:?
5 [0xa84a60]: watchable_map_keyed_var_t<peer_id_t, raft_member_id_t, raft_business_card_t<table_raft_state_t> >::set_key(peer_id_t const&, raft_member_id_t const&, raft_business_card_t<table_raft_state_t> const&) at ??:?
6 [0xa70911]: table_manager_t::on_table_directory_change(std::pair<peer_id_t, uuid_u> const&, table_manager_bcard_t const*) at ??:?
7 [0xa1458d]: watchable_map_t<std::pair<peer_id_t, uuid_u>, table_manager_bcard_t>::notify_change(std::pair<peer_id_t, uuid_u> const&, table_manager_bcard_t const*, rwi_lock_assertion_t::write_acq_t*) at ??:?
8 [0xa183eb]: watchable_map_var_t<std::pair<peer_id_t, uuid_u>, table_manager_bcard_t>::set_key_no_equals(std::pair<peer_id_t, uuid_u> const&, table_manager_bcard_t const&) at ??:?
9 [0xa18725]: directory_map_read_manager_t<uuid_u, table_manager_bcard_t>::do_update(peer_id_t, auto_drainer_t::lock_t, auto_drainer_t::lock_t, unsigned long, uuid_u const&, boost::optional<table_manager_bcard_t> const&) at ??:?
10 [0xa106fd]: callable_action_instance_t<boost::_bi::bind_t<void, boost::_mfi::mf6<void, directory_map_read_manager_t<uuid_u, table_manager_bcard_t>, peer_id_t, auto_drainer_t::lock_t, auto_drainer_t::lock_t, unsigned long, uuid_u const&, boost::optional<table_manager_bcard_t> const&>, boost::_bi::list7<boost::_bi::value<directory_map_read_manager_t<uuid_u, table_manager_bcard_t>*>, boost::_bi::value<peer_id_t>, boost::_bi::value<auto_drainer_t::lock_t>, boost::_bi::value<auto_drainer_t::lock_t>, boost::_bi::value<unsigned long>, boost::_bi::value<uuid_u>, boost::_bi::value<boost::optional<table_manager_bcard_t> > > > >::run_action() at ??:?
11 [0x70b228]: coro_t::run() at ??:?

2015-10-19T22:08:15.242645537 431.083986s error: Exiting.
@danielmewes danielmewes added this to the 2.1.x milestone Oct 19, 2015
@danielmewes
Member

Looks similar to #4272, but the backtrace is different. So this might be a separate issue.

(By the way, I suspect that this only happens if a server is disconnected and reconnects really quickly. Does that sound plausible, @mbroadst?)

@mbroadst
Contributor Author

Yes, that is very likely the case, though this happened in the middle of aggressive testing for another issue we're working out (not related to RethinkDB). It took a while to realize that RethinkDB had gone down on this machine, so the details around what caused the crash are fuzzy. I'm installing the debug symbols on the three machines to try to get you a more useful backtrace if it happens again.

@mbroadst
Contributor Author

@danielmewes isn't the solution to use the raft ID rather than the peer ID with the raft_directory here? The raft ID should change on each restart, whereas if you depend on the peer_id_t, that will stay the same (unless I misunderstand).

@danielmewes
Member

@mbroadst I think it's the opposite actually. The peer id changes on each restart (actually on each connection). I think the raft ID remains the same, though I'm not 100% sure on that one (@VeXocide might know for sure. He's also working on a fix for this now).
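
To illustrate the suspected failure mode, here is a minimal sketch. It is not RethinkDB's code: `peer_id`, `member_id`, and `member_directory` are hypothetical stand-ins, and it assumes (as suggested above, but not confirmed) that the raft member ID survives a reconnect while the peer ID does not. Under that assumption, a fast reconnect can try to register the same member ID under a new peer ID before the old entry is removed, which is the kind of duplicate insert the `[pair.second]` guarantee rejects.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-ins for peer_id_t and raft_member_id_t.
using peer_id = std::string;
using member_id = std::string;

// Toy directory keyed by member ID that refuses duplicate keys.
class member_directory {
public:
    // Returns false where the real code would fail its guarantee.
    bool add(const peer_id &peer, const member_id &member) {
        auto pair = by_member_.insert(std::make_pair(member, peer));
        return pair.second;  // false if the member ID is already present
    }

    // Drops every entry that was registered by the given peer.
    void remove_peer(const peer_id &peer) {
        for (auto it = by_member_.begin(); it != by_member_.end();) {
            if (it->second == peer) {
                it = by_member_.erase(it);
            } else {
                ++it;
            }
        }
    }

private:
    std::map<member_id, peer_id> by_member_;
};

// The server reconnects under a new peer ID before the old entry is
// cleaned up, so the second insert collides on the member ID.
bool reconnect_before_cleanup() {
    member_directory dir;
    dir.add("peer-A", "member-1");
    return dir.add("peer-B", "member-1");
}

// The old peer's entry is removed first, so the insert succeeds.
bool reconnect_after_cleanup() {
    member_directory dir;
    dir.add("peer-A", "member-1");
    dir.remove_peer("peer-A");
    return dir.add("peer-B", "member-1");
}
```

In this sketch only the ordering of removal and insertion differs between the two cases, which would be consistent with the crash appearing only on very quick reconnects.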

@mbroadst
Contributor Author

@danielmewes Okay I was going off this documentation. I'm speaking with @VeXocide at the moment actually, but I'm being a pest 😄

@danielmewes
Member

@mbroadst Ah, when it talks about "joining the Raft cluster" it actually means something different from connecting to other RethinkDB instances. A server joins the Raft cluster for a given table when it is made a replica for some of the shards of the table, and leaves it when you change the table configuration so that it's no longer a replica.

@VeXocide VeXocide self-assigned this Oct 20, 2015
@VeXocide
Member

In CR 3287 by @danielmewes.

@mbroadst
Contributor Author

👍

Any word on whether this will be backported into 2.1.x, or is the fix strictly for next?

@danielmewes
Member

@mbroadst this will be backported.

@VeXocide
Member

Merged into next and v2.1.x via commits 55d6e0f and 3bd98ec respectively.

@mbroadst
Contributor Author

sooo many beers

@VeXocide
Member

🍻

@danielmewes danielmewes modified the milestones: 2.1.x, 2.2 Nov 10, 2015