Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

helgrind: fix real (and imaginary) race conditions #7208

Merged
merged 9 commits into from Jan 14, 2016

Conversation

dillaman
Copy link

No description provided.

@liewegas
Copy link
Member

Does this mean we can/should rip out lockdep? I'm pretty sure it's broken anyway.

@dillaman
Copy link
Author

lockdep is orders of magnitude faster than helgrind. All librbd unit tests run with lockdep enabled -- helgrind didn't catch any librbd bugs not already previously fixed by lockdep reports.

@@ -67,6 +70,7 @@ SimpleMessenger::~SimpleMessenger()
assert(!did_bind); // either we didn't bind or we shut down the Accepter
assert(rank_pipe.empty()); // we don't have any running Pipes.
assert(!reaper_started); // the reaper thread is stopped
ceph_spin_destroy(&global_seq_lock);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you help to do this in async messenger too?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These were exposed running librbd test cases -- it would probably open a whole new set of issues to helgrind ceph-mon/ceph-osd (with or without running w/ async messenger).

@@ -1342,6 +1344,7 @@ void Pipe::fault(bool onread)
return;
}

recv_reset();
shutdown_socket();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it safe to leave out recv_reset() for the other two callsites of shutdown_socket()? seems like this could use an explanation in its own commit

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other two sites were for shutdown cases -- so no worries if you leave state behind. I can move to a new commit.

Jason Dillaman added 9 commits January 13, 2016 09:54
Provides necessary support for annotating helgrind
false-positives.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Fixes: ceph#14163
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Derived classes should be responsible for registration to avoid
accessing a partially constructed object.

Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Jason Dillaman <dillaman@redhat.com>
jdurgin added a commit that referenced this pull request Jan 14, 2016
helgrind: fix real (and imaginary) race conditions

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
@jdurgin jdurgin merged commit 0108b9a into ceph:jewel Jan 14, 2016
@jdurgin
Copy link
Member

jdurgin commented Jan 14, 2016

Added issues for remaining potential bugs discovered:
http://tracker.ceph.com/issues/14365
http://tracker.ceph.com/issues/14366
http://tracker.ceph.com/issues/14367

@dillaman dillaman deleted the wip-14163-jewel branch January 14, 2016 01:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants