mon: fix reuse of osd ids (clear osd info on osd deletion) #6900

Merged 2 commits on Dec 18, 2015

liewegas
Member

When an OSD id is removed via ceph osd rm, it will be reused by the next
ceph osd create command. Verify that an OSD reusing such an id
successfully comes up.

http://tracker.ceph.com/issues/13988 Refs: ceph#13988

Signed-off-by: Loic Dachary <loic@dachary.org>
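
The id reuse described above can be sketched with a toy model. This is only an illustration, not Ceph code: `ToyOsdMap` and its methods are hypothetical names, and the "smallest free id" allocation rule is an assumption about how `ceph osd create` picks an id.

```python
class ToyOsdMap:
    """Hypothetical stand-in for the OSD map; it only tracks which ids exist."""

    def __init__(self):
        self.existing = set()

    def create(self):
        # Assumption: hand out the smallest id that is not currently in use.
        osd_id = 0
        while osd_id in self.existing:
            osd_id += 1
        self.existing.add(osd_id)
        return osd_id

    def rm(self, osd_id):
        # Removing an id makes it available to the next create() call.
        self.existing.discard(osd_id)


osdmap = ToyOsdMap()
ids = [osdmap.create() for _ in range(3)]  # allocates ids 0, 1, 2
osdmap.rm(1)                               # like `ceph osd rm 1`
reused = osdmap.create()                   # id 1 is handed out again
```

The test in this pull request checks the step the sketch leaves out: that the OSD which receives the reused id actually comes up.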
@ghost ghost added bug-fix core labels Dec 11, 2015
@ghost ghost self-assigned this Dec 11, 2015
@liewegas
Member Author

pg_temp is not osd-based, and the mon cleans it up. yes on primary affinity. same for state. repushing!

If we destroy an OSD in the map, clear not just the uuid but also
all the metadata about it.

Specifically, we care about up_from, which can prevent a new
OSD from booting if it starts with a map prior to the deletion
when it sends its boot message.  The osd epoch may be 0, and
if the latest osd epoch is also small, the osd decides it is "close
enough" to the latest epoch and sends the boot message.  In
practice this problem wouldn't surface on any cluster that isn't
brand new.
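
A rough sketch of why a stale up_from matters, under stated assumptions. `ToyMon`, `OsdInfo`, and `accepts_boot` are hypothetical names, and the check "ignore a boot message sent from a map older than up_from" is a simplification of whatever the mon really does, not a copy of it.

```python
from dataclasses import dataclass, field


@dataclass
class OsdInfo:
    # Only the fields needed for the illustration.
    up_from: int = 0
    up_thru: int = 0
    down_at: int = 0


@dataclass
class ToyMon:
    epoch: int = 1
    uuid: dict = field(default_factory=dict)
    info: dict = field(default_factory=dict)

    def mark_up(self, osd_id, uuid):
        # Each change to the map bumps the epoch; record when the OSD came up.
        self.epoch += 1
        self.uuid[osd_id] = uuid
        self.info.setdefault(osd_id, OsdInfo()).up_from = self.epoch

    def rm(self, osd_id, clear_info):
        self.epoch += 1
        self.uuid.pop(osd_id, None)
        if clear_info:
            # The behavior this commit describes: drop all metadata, not just the uuid.
            self.info.pop(osd_id, None)

    def accepts_boot(self, osd_id, boot_epoch):
        # Assumed check: a boot sent from a map older than up_from is ignored.
        info = self.info.get(osd_id, OsdInfo())
        return boot_epoch >= info.up_from


def boot_after_reuse(clear_info):
    mon = ToyMon()
    mon.mark_up(0, "old-uuid")  # up_from becomes 2
    mon.rm(0, clear_info)       # map epoch becomes 3
    # A freshly created OSD reuses id 0 and boots with osd epoch 0.
    return mon.accepts_boot(0, boot_epoch=0)
```

In this model `boot_after_reuse(False)` is rejected because the old up_from of 2 is still recorded for id 0, while `boot_after_reuse(True)` is accepted because nothing about the old OSD remains.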

Note that this changes the result of applying an incremental.
As such, it will cause lots of old OSDs to request full maps
from the mon, spiking load during an upgrade.  This is as it
should be.
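
To illustrate why the result of applying an incremental changes, here is a toy comparison of old and new behavior. Everything in it is an assumption made for the example: the dict layout of the map, the `apply_rm` helper, and the idea that peers detect divergence by comparing a checksum of the encoded full map.

```python
import json
import zlib


def apply_rm(full_map, osd_id, clear_info):
    """Apply a hypothetical 'remove osd' incremental to a toy full map."""
    new_map = {
        "epoch": full_map["epoch"] + 1,
        "uuid": dict(full_map["uuid"]),
        "info": dict(full_map["info"]),
    }
    new_map["uuid"].pop(osd_id, None)
    if clear_info:
        # New behavior: the per-OSD info is cleared along with the uuid.
        new_map["info"].pop(osd_id, None)
    return new_map


def checksum(full_map):
    # Assumption: divergence is noticed by checksumming the encoded map.
    return zlib.crc32(json.dumps(full_map, sort_keys=True).encode())


before = {"epoch": 5, "uuid": {"0": "old-uuid"}, "info": {"0": {"up_from": 2}}}
mon_map = apply_rm(before, "0", clear_info=True)   # upgraded mon
osd_map = apply_rm(before, "0", clear_info=False)  # OSD still running old code
needs_full_map = checksum(mon_map) != checksum(osd_map)
```

Both sides reach the same epoch from the same incremental but end up with different maps, so in this model the old OSD has to fall back to fetching the full map.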

Fixes: ceph#13988
Signed-off-by: Sage Weil <sage@redhat.com>
@liewegas liewegas force-pushed the wip-13988 branch 2 times, most recently from 5ae5e0d to 4e28f9e Compare December 11, 2015 16:54
@ghost ghost added the needs-qa label Dec 11, 2015
@ghost

ghost commented Dec 11, 2015

Reviewed-by: Loic Dachary <ldachary@redhat.com>

@ghost

ghost commented Dec 11, 2015

Running the ceph-disk suite which hits the bug every time on master

teuthology-openstack --verbose --key-filename ~/Downloads/myself --key-name myself --teuthology-git-url http://github.com/dachary/teuthology --teuthology-branch wip-suite --ceph-qa-suite-git-url http://github.com/ceph/ceph-qa-suite --suite-branch master --ceph-git-url http://github.com/liewegas/ceph --ceph wip-13988 --suite ceph-disk --filter ubuntu_14.04

@ghost

ghost commented Dec 11, 2015

@liewegas could you update the commit message regarding the upgrade? Since it has no influence on anything user-visible, I assume the merge commit can be empty and need not be highlighted in the release notes.

@ghost

ghost commented Dec 11, 2015

@liewegas the bot failure is from http://tracker.ceph.com/issues/13986 and should be ignored.

liewegas added a commit that referenced this pull request Dec 18, 2015
mon: fix reuse of osd ids (clear osd info on osd deletion)

Reviewed-by: Loic Dachary <ldachary@redhat.com>
@liewegas liewegas merged commit 3aa9eb3 into ceph:master Dec 18, 2015
@liewegas liewegas deleted the wip-13988 branch December 18, 2015 20:07
@ghost ghost changed the title fix reuse of osd ids mon: fix reuse of osd ids (clear osd info on osd deletion) Feb 10, 2016