New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
osd: fix temp object removal after upgrade #6976
Conversation
This was manually test. I'm not sure how to add a Hammer upgrade test which would leave stray temp objects in order to verify that they are cleaned up. |
I think this is being caught by the existing tests, see http://tracker.ceph.com/issues/13381 |
build_scrub_map_chunk( | ||
map, start, end, msg->deep, msg->seed, | ||
map, msg->start, msg->end, msg->deep, msg->seed, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure why we can remove this. The problem was with a primary OSD running hammer and a replica running new code. In that case, we need to specify bounds that fall within the PG (hammer would fill in some fields but not pool, which would either assert or skip items, i forget which).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With my change the start already has the pool number in it being passed from caller. The only affect my code has is that as you noted above I didn't set "scrubber.end." I couldn't see a reason for it, so I left it out. As a matter of fact I always saw end shown as MAX in the log even when I set the pool there too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't thinking that this code runs on the replicas, so now I see what you mean about interacting with older releases. I'll put this back.
I'll just check "q->hobj.get_pool() == -1" as the entire check and not use is_meta() because we don't want to break if is_meta() changes in the future. We are checking for an old release remnant at that point. I was being paranoid not to confuse the is_pgmeta() object (possibly an older one) from the pool == -1 temporary objects from prior releases. So the name.empty() check is probably not be needed because if pool == -1 it will never also be an is_pgmeta() object.
hobject_t::is_meta() looks like this:
|
Fixes: ceph#13862 Signed-off-by: David Zafman <dzafman@redhat.com>
Remove redundant dout() Signed-off-by: David Zafman <dzafman@redhat.com>
Is that last patch needed, now that the temp objects get cleaned up on startup? |
@liewegas I think the reason for the last commit has changed. If we set the pool, then during scrubbing the FileStore::collection_list() as an example, doesn't waste cycles reading the temp collection only to have the list pruned by PGBackend::objects_list_range(() called by PG::build_scrub_map_chunk(). FileStore::collection_list():
if (!c.is_temp() && !c.is_meta()) {
if (cmp_bitwise(start, sep) < 0) { // bitwise vs nibble doesn't matter here
dout(10) << __func__ << " first checking temp pool" << dendl;
coll_t temp = c.get_temp();
int r = collection_list(temp, start, end, sort_bitwise, max, ls, next); |
573ad2e
to
1727b40
Compare
@liewegas I changed the commit message on the last commit and removed the revert of PG::replica_scrub() change which set pool for scrubbing with hammer OSDs. |
Signed-off-by: David Zafman <dzafman@redhat.com>
Ok, I think this'll work! And it fixes your original issue (the second commit does i guess), right? |
@liewegas Yes, the second commit causes the old temp objects to be removed. |
osd: fix temp object removal after upgrade http://pulpito.ceph.com/sage-2016-01-20_17:01:32-upgrade:hammer-x-wip-sage-testing---basic-smithi/ Reviewed-by: Sage Weil <sage@redhat.com>
Various scrub fixes:
Fix 13862 by removing old Hammer temp objects.
Clean up a log message.
Skip temporary objects on both primary and replicas.