mon: mon crashes when "ceph osd tree 85 --format json" #4936

Merged
merged 8 commits into from Jul 10, 2015

Conversation

tchaikov
Contributor

@tchaikov tchaikov added this to the hammer milestone Jun 12, 2015
@ghost ghost assigned theanalyst Jun 12, 2015
@ghost ghost added bug-fix core labels Jun 12, 2015
@ktdreyer
Member

I think the make-check bot failure above is spurious. Can you please re-push so it will trigger a new build attempt?

ktdreyer referenced this pull request Jul 2, 2015
* as a side effect, this change silences
  http://tracker.ceph.com/issues/11576

Fixes: #11576
Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit e7b196a)
* check for dangling bucket name or type names referenced by the
  buckets/items in the crush map.
* also check for the references from Item(0, 0, 0), which does not
  necessarily exist in the crush map under test. the rationale
  behind this is: "ceph osd tree" also prints stray OSDs
  whose id is greater than or equal to 0, so it is useful to
  check whether the crush map offers the type name indexed by "0"
  (the name of OSDs is always "OSD.{id}", so we don't need to
  look up the name of an OSD item in the crushmap).

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit b75384d)
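
The check described in the commit message above can be sketched roughly as follows. This is only an illustration on a toy data model, not the `CrushTester` code from the commit: the function name `find_dangling_references` and the three dictionaries are hypothetical stand-ins for the real crush map structures.

```python
def find_dangling_references(type_names, item_names, buckets):
    """Return a sorted list of human-readable problems found in a toy CRUSH map.

    type_names: dict mapping type id -> type name
    item_names: dict mapping bucket id (negative) -> bucket name
    buckets:    dict mapping bucket id -> (type id, list of child item ids)
    All three structures are hypothetical simplifications.
    """
    problems = set()

    # Type 0 is checked unconditionally: stray OSDs are printed with type 0
    # even when no bucket in the map references them.
    if 0 not in type_names:
        problems.add("type 0 has no name")

    for bucket_id, (type_id, children) in buckets.items():
        if type_id not in type_names:
            problems.add(f"type {type_id} has no name")
        for item_id in [bucket_id, *children]:
            # Non-negative ids are OSDs; their names are derived from the id,
            # so only buckets (negative ids) need a name entry.
            if item_id < 0 and item_id not in item_names:
                problems.add(f"bucket {item_id} has no name")

    return sorted(problems)
```

An empty result means every referenced bucket name and type name resolves; any entry in the list is a dangling reference of the kind that would make the tree dump fail.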
* so one is able to verify that "ceph osd tree" won't choke on the
  new crush map because of dangling name/type references

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit d6b46d4)
* the "osd tree dump" command enumerates all buckets/osds found in either the
  crush map or the osd map, but a newly set crushmap is not validated for
  dangling references. so when a new crush map is sent to the monitor, we
  need to check whether any item in it references an unknown type/name, and
  reject the map if so.

Fixes: #11680
Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit a955f36)
Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit e640d89)
add an argument "max_id" for "--check-names" to check if any item
has an id greater than or equal to the given "max_id" in the crush map.

Note: edited since we do not have the fix introduced in 46103b2 in
      hammer.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit d0658dd)
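
As a minimal sketch of the "max_id" comparison described in the commit message above (assuming item ids are plain integers; the function name `items_at_or_above_max_id` is hypothetical and is not part of crushtool):

```python
def items_at_or_above_max_id(item_ids, max_id):
    """Return, in ascending order, every item id that is >= max_id.

    Bucket ids are negative in a crush map, so for a positive max_id
    only OSD ids can be reported.
    """
    return sorted(i for i in item_ids if i >= max_id)
```

A non-empty result would mean the map contains an item whose id is out of range for the given limit.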
Fixes: #11680
Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 22e6bd6)
* because "--check" also checks for the max_id

Note: edited since we do not have the fix introduced in 46103b2 in
      hammer.

Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit 9381d53)
so we don't need to call CrushTester::check_name_maps() in OSDMonitor.cc
anymore.

Fixes: #11680
Signed-off-by: Kefu Chai <kchai@redhat.com>
(cherry picked from commit c6e6348)
@tchaikov
Contributor Author

@ktdreyer @theanalyst I removed the commit included in #5122 from this PR and re-pushed.

@tchaikov
Contributor Author

This PR was tested per http://tracker.ceph.com/issues/11990 before the commit from #5122 was removed, so it is good to merge along with #5122.

tchaikov added a commit that referenced this pull request Jul 10, 2015
mon crashes when "ceph osd tree 85 --format json"

Reviewed-by: Kefu Chai <kchai@redhat.com>
@tchaikov tchaikov merged commit 7f1fb57 into hammer Jul 10, 2015
@ghost

ghost commented Jul 10, 2015

It looks like the bot failure is an actual problem. See also http://gitbuilder.sepia.ceph.com/gitbuilder-ceph-tarball-trusty-i386-basic/log.cgi?log=552772025cb8d5f51ffb3a069d1bd93bc73f1123. I seem to remember that a pull request fixed racy code in ceph-helpers or something similar. I'll dig into this.

tchaikov pushed a commit to tchaikov/ceph that referenced this pull request Jul 11, 2015
* Back in Hammer, the osd-crush.sh individual tests did not run the
  monitor themselves; the run() function took care of it. An attempt to
  run another mon fails with:

  error: IO lock testdir/osd-crush/a/store.db/LOCK: Resource temporarily
  unavailable

  This problem was introduced by cc1cc03
  from ceph#4936
* replace test/mon/mon-test-helpers.sh with test/ceph-helpers.sh as
  we need run_osd() in this newly added test

http://tracker.ceph.com/issues/11975 Refs: ceph#11975

Signed-off-by: Loic Dachary <ldachary@redhat.com>
tchaikov pushed a commit to tchaikov/ceph that referenced this pull request Jul 11, 2015
* Back in Hammer, the osd-crush.sh individual tests did not run the
  monitor themselves; the run() function took care of it. An attempt to
  run another mon fails with:

  error: IO lock testdir/osd-crush/a/store.db/LOCK: Resource temporarily
  unavailable

  This problem was introduced by cc1cc03
  from ceph#4936
* replace test/mon/mon-test-helpers.sh with test/ceph-helpers.sh as
  we need run_osd() in this newly added test
* update the run-dir of commands: ceph-helpers.sh uses a different
  convention for the run-dir of daemons.

http://tracker.ceph.com/issues/11975 Refs: ceph#11975

Signed-off-by: Loic Dachary <ldachary@redhat.com>
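
The "Resource temporarily unavailable" error quoted above is the usual symptom of a second opener trying to take an exclusive, non-blocking lock on a file that is already locked. The sketch below reproduces that behaviour with `flock` on a throwaway file; it is an analogy only, since the monitor store's actual locking mechanism may differ, and the helper names `try_lock` and `demo` are hypothetical. It assumes a Unix-like system where Python's `fcntl` module is available.

```python
import errno
import fcntl
import os
import tempfile


def try_lock(path):
    """Try to take an exclusive, non-blocking lock on path.

    Returns the open file descriptor on success, or None if another
    open file description already holds the lock.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError as e:
        os.close(fd)
        # EAGAIN is the errno behind "Resource temporarily unavailable".
        if e.errno in (errno.EAGAIN, errno.EACCES):
            return None
        raise
    return fd


def demo():
    """Lock a LOCK file twice; report whether each attempt succeeded."""
    with tempfile.TemporaryDirectory() as d:
        lock_path = os.path.join(d, "LOCK")
        first = try_lock(lock_path)   # plays the role of the already running mon
        second = try_lock(lock_path)  # plays the role of the second mon
        result = (first is not None, second is not None)
        for fd in (first, second):
            if fd is not None:
                os.close(fd)
        return result
```

In this sketch the first attempt stands in for the monitor started by run() and the second for the monitor an individual test tries to start against the same store directory.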
@ghost ghost changed the title mon crashes when "ceph osd tree 85 --format json" mon: mon crashes when "ceph osd tree 85 --format json" Aug 4, 2015
@tchaikov tchaikov deleted the wip-11975-hammer branch August 11, 2015 03:08