osd: do not keep ref of old osdmap in pg #7007

Merged
merged 1 commit into from Jan 3, 2016

tchaikov
Contributor

do not hold a strong reference to last_persisted_osdmap in PG. an
OSD tries to trim previously persisted osdmaps once they are no
longer referenced, which helps to keep the meta collection at a
manageable size. if we advance the osdmap many times and some PGs
are not affected by these changes, they are very likely still
holding very old osdmap references in their
last_persisted_osdmap_ref. in practice this prevents the OSD from
removing the outdated osdmaps in OSD::handle_osd_map(). so,
instead of holding a reference to last_persisted_osdmap, we can
simply remember its epoch.

Fixes: #13990
Signed-off-by: Kefu Chai <kchai@redhat.com>
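
The difference between pinning a map and remembering its epoch can be sketched with a toy model. This is only an illustration, not the Ceph code: `ToyOsdMap`, `ToyMapStore`, `PgHoldingRef` and `PgHoldingEpoch` are hypothetical names, and the trim rule (stop at the first map that is still referenced from outside the store) is an assumption standing in for whatever OSD::handle_osd_map() really does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>

using epoch_t = std::uint32_t;

// Hypothetical stand-in for an OSDMap; only the epoch matters here.
struct ToyOsdMap {
  epoch_t epoch;
};
using ToyOsdMapRef = std::shared_ptr<const ToyOsdMap>;

// Hypothetical stand-in for the OSD's collection of persisted maps.
class ToyMapStore {
 public:
  // Store a new map; epochs must be added in increasing order.
  ToyOsdMapRef add(epoch_t e) {
    assert(maps_.empty() || e > maps_.rbegin()->first);
    auto ref = std::make_shared<const ToyOsdMap>(ToyOsdMap{e});
    maps_.emplace(e, ref);
    return ref;
  }

  // Remove maps older than `upto`, oldest first, and stop at the first
  // map that is still referenced from outside the store.
  // (use_count() is only meaningful here because the toy is single-threaded.)
  void trim(epoch_t upto) {
    auto it = maps_.begin();
    while (it != maps_.end() && it->first < upto &&
           it->second.use_count() == 1) {
      it = maps_.erase(it);
    }
  }

  std::size_t size() const { return maps_.size(); }

 private:
  std::map<epoch_t, ToyOsdMapRef> maps_;
};

// A PG that pins the last persisted map with a strong reference.
struct PgHoldingRef {
  ToyOsdMapRef last_persisted_osdmap_ref;
};

// A PG that only remembers the epoch of the last persisted map.
struct PgHoldingEpoch {
  epoch_t last_persisted_osdmap = 0;
};
```

In this model, a `PgHoldingRef` that still points at epoch 1 makes `trim(8)` a no-op no matter how many newer maps exist, while a `PgHoldingEpoch` never affects `use_count()` and so never blocks trimming.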

@tchaikov
Contributor Author

tested locally:

1. MON=1 OSD=3 ./vstart.sh -l -d -n mon osd -o 'osd_map_cache_size = 20' -o 'mon_min_osdmap_epochs = 30' -o 'paxos_service_trim_min=20' # these options default to 200, 500 and 250 respectively; the larger they are, the more osdmaps are kept in the cache and the objectstore.
2. start three consoles:

  • while true; do ./ceph osd pool create test-pool 1 1;./ceph osd pool delete test-pool test-pool --yes-i-really-really-mean-it; done # keep updating the osdmap
  • watch -n1 bash -c 'find osd1/current/meta/ | wc -l' # osdmaps are stored in the meta collection by the objectstore, one (incremental) osdmap per object, so we can simply count the files in that directory. the count grows as more osdmaps are stored, and drops when the monitor trims its osdmaps and the OSD consequently evicts them from its osdmap cache.
  • while true; do ./ceph report|grep osdmap_; sleep 3; done # once "osdmap_first_committed" changes, the monitor will trim its osdmap epochs, and the OSD should remove them from its objectstore

@tchaikov
Contributor Author

the fix is wrong. we send NullEvt when a new osdmap is consumed, which also updates all PGs in the pg_map. if a PG's persisted osdmap is too stale (see osd_pg_epoch_persisted_max_stale), PG::write_if_dirty() updates last_persisted_osdmap_ref with the latest osdmap set by PG::handle_advance_map().

but it could still serve as a cleanup.
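
A rough sketch of that staleness refresh, as a toy model rather than the actual PG code: `ToyPg` and its members are hypothetical, and the exact comparison against `max_stale` (standing in for osd_pg_epoch_persisted_max_stale) is an assumption.

```cpp
#include <cassert>
#include <cstdint>

using epoch_t = std::uint32_t;

// Hypothetical stand-in for a PG; only the epoch bookkeeping is modelled.
struct ToyPg {
  epoch_t current_epoch = 0;         // epoch of the map the PG has advanced to
  epoch_t last_persisted_epoch = 0;  // epoch recorded at the last write
  bool dirty_info = false;

  // Called for every consumed map, even if the PG is otherwise unaffected.
  // Marks the PG dirty once the persisted epoch lags by more than max_stale.
  void advance_map(epoch_t new_epoch, epoch_t max_stale) {
    assert(new_epoch >= current_epoch);
    current_epoch = new_epoch;
    if (current_epoch - last_persisted_epoch > max_stale) {
      dirty_info = true;
    }
  }

  // Persists the current epoch if the PG is dirty; returns true if it wrote.
  bool write_if_dirty() {
    if (!dirty_info) {
      return false;
    }
    last_persisted_epoch = current_epoch;
    dirty_info = false;
    return true;
  }
};
```

Under these assumptions the persisted epoch never lags the current one by more than `max_stale` + 1 epochs, so an idle PG cannot hold an arbitrarily old map; a large `max_stale` only delays the refresh.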

@tchaikov tchaikov closed this Dec 22, 2015
@tchaikov tchaikov reopened this Dec 22, 2015
@tchaikov tchaikov added cleanup and removed bug-fix labels Dec 22, 2015
do not hold a strong reference to last_persisted_osdmap in PG. an
OSD tries to trim previously persisted osdmaps once they are no
longer referenced, which helps to keep the meta collection at a
manageable size. if we advance the osdmap many times and some PGs
are not affected by these changes, they are very likely still
holding very old osdmap references in their `last_persisted_osdmap_ref`.
in practice this prevents the OSD from removing the outdated osdmaps
in OSD::handle_osd_map() if `last_persisted_osdmap_ref` is not updated
in a timely manner, for example, due to a large
"osd_pg_epoch_persisted_max_stale". so, instead of holding a reference
to last_persisted_osdmap, we can simply remember its epoch.

See also: ceph#13990

Signed-off-by: Kefu Chai <kchai@redhat.com>
@liewegas
Member

lgtm

liewegas added a commit that referenced this pull request Jan 3, 2016
osd: do not keep ref of old osdmap in pg

Reviewed-by: Sage Weil <sage@redhat.com>
@liewegas liewegas merged commit d5ad57c into ceph:master Jan 3, 2016
@tchaikov tchaikov deleted the wip-13990 branch January 5, 2016 07:18