New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
mon: modify a dout level in OSDMonitor.cc #6928
Conversation
I'd suggest setting it at level 1 instead of 0 and adjusting your configuration options to 'debug mon = 1'. Otherwise this will spam log files for other users of ceph who may not be as interested in this debug info. |
@jecluis i think adjust the level is necessary,and your suggestion is justified. it is a compromising way . |
Please squash the two commits in this pull request so that it can be merged. Thanks! |
cb22b77
to
8de6d1b
Compare
@jecluis I have done it. |
@HeyoQiang Just one last nit: the commit message. Drop the 'before' and 'after' code and simply explain why the change is being introduced. There's no point in having the commit message polluted with info we can easily get out of git. |
@HeyoQiang While you are dealing with my last comment, please accept my apologies when I ask you to instead of using debug level '1' use instead something higher, like '2'. The 'debug mon' value is '1' and the same reasoning I used before applies. My bad for not noticing this earlier. Sorry! |
@jecluis In actual use,we need down osds manually,if we want to replace old hard disk by new or do performance testing.the parameter and the log can helps us how many osds we can down. |
@HeyoQiang The logs are not the way to figure out how many osds you can have down, or mark down. That's info you should be able to get via the cli; if you don't, you are welcome to contribute a patch to help you do that. Unless I'm misunderstanding something? |
8de6d1b
to
795f220
Compare
@jecluis , I am not that understand, in which case an operator like to remain at least some percentage of OSDs UP even though these OSDs is misbehaving and should be mark down?(i.e already assert, network isolation, etc.....). In this case I think we failed to respond to any failure and may lead to data loss. Imaging a case, if we gracefully shutdown 2/3 of the OSDs, in this case the cluster should in HEALTH_OK status( out and down one OSD, wait for recovery done, and down another). Then say we lost one more nodes, but the OSDs in this node will not be mark down in this case, so no peer/recovery will be triggered. |
@jecluis If we mark one OSD to down when OSD-curretn-up-ratio lower than OSD-min-up-ratio , cli will point the OSD is marked down successfully ,but in actually,the OSD is still up. in this situation, we can only know the reason why it happend by the log i have said before. |
@HeyoQiang ok, gotcha. Please update the commit message to something reasonable. It's a bit weird right now and I'd rather not merge it as it is. Debugging at level 2 is fine. |
795f220
to
67104f6
Compare
@HeyoQiang could you please add a Signed-off-by: Name Surname <mail.com> line at the end of the commit message ? While you're at it, please make a pass of spell check and make sure the lines are 80 columns each, max. For the benefit of future readers :-) |
67104f6
to
b9d5fad
Compare
@dachary I have done it. Thanks! |
b9d5fad
to
d4cf79b
Compare
Monitor: modify a dout level in OSDMonitor.cc Reviewed-by: Joao Eduardo Luis <joao@suse.de> Reviewed-by: Loic Dachary <ldachary@redhat.com>
In actual use, we replace old hard disk by new or do performance testing regularly and need to down OSDs manually, but there are some OSDs can not be marked down, and there is no obvious information. The log { dout(5) << "can_mark_down current up_ratio " << up_ratio << " < min "<< g_conf->mon_osd_min_up_ratio<< ", will not mark osd." << i << "down" << dendl ; }" can explain why it happened. In addition, we can change the value of mon_osd_min_up_ratio more reasonable in our operating environment.so it is necessary to adjust the log's level lower. For example: There are 6 OSDs , wen have marked down 5 of them and the mon_osd_min_up_ratio = 0.3. In this situation, when we mark the last OSD to down, it will show "ceph-osd stop/waiting", but in actually, the OSD is still up. Signed-off-by: Yongqiang He <he.yongqiang@h3c.com>
The commit message is, however, not hard-wrapped at 80 columns. This makes it harder to properly read in git log. Please be so kind as to update it. :) |
Oh well, nevermind. Noticed way too late it has been merged. |
This modification changed a log level from 5 to 0
Before:/* dout(5) << "can_mark_down current up_ratio " << up_ratio << " < min "
<< g_conf->mon_osd_min_up_ratio
<< ", will not mark osd." << i << " down" << dendl; */
After:/* dout(0) << "can_mark_down current up_ratio " << up_ratio << " < min "
<< g_conf->mon_osd_min_up_ratio
<< ", will not mark osd." << i << " down" << dendl; */
we think this log is very important to help us find the reason why the markdown request of osd be rejected by monitor,
and it helps us to know there is a parameter "mon_osd_min_up_ratio";
Signed-off-by: Yongqiang He he.yongqiang@h3c.com