Bug #12297
ceph-fuse 0.94.2-1trusty segfaults / aborts
Status: Closed
Description
Running CephFS for data volumes on a compute cluster.
ceph-fuse aborts and leaves the mount point unusable (transport endpoint not connected).
Excerpt of the client log, including the 20 latest ops:
-20> 2015-07-13 12:08:13.905733 7f60be7fc700 20 client.412413 trim_caps counting as trimmed: 100015d4d13.head(ref=2 ll_ref=1 cap_refs={} open={} mode=120777 size=74/0 mtime=2015-07-13 12:08:05.875809 caps=pAsLsXsFscr(0=pAsLsXsFscr) 0x7f60ac8c6320)
-19> 2015-07-13 12:08:13.905744 7f60be7fc700 10 client.412413 put_inode on 100015d4d13.head(ref=2 ll_ref=1 cap_refs={} open={} mode=120777 size=74/0 mtime=2015-07-13 12:08:05.875809 caps=pAsLsXsFscr(0=pAsLsXsFscr) 0x7f60ac8c6320)
-18> 2015-07-13 12:08:13.905755 7f60be7fc700 20 client.412413 trying to trim dentries for 100015d4d14.head(ref=2 ll_ref=1 cap_refs={} open={} mode=120777 size=74/0 mtime=2015-07-13 12:08:05.893082 caps=pAsLsXsFscr(0=pAsLsXsFscr) parents=0x7f5fb43408e0 0x7f60acc44830)
-17> 2015-07-13 12:08:13.905767 7f60be7fc700 15 client.412413 trim_dentry unlinking dn raw_reads.136.raw_reads.296.N2.las in dir 100012fa7e7
-16> 2015-07-13 12:08:13.905770 7f60be7fc700 15 client.412413 unlink dir 0x7f60ac59d5a0 'raw_reads.136.raw_reads.296.N2.las' dn 0x7f5fb43408e0 inode 0x7f60acc44830
-15> 2015-07-13 12:08:13.905773 7f60be7fc700 20 client.412413 unlink inode 0x7f60acc44830 parents now
-14> 2015-07-13 12:08:13.905774 7f60be7fc700 10 client.412413 put_inode on 100015d4d14.head(ref=3 ll_ref=1 cap_refs={} open={} mode=120777 size=74/0 mtime=2015-07-13 12:08:05.893082 caps=pAsLsXsFscr(0=pAsLsXsFscr) 0x7f60acc44830)
-13> 2015-07-13 12:08:13.905785 7f60be7fc700 15 client.412413 unlink removing 'raw_reads.136.raw_reads.296.N2.las' dn 0x7f5fb43408e0
-12> 2015-07-13 12:08:13.905789 7f60be7fc700 20 client.412413 trim_caps counting as trimmed: 100015d4d14.head(ref=2 ll_ref=1 cap_refs={} open={} mode=120777 size=74/0 mtime=2015-07-13 12:08:05.893082 caps=pAsLsXsFscr(0=pAsLsXsFscr) 0x7f60acc44830)
-11> 2015-07-13 12:08:13.905801 7f60be7fc700 10 client.412413 put_inode on 100015d4d14.head(ref=2 ll_ref=1 cap_refs={} open={} mode=120777 size=74/0 mtime=2015-07-13 12:08:05.893082 caps=pAsLsXsFscr(0=pAsLsXsFscr) 0x7f60acc44830)
-10> 2015-07-13 12:08:13.905840 7f60a3fff700 10 client.412413 _async_dentry_invalidate 'ch' ino 0 in dir 100009de0f2.head
-9> 2015-07-13 12:08:13.905849 7f606bfff700 15 client.412413 de raw_reads.124.raw_reads.296.C2.las off 6170202972637346 = 0
-8> 2015-07-13 12:08:13.906701 7f60a06f7700 2 -- 192.168.2.22:0/20333 >> 192.168.6.5:6810/4247 pipe(0x7f607c030cf0 sd=2 :37085 s=2 pgs=28199 cs=1 l=1 c=0x7f607c034f90).reader couldn't read tag, (11) Resource temporarily unavailable
-7> 2015-07-13 12:08:13.906738 7f60a06f7700 2 -- 192.168.2.22:0/20333 >> 192.168.6.5:6810/4247 pipe(0x7f607c030cf0 sd=2 :37085 s=2 pgs=28199 cs=1 l=1 c=0x7f607c034f90).fault (11) Resource temporarily unavailable
-6> 2015-07-13 12:08:13.906808 7f60be7fc700 1 client.412413.objecter ms_handle_reset on osd.61
-5> 2015-07-13 12:08:13.906818 7f60be7fc700 1 -- 192.168.2.22:0/20333 mark_down 0x7f607c034f90 -- pipe dne
-4> 2015-07-13 12:08:13.907127 7f60be7fc700 10 monclient: renew_subs
-3> 2015-07-13 12:08:13.907135 7f60be7fc700 10 monclient: _send_mon_message to mon.mon-i1 at 192.168.6.50:6789/0
-2> 2015-07-13 12:08:13.907141 7f60be7fc700 1 -- 192.168.2.22:0/20333 --> 192.168.6.50:6789/0 -- mon_subscribe({mdsmap=4312+,monmap=14+,osdmap=175910}) v2 -- ?+0 0x7f60ac68f8b0 con 0x3b50160
-1> 2015-07-13 12:08:13.907279 7f606bfff700 -1 *** Caught signal (Segmentation fault) **
 in thread 7f606bfff700
 ceph version 0.94.2 (5fb85614ca8f354284c713a2f9c610860720bbf3)
 1: ceph-fuse() [0x6235ca]
 2: (()+0x10340) [0x7f60c9dd8340]
 3: (std::string::assign(std::string const&)+0x1c) [0x7f60c8ff748c]
 4: (Client::_readdir_cache_cb(dir_result_t*, int (*)(void*, dirent*, stat*, int, long), void*)+0x39a) [0x55263a]
 5: (Client::readdir_r_cb(dir_result_t*, int (*)(void*, dirent*, stat*, int, long), void*)+0xfc5) [0x596045]
 6: ceph-fuse() [0x546d2d]
 7: (()+0x13e76) [0x7f60ca20fe76]
 8: (()+0x1522b) [0x7f60ca21122b]
 9: (()+0x11e49) [0x7f60ca20de49]
 10: (()+0x8182) [0x7f60c9dd0182]
 11: (clone()+0x6d) [0x7f60c875547d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
0> 2015-07-13 12:08:13.908290 7f60be7fc700 1 -- 192.168.2.22:0/20333 <== mon.3 192.168.6.50:6789/0 266 ==== osd_map(175910..175910 src has 172194..175910) v3 ==== 222+0+0 (8575416 0 0) 0x7f60b00008c0 con 0x3b50160
The OSD mentioned in op -8 is up and running, but might take some time to respond due to ongoing backfilling operations. The compute jobs are set up to write output to the same file, so several CephFS clients are trying to open a single file for writing.
Updated by Greg Farnum almost 9 years ago
- Project changed from Ceph to CephFS
- Category set to 45
Do you have a core file from this that you can look at with gdb, and the ceph debug packages installed? (You can pull them down if they aren't already present.)
If you can look at it and just paste the output of "backtrace" here, that would be very helpful; as it is, I'm not sure which of the string users is broken.
Updated by Burkhard Linke almost 9 years ago
Sorry, no core file available. I have a number of hosts with CephFS running without any problems; some other hosts show the described behaviour.
Since some of the compute jobs are still pending, I may reactivate one of the nodes in the cluster and hope that it fails again (it's not 100% reproducible).
How do I manipulate the limits for fuse mounts? Is setting the limit prior to mounting enough?
Updated by Greg Farnum almost 9 years ago
You mean the ulimit for dumping the core file? That will depend on how you're invoking it; if you're just running "ceph-fuse ..." then you can set "ulimit -c unlimited" in the shell before doing so and that will dump the core, yes.
Updated by Burkhard Linke almost 9 years ago
(gdb) bt
#0 0x00007f70d7bf020b in raise (sig=7)
at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:37
#1 0x000000000062368d in reraise_fatal (signum=7)
at global/signal_handler.cc:59
#2 handle_fatal_signal (signum=7) at global/signal_handler.cc:109
#3 <signal handler called>
#4 Client::_readdir_cache_cb (this=this@entry=0x38a8ab0,
dirp=dirp@entry=0x7f6f7c018400, cb=cb@entry=
0x547820 <fuse_ll_add_dirent(void*, dirent*, stat*, int, off_t)>,
p=p@entry=0x7f707d7cac20) at client/Client.cc:6048
#5 0x0000000000596045 in Client::readdir_r_cb (this=0x38a8ab0,
d=d@entry=0x7f6f7c018400,
cb=cb@entry=0x547820 <fuse_ll_add_dirent(void*, dirent*, stat*, int, off_t)>, p=p@entry=0x7f707d7cac20) at client/Client.cc:6171
#6 0x0000000000546d2d in fuse_ll_readdir (req=0x7f6f9c004ce0,
ino=1099533194016, size=4096, off=129, fi=<optimized out>)
at client/fuse_ll.cc:582
#7 0x00007f70d8027e76 in ?? () from /lib/x86_64-linux-gnu/libfuse.so.2
#8 0x00007f70d802922b in ?? () from /lib/x86_64-linux-gnu/libfuse.so.2
#9 0x00007f70d8025e49 in ?? () from /lib/x86_64-linux-gnu/libfuse.so.2
#10 0x00007f70d7be8182 in start_thread (arg=0x7f707d7cb700)
at pthread_create.c:312
#11 0x00007f70d656d47d in clone ()
Updated by Greg Farnum almost 9 years ago
That's pretty bizarre. Signal 7 is SIGBUS "Bus error (bad memory access)", which is separate from SIGSEGV. And given the surrounding code that's pretty well guarded against. In GDB again can you print the value of "dn" and (if it's valid) dn->inode?
[run gdb]
f 4
p dn
p dn->inode
Or you can join us on irc, oftc.net #ceph-devel and ping gregsfortytwo. Thanks!
Updated by Burkhard Linke almost 9 years ago
(gdb) f 4
#4 Client::_readdir_cache_cb (this=this@entry=0x38a8ab0, dirp=dirp@entry=0x7f6f7c018400, cb=cb@entry=
0x547820 <fuse_ll_add_dirent(void*, dirent*, stat*, int, off_t)>, p=p@entry=0x7f707d7cac20) at client/Client.cc:6048
6048 client/Client.cc: No such file or directory.
(gdb) p dn
$1 = (Dentry *) 0x342e746e65696c63
(gdb) p dn->inode
Cannot access memory at address 0x342e746e65696c9b
Updated by Greg Farnum almost 9 years ago
Okay, I think I see what happened, one more please.
[run gdb]
f 4
p dir
p dir->dentry_list
And if you could take the log for the ceph-fuse instance you're looking at now and upload it with ceph-post-file that is also likely to be helpful.
Updated by Burkhard Linke almost 9 years ago
(gdb) p dir
$2 = (Dir *) 0x7f709000f9a0
(gdb) p dir->dentry_list
$3 = {_front = 0x7f6f8002a260, _back = 0x7f70b83aaa60, _size = 9}
The log file is 13 GB and the host has no direct connection to the internet. I can transfer it to another host and upload it, but given its size I would prefer to upload only a part.
Updated by Greg Farnum almost 9 years ago
- Status changed from New to 12
Zheng, can you take a look at this and see if you need the log or can identify the issue? I presume something's gone wrong in advancing to the at_cache_name dentry, but I'm not quite sure how.
Updated by Greg Farnum almost 9 years ago
- Priority changed from Normal to Urgent
Updated by Zheng Yan almost 9 years ago
I can't figure out what's wrong either.
Updated by Greg Farnum almost 9 years ago
We've got logs and a core file via ceph-post-file:
- c00fb309-e2a1-482c-b093-1733affb2901: gzipped core dump
- 60fbd6a7-ffde-4cf0-988b-a80737a1d1ec: ceph-fuse log file
Distribution is Ubuntu 14.04.
ceph-fuse --version: ceph version 0.94.2 (5fb85614ca8f354284c713a2f9c610860720bbf3)
Updated by Greg Farnum almost 9 years ago
Okay, I dug into this today and I think our readdir is a bit broken (obviously). In particular, we drop the client lock while invoking the callback, and I can see in the log that during one of those unlocked windows a whole bunch of dentries got unlinked. So when we reacquire the lock we try to look at the next dentry via our xlist iterator (which was advanced to a valid dentry while we still held the lock), but the dentry we're trying to look at has (it appears) been freed!
I'm not sure how this is supposed to work; investigating now.
Updated by Greg Farnum almost 9 years ago
- Status changed from 12 to Fix Under Review
I think I have a fix for this issue but I want to get a few eyes on it before handing it off for user testing.
Updated by Greg Farnum almost 9 years ago
Burkhard, if you can test out the hammer-12297 branch on one of the boxes seeing this bug and find out if that fixes it, that would be helpful. I've not yet figured out a good way to reproduce the issue.
Updated by Greg Farnum almost 9 years ago
- Assignee set to Greg Farnum
- Source changed from other to Community (user)
Updated by Burkhard Linke almost 9 years ago
The branch is deployed to the compute cluster host and the application causing the problem has been started.
Debug output and core file generation are enabled. Since the problem is not 100% reproducible, all cluster nodes are running the new branch.
Testing will take some hours; if no ceph-fuse process has crashed by then, it may indicate that the fix is successful.
Updated by Burkhard Linke almost 9 years ago
The cluster has been processing data without any problems over the weekend. I consider the bug fixed.
Updated by Greg Farnum almost 9 years ago
- Status changed from Fix Under Review to Pending Backport
- Priority changed from Urgent to Normal
- Backport set to hammer
Excellent! This is merged to master now.
Updated by Loïc Dachary over 8 years ago
- Status changed from Pending Backport to Resolved