Conversation
if err != nil { | ||
stderr("Error retrieving app hashes from pod manifest: %v", err) | ||
return | ||
if err := cleanExitedGarbage(p); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if this seems right. This goes straight to the force unmounting irrespective of the failure condition, e.g. if https://github.com/coreos/rkt/pull/1828/files#diff-204288dd8b5d5205881366c76eb4f97eR184 fails initialising the store. Might that not cause pods to get in a weird half-GCed state? Should we be more nuanced at distinguishing actual failure conditions here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with the example you've given and have moved that out of the checkExitGarbage function. However, I don't think there is a need to get more fancy with the rest of the failures. If the next one we get to fails (treestoreid), or any thereafter, this means the pod directory is in a weird state. IMHO we should just bail and force clean at that point.
aaa721f
to
263fd14
Compare
|
||
for _, mnt := range mnts { | ||
if err := syscall.Unmount(mnt.mountPoint, 0); err != nil { | ||
return fmt.Errorf("couldn't unmounting at %v: %v", mnt.mountPoint, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
couldn't unmount
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thx
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would drop the contraction in favor of could not
, but won't insist on it.
@@ -8,3 +8,4 @@ | |||
/makelib/variables.mk | |||
/build-rkt* | |||
.vagrant | |||
tags |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this file is only in the toplevel directory of the repository, then prepend it with /
. If not, then ignore the comment.
263fd14
to
d0f71d2
Compare
@@ -179,6 +179,49 @@ func emptyGarbage() error { | |||
return nil | |||
} | |||
|
|||
// cleanExitedGarbage prepares the pod's directory for removal. If anything | |||
// fails here then we can conclude that the rkt directory is in a bad state and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's the pod's directory, not rkt, right?
d0f71d2
to
99da5be
Compare
|
||
// execute stage1's GC | ||
if err := stage0.GC(p.path(), p.uuid, stage1RootFS, globalFlags.Debug); err != nil { | ||
return fmt.Errorf("problem performing stage1 GC on %v, attempting brute-force GC: %v", p.uuid, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't use this error string. Maybe print that instead of https://github.com/coreos/rkt/pull/1828/files#diff-204288dd8b5d5205881366c76eb4f97eR243?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. Well, if we are catching this I'd rather use something like debug output to say that we caught this and are forcing the GC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something like this
stderr("Recovering from the following GC error: %v", err)
stderr("Performing forced GC on pod %q", p.uuid)
99da5be
to
1b0a8f8
Compare
I use gotags and don't wish to be reminded with each git status.
When a pod directory is in an unexpected state, it may happen that the standard garbage collection fails. When this happens mounts can be left around, requiring manual removal. This change forces a GC of the pod mount points when the pod GC otherwise fails. To do this, the /proc/self/mountinfo file is parsed, the pod mount points are extracted, ordered, then unmounted. fixes rkt#1572
1b0a8f8
to
496687c
Compare
LGTM |
stage0: brute-force GC on GC failure
When a pod directory is in an unexpected state, it may happen that the
standard garbage collection fails. When this happens mounts can be left
around, requiring manual removal.
This change forces a GC of the pod mount points when the pod GC
otherwise fails. To do this, the /proc/self/mountinfo file is parsed,
the pod mount points are extracted, ordered, then unmounted.
fixes #1572