This repository has been archived by the owner on Feb 24, 2020. It is now read-only.

GC: always detect and unmount leftovers #1856

Merged
merged 1 commit into rkt:master on Dec 17, 2015

Conversation

@steveej (Contributor) commented Dec 9, 2015

This covers unmounting of recursive bind-mounts, as well as the overlay mounts.
It also has the advantage of not attempting to unmount something that is
not there, e.g. if a reboot happened and the mounts are gone anyway.

stage1TreeStoreID, err := p.getStage1TreeStoreID()
if err != nil {
stderr("error getting stage1 treeStoreID: %v", err)
return
Member:

We want to resist this error too.

Member:

resist?

Member:

Yeah, if we can't get the treeStoreID we should still be able to force-GC the pod.

Contributor Author:

Gotcha. PR updated.

@alban alban added this to the v0.14.0 milestone Dec 10, 2015
}

// unmount all leftover mounts
stdout("GC of leftover mounts for pod %q...", p.uuid)
Collaborator:

Do we really need to output this every time? Just an implementation detail, IMHO.

Member:

We don't need this stdout; I think the previous stdout in emptyExitedGarbage() was enough.

@alban (Member) commented Dec 11, 2015

I would like to validate that the shared mounts followed by make-rprivate+umount work in practice when there are several pods.

  • Ideally, when the number of "fly" pods running increases, the number of mounts in the host namespace should increase linearly, not exponentially. I would like that to be tested by starting e.g. ~10 "fly" pods.
  • When starting "fly" pod1 and then "fly" pod2, I would like to check that they can be GC'ed in any order.

@alban (Member) commented Dec 11, 2015

The reason I am asking for these two tests is that I think the number of mounts will grow exponentially with the number of pods (or like Fibonacci? I am not sure):

  • zero pods
/ (root)
└─ /data
  • one pod
/ (root)
├─ /data
└─ pod1
   └─ host-root
      └─ data
  • two pods
/ (root)
├─ /data
├─ pod1
│  └─ host-root
│     └─ data
└─ pod2
   └─ host-root
      ├─ data
      └─ pod1
         └─ host-root
            └─ data
  • three pods: too big tree to draw.

If that is indeed the case, then I think we should look into @philips' suggestion of rkt run-direct: without chroot and without bind mounts, just a chdir.

@alban (Member) commented Dec 11, 2015

The number of mount points seems to grow quickly, from my test script:

$ sudo ./try.sh
Pod 1: 37 mounts
Pod 2: 185 mounts
Pod 3: 1517 mounts
Pod 4: 66785 mounts
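The per-pod numbers above come down to counting mount entries visible in the host namespace after each pod start. A minimal Go sketch of just that counting step (the pod-starting loop and the try.sh script are not shown in this thread, so only the measurement is sketched here):

```go
package main

import (
	"bufio"
	"fmt"
	"os"
)

// countMounts returns the number of mount entries visible in the
// current mount namespace by counting lines in /proc/self/mountinfo.
func countMounts() (int, error) {
	f, err := os.Open("/proc/self/mountinfo")
	if err != nil {
		return 0, err
	}
	defer f.Close()

	n := 0
	sc := bufio.NewScanner(f)
	for sc.Scan() {
		n++
	}
	return n, sc.Err()
}

func main() {
	n, err := countMounts()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("%d mounts\n", n)
}
```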

@steveej (Contributor) commented Dec 11, 2015

@alban The user-provided volumes are mounted with just MS_BIND, so they won't recurse into submounts. The main problem is that if you mount / into the container recursively, the mounts multiply explosively, because directories like /var/lib/rkt again contain the recursively mounted /.

This is not only a problem of scaling but also of containers blocking each other during GC. I've solved this by unmounting all instances of a pod's rootfs, even when they live inside a different pod's rootfs due to recursive mounts. I've left this code in even though we don't use recursive binds now, since we might provide the option to configure this per mount and will need it then.

stderr("Skipping stage1 GC")
} else {
stage1RootFS := s.GetTreeStoreRootFS(stage1TreeStoreID)
if err := stage0.GC(p.path(), p.uuid, stage1RootFS, globalFlags.Debug); err != nil {
Contributor:

err is a shadowed variable. When you enable go vet --shadow, it will complain.

Contributor Author:

Thanks. Should we enable the --shadow flag in our test script?

Member:

I don't know.

I thought that shadowing with if err := ...; err != nil { was idiomatic in Go.
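For reference, a minimal, self-contained illustration of the shadowing being discussed; mayFail and demo are made-up names, not code from this PR:

```go
package main

import (
	"errors"
	"fmt"
)

var errBoom = errors.New("boom")

func mayFail() error { return errBoom }

// The err declared in the if-statement is a new variable that hides
// any err in the enclosing scope. go vet's shadow check flags this,
// even though `if err := ...; err != nil {` is idiomatic Go.
func demo() string {
	err := errors.New("outer")
	if err := mayFail(); err != nil {
		// This err is the inner one ("boom"), not "outer".
		return err.Error()
	}
	return err.Error()
}

func main() {
	fmt.Println(demo()) // boom
}
```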

@alban (Member) commented Dec 14, 2015

I've left this code in even though we don't use recursive binds now, since we might provide the option to configure this per mount and will need it then.

It would be better to add that code when we actually need it, because we are not sure it is going to be added.

}

// unmount all leftover mounts
if err := stage0.ForceMountGC(p.path(), p.uuid.String()); err != nil {
Member:

Suggestion: rename the function to stage0.MountGC().

}

if strings.HasPrefix(mountPoint, path) {
for i, s := range lineResult {
s := s[0]
Member:

If you use strings.Split, there is no need for this s := s[0].

@alban (Member) commented Dec 14, 2015

Since it does not use recursive binds, this branch could be simplified: no need to complicate the parsing of /proc/pid/mountinfo to handle the optional fields for now, and we can keep the fmt.Sscanf as it is.

@steveej (Contributor) commented Dec 16, 2015

@alban

Since it does not use recursive binds, this branch could be simplified: no need to complicate the parsing of /proc/pid/mountinfo for now to parse the optional fields and we can keep the fmt.Sscanf as it is.

I'm assuming it is the stage1 that may still construct recursive bind mounts, which we need to detect properly to be able to clean them up. WRT the stage1 fly, this is needed for the recursive mounts of /dev, /sys and /proc. Also, users might still set up recursive mounts manually, since passing MS_REC has the same result, just with a more comfortable invocation ;-)
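A sketch of the strings.Split approach with the optional fields handled; parseMountinfoLine and the sample line are illustrative, not the PR's actual code. Per proc(5), field 5 of a mountinfo line is the mount point, and the variable-length optional fields (shared:N, master:N, ...) run from field 7 until the "-" separator:

```go
package main

import (
	"fmt"
	"strings"
)

// parseMountinfoLine splits one /proc/<pid>/mountinfo line and returns
// the mount point plus the optional propagation fields. Note that mount
// points containing spaces are octal-escaped in mountinfo (e.g. \040),
// so splitting on " " is safe.
func parseMountinfoLine(line string) (mountPoint string, optional []string) {
	fields := strings.Split(line, " ")
	if len(fields) < 7 {
		return "", nil
	}
	mountPoint = fields[4]
	for _, f := range fields[6:] {
		if f == "-" { // separator: the optional fields end here
			break
		}
		optional = append(optional, f)
	}
	return mountPoint, optional
}

func main() {
	line := "36 25 0:30 / /data rw,relatime shared:7 - ext4 /dev/sda1 rw"
	mp, opt := parseMountinfoLine(line)
	fmt.Println(mp, opt) // /data [shared:7]
}
```

A mount with a "shared:N" optional field is the kind of entry that needs the make-rprivate treatment before unmounting.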

mode string
)

// lineRegex := regexp.MustCompile(`[^ ]+`)
Member:

leftover?

Member:

If strings.Split works, you can remove this comment :)

@alban (Member) commented Dec 17, 2015

A rebase would be good for the next iteration :)

This is for unmounting recursive bind-mounts, as well as the overlay mounts.
It also has the advantage of not attempting to unmount something that is
not there, e.g. if a reboot happened and the mounts are gone anyway.
@alban (Member) commented Dec 17, 2015

I think it looks good! @krnowak are you still reviewing it or is it good to you too?

@krnowak (Collaborator) commented Dec 17, 2015

No no, I'm done with it. LFAD if green.

alban added a commit that referenced this pull request Dec 17, 2015
GC: always detect and unmount leftovers
@alban alban merged commit d8a98bb into rkt:master Dec 17, 2015
@alban (Member) commented Dec 17, 2015

👍

6 participants