stage0: added ability to resume interrupted downloads #1444
Conversation
I don't think quay.io supports Range requests, so to see this in action you'll need to pull an aci from somewhere that does. I've currently hosting one via nginx (will need |
This is to (partially) address #1280. |
What if the file at the URL changes between download start and resume? |
Not completely sure until I'm back at my desk on Monday, but I imagine that it'd either fail verification, or if that's been turned off it would (I hope) no longer be a valid tar file and thus rkt would fail to run it. If that's the case, perhaps I should add logic to tell when one of these failures happens from an image that was downloaded in multiple parts, and then retry the download? |
Can't we treat the file being resumed from as a cached partial copy of the remote resource and use the last modified or etag header to invalidate the cached copy? |
I imagine so, I'll go implement that. Those headers could be not supported though, right? Think the behavior in that case should be to just redownload the whole thing? |
It seems reasonable to me that we'd emit an explanation and re-download the whole thing if correctly (in)validating the cache isn't possible. Should probably get consensus from the rest of the rkt team on that decision. |
SGTM |
1 similar comment
SGTM |
@dgonyeo where are we on this? |
I've been distracted with acbuild, I'll give this some attention today.
|
b44a56c
to
d41cd2d
Compare
@dgonyeo is this ready? |
Nope, but I made a little more progress today. I needed to come up with a way to store a download's etag before it finishes downloading, in case of rkt being killed during the download. I should have it done tomorrow. |
d41cd2d
to
390e1e4
Compare
And by "tomorrow" I meant "in a half hour". Should be ready for another pass again. |
@@ -226,6 +226,21 @@ func (s Store) TmpFile() (*os.File, error) { | |||
return ioutil.TempFile(dir, "") | |||
} | |||
|
|||
// TmpNamedFile returns an *os.File with the specified name local to the same | |||
// filesystem as the Store, or any error encountered. If the file already | |||
// exists it will return the existing file. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please mention it'll be in append mode
390e1e4
to
3d66af1
Compare
Do you think you'll have time to add some tests? |
Whoops, completely forgot about this. Sorry about that. I'll write some tests. |
8cce34c
to
fc9a2c4
Compare
I added a test. It starts up a little server hosting an image, has rkt fetch the image, kills it when it reaches 50%, has rkt fetch it again, and ensures that rkt downloads only a part of the image the second time. Think any other tests would be warranted, or is this enough? |
@dgonyeo I think testing an interrupted download on a server which does not support ranges or etags would be good. |
df3a226
to
2e20ba2
Compare
I added two more tests: one for a server that doesn't support last-modified or etags, and one that sets the last-modified header to one minute in the future. |
rangeHeader := rangeHeaders[0][6:] // The first (and only) range header, with len("bytes=") characters chopped off the front | ||
tokens := strings.Split(rangeHeader, "-") | ||
if len(tokens) != 2 { | ||
t.Fatalf("number of '-'s in range header was not 1, it was %d", len(tokens)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be: len(tokens) - 1
or just: t.Fatalf("couldn't parse rangeHeader: %q", rangeHeader)
0d23ea9
to
b24cd35
Compare
@@ -636,6 +713,8 @@ func (f *fetcher) downloadHTTP(url, label string, out writeSyncer, etag string) | |||
return nil, fmt.Errorf("bad HTTP status code: %d", res.StatusCode) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the error is http.StatusRequestedRangeNotSatisfiable
(416), then there is a bug in the cache management (in rkt or in the server). Should we delete the partially downloaded file and the etag file in order to recover from that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM, but I'm curious if this is an edge case that will ever be hit, since at this point we already verified that range requests were supported via the HEAD request.
Could this happen if the image size changed since the first part was cached, resulting in the requested starting point being larger than the size of the new image?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added the logic for this. If http.StatusRequestedRangeNotSatisfiable
is the HTTP code, the out writeSyncer is a file, and the file has a non-zero size, then we seek to the beginning of the file, truncate it to 0 bytes, and retry the download. Otherwise we fallthrough
to the error case.
b24cd35
to
9feb308
Compare
// rkt tries to use the cache, the hash won't check out. | ||
server2 := httptest.NewServer(testSimpleServerHandler(t, imagePath)) | ||
defer server2.Close() | ||
cmd = fmt.Sprintf("%s --no-store --insecure-skip-verify fetch %s", ctx.Cmd(), server2.URL) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
server.URL
might not be the same as server2.URL
, so rkt might not even try to use the cache.
Downloads for a given url are now saved to a file named after a sha512 hash of the url. If the file already exists for a requested download, a check will be made to see if the server supports HTTP range requests. If such requests are supported then only the missing part of the file will be downloaded, otherwise the entire file will be downloaded as before.
9feb308
to
d8b1fdb
Compare
Weird, |
The error is:
That's an internal server error. We get them sometimes, I'll restart the test. |
LGTM |
stage0: added ability to resume interrupted downloads
Downloads for a given url are now saved to a file named after a sha512
hash of the url. If the file already exists for a requested download, a
check will be made to see if the server supports HTTP range requests. If
such requests are supported then only the missing part of the file will
be downloaded, otherwise the entire file will be downloaded as before.