zsync2
zsync2 does not support downloading large files.
failed to parse content-range headerError while parsing headersOther error? -1
I patched zsync2 so it shows to and from values:
$ git diff -U10
diff --git a/src/legacy_http.c b/src/legacy_http.c
index 41310da..290603e 100644
--- a/src/legacy_http.c
+++ b/src/legacy_http.c
@@ -626,20 +626,21 @@ int range_fetch_read_http_headers(struct range_fetch *rf) {
p[len] = 0;
}
/* buf is the header name (lower-cased), p the value */
/* Switch based on header */
if (status == 206 && !strcmp(buf, "content-range")) {
/* Okay, we're getting a non-MIME block from the remote. Get the
* range and set our state appropriately */
int from, to;
sscanf(p, "bytes " OFF_T_PF "-" OFF_T_PF "/", &from, &to);
+ fprintf(stderr, "content-range from: %d to: %d\n", from, to);
if (from <= to) {
rf->block_left = to + 1 - from;
rf->offset = from;
} else {
fprintf(stderr, "failed to parse content-range header");
}
/* Can only have got one range. */
rf->rangesdone++;
rf->rangessent = rf->rangesdone;
$ ./zsync2 -v https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg19/refseq_r45/mc8nr/region_based/hg19-regions-9species.all_regions.mc8nr.feather.zsync
zsync2 version 2.0.0-alpha-1 (commit 7857ff1), build <local dev build> built on 2018-12-21 09:58:27 UTC
Checking for changes...
Cannot find file hg19-regions-9species.all_regions.mc8nr.feather, triggering full download
/ddn1/vol1/site_scratch/leuven/303/vsc30366/hg19-regions-9species.all_regions.mc8nr.feather.part found, using as seed file
Target file: /ddn1/vol1/site_scratch/leuven/303/vsc30366/hg19-regions-9species.all_regions.mc8nr.feather
Reading seed file: /ddn1/vol1/site_scratch/leuven/303/vsc30366/hg19-regions-9species.all_regions.mc8nr.feather.part
Usable data from seed files: 0.000000%
Renaming temp file
Fetching remaining blocks
Downloading from https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg19/refseq_r45/mc8nr/region_based/hg19-regions-9species.all_regions.mc8nr.feather
-------------------- 0.0%* Hostname was NOT found in DNS cache
* Trying 134.58.50.8...
* Adding handle: conn: 0x1654a00
* Adding handle: send: 0
* Adding handle: recv: 0
* Curl_addHandleToPipeline: length: 1
* - Conn 3 (0x1654a00) send_pipe: 1, recv_pipe: 0
* Connected to resources.aertslab.org (134.58.50.8) port 443 (#3)
* successfully set certificate verify locations:
* CAfile: /etc/pki/tls/certs/ca-bundle.crt
CApath: none
* SSL connection using ECDHE-RSA-AES256-GCM-SHA384
* Server certificate:
* subject: CN=resources.aertslab.org
* start date: 2018-11-25 04:49:48 GMT
* expire date: 2019-02-23 04:49:48 GMT
* subjectAltName: resources.aertslab.org matched
* issuer: C=US; O=Let's Encrypt; CN=Let's Encrypt Authority X3
* SSL certificate verify ok.
> GET /cistarget/databases/homo_sapiens/hg19/refseq_r45/mc8nr/region_based/hg19-regions-9species.all_regions.mc8nr.feather HTTP/1.1
Range: bytes=0-3475369983
Host: resources.aertslab.org
Accept: */*
< HTTP/1.1 206 Partial Content
< Date: Fri, 21 Dec 2018 10:09:34 GMT
* Server Apache/2.4.37 (Ubuntu) is not blacklisted
< Server: Apache/2.4.37 (Ubuntu)
< Strict-Transport-Security: max-age=15768000
< Last-Modified: Wed, 23 May 2018 07:38:22 GMT
< ETag: "16cf25e760-56cda9e8f304e"
< Accept-Ranges: bytes
< Content-Length: 3475369984
< Content-Range: bytes 0-3475369983/97964648288
<
content-range from: 0 to: -819597313
failed to parse content-range headerError while parsing headersOther error? -1
-1 returned
-------------------- 0.0% 0.0 kBps aborted
* Closing connection 3
failed to retrieve from hg19-regions-9species.all_regions.mc8nr.feather, status -1
As you can see, int (signed int) is not big enough: from and to should be uint (unsigned int) (at least 32 bits).
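The negative value printed for `to` is what 32-bit truncation predicts: 3475369983 exceeds the largest 32-bit signed value (2147483647), and 3475369983 - 2^32 = -819597313. A minimal sketch of that arithmetic follows, assuming a platform where int is 32 bits wide. wrap_to_int32 is a hypothetical helper for illustration and is not part of zsync2.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of zsync2): returns the value a 32-bit
 * signed integer would hold if only the low 32 bits of `value` were kept. */
static int32_t wrap_to_int32(int64_t value) {
    uint32_t low = (uint32_t)value;              /* keep the low 32 bits (modulo 2^32) */
    if (low <= (uint32_t)INT32_MAX)
        return (int32_t)low;                     /* fits, value is unchanged */
    int64_t wrapped = (int64_t)low - 4294967296LL; /* two's complement wrap-around */
    assert(wrapped >= INT32_MIN);                /* always representable in 32 bits */
    return (int32_t)wrapped;
}
```

For the range end in the log above, wrap_to_int32(3475369983LL) gives -819597313, the value printed for `to`. A 32-bit unsigned type would hold this particular range end (its maximum is 4294967295) but not the total size of 97964648288 bytes, so a 64-bit type is needed either way.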
Thanks @ghuls. Could you please send a PR that includes
- The added verbosity
- Using uint
Again, thank you very much.
@probonopd Adding this change is not enough.
It seems that there are a lot of issues with files bigger than 2GiB:
$ ./zsync2 -v https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg19/refseq_r45/mc8nr/region_based/hg19-regions-9species.all_regions.mc8nr.feather.zsync
zsync2 version 2.0.0-alpha-1 (commit 7857ff1), build <local dev build> built on 2018-12-21 12:30:07 UTC
Checking for changes...
Cannot find file hg19-regions-9species.all_regions.mc8nr.feather, triggering full download
/ddn1/vol1/site_scratch/leuven/303/vsc30366/hg19-regions-9species.all_regions.mc8nr.feather.part found, using as seed file
Target file: /ddn1/vol1/site_scratch/leuven/303/vsc30366/hg19-regions-9species.all_regions.mc8nr.feather
Reading seed file: /ddn1/vol1/site_scratch/leuven/303/vsc30366/hg19-regions-9species.all_regions.mc8nr.feather.part
Usable data from seed files: 3.547576%
Renaming temp file
Fetching remaining blocks
Downloading from https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg19/refseq_r45/mc8nr/region_based/hg19-regions-9species.all_regions.mc8nr.feather
-------------------- 3.5%* Hostname was NOT found in DNS cache
* Trying 134.58.50.8...
* Adding handle: conn: 0x16f74f0
* Adding handle: send: 0
* Adding handle: recv: 0
* Curl_addHandleToPipeline: length: 1
* - Conn 3 (0x16f74f0) send_pipe: 1, recv_pipe: 0
* Connected to resources.aertslab.org (134.58.50.8) port 443 (#3)
* successfully set certificate verify locations:
* CAfile: /etc/pki/tls/certs/ca-bundle.crt
CApath: none
* SSL connection using ECDHE-RSA-AES256-GCM-SHA384
* Server certificate:
* subject: CN=resources.aertslab.org
* start date: 2018-11-25 04:49:48 GMT
* expire date: 2019-02-23 04:49:48 GMT
* subjectAltName: resources.aertslab.org matched
* issuer: C=US; O=Let's Encrypt; CN=Let's Encrypt Authority X3
* SSL certificate verify ok.
> GET /cistarget/databases/homo_sapiens/hg19/refseq_r45/mc8nr/region_based/hg19-regions-9species.all_regions.mc8nr.feather HTTP/1.1
Range: bytes=3475369984-3475369983
Host: resources.aertslab.org
Accept: */*
< HTTP/1.1 200 OK
< Date: Fri, 21 Dec 2018 12:36:54 GMT
* Server Apache/2.4.37 (Ubuntu) is not blacklisted
< Server: Apache/2.4.37 (Ubuntu)
< Strict-Transport-Security: max-age=15768000
< Last-Modified: Wed, 23 May 2018 07:38:22 GMT
< ETag: "16cf25e760-56cda9e8f304e"
< Accept-Ranges: bytes
< Content-Length: 97964648288
<
zsync received a data response (code 200) but this is not a partial content response
zsync can only work with servers that support returning partial content from files. The person/entity creating this .zsync has tried to use a server that is not returning partial content. zsync cannot be used with this server.
See http://zsync.moria.orc.uk/server-issues
Other error? -1
-1 returned
-------------------- 3.5% 0.0 kBps aborted
* Closing connection 3
failed to retrieve from hg19-regions-9species.all_regions.mc8nr.feather, status -1
It seems to request an invalid byte range: Range: bytes=3475369984-3475369983
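In this request the first byte position (3475369984) is one greater than the last (3475369983). The HTTP range request specification (RFC 7233) treats a byte range whose last position is less than its first as invalid, and servers commonly ignore such a Range header and send the whole file, which is consistent with the 200 OK response and the Content-Length of 97964648288 in the log. A minimal sketch of a guard that refuses to build such a header, using fixed 64-bit offsets, could look like this. format_range_header is a hypothetical helper, not an existing zsync2 function.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of zsync2): writes "bytes=<first>-<last>"
 * into `out` and returns 0. Returns -1 and leaves `out` untouched when the
 * range is negative or inverted, or when the buffer is too small. */
static int format_range_header(char *out, size_t out_len,
                               int64_t first, int64_t last) {
    char tmp[64]; /* large enough for "bytes=" plus two 64-bit decimals */
    int n;

    assert(out != NULL);
    if (first < 0 || last < first)
        return -1; /* e.g. 3475369984-3475369983: nothing valid to request */

    n = snprintf(tmp, sizeof tmp, "bytes=%" PRId64 "-%" PRId64, first, last);
    if (n < 0 || (size_t)n >= out_len)
        return -1; /* formatting failed or caller's buffer is too small */

    memcpy(out, tmp, (size_t)n + 1); /* copy including the terminating NUL */
    return 0;
}
```

Called with first = 3475369984 and last = 3475369983, the helper returns -1 instead of producing the header seen in the log; called with 0 and 3475369983 it produces "bytes=0-3475369983".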
Tested with these changes:
$ git diff
diff --git a/src/legacy_http.c b/src/legacy_http.c
index 41310da..ccf4f06 100644
--- a/src/legacy_http.c
+++ b/src/legacy_http.c
@@ -53,8 +53,8 @@ struct http_file
} handle;
char *buffer;
- size_t buffer_len;
- size_t buffer_pos;
+ off_t buffer_len;
+ off_t buffer_pos;
int still_running;
};
@@ -391,9 +391,9 @@ static int fill_buffer(HTTP_FILE *file, size_t want, CURLM* multi_handle)
*
* Removes `want` bytes from the front of the buffer.
*/
-static int use_buffer(HTTP_FILE *file, int want)
+static off_t use_buffer(HTTP_FILE *file, off_t want)
{
- if((file->buffer_pos - want) <= 0){
+ if(file->buffer_pos <= want){
/* trash the buffer */
if(file->buffer){
free(file->buffer);
@@ -416,7 +416,7 @@ static int use_buffer(HTTP_FILE *file, int want)
*/
size_t http_fread(void *ptr, size_t size, size_t nmemb, HTTP_FILE *file, struct range_fetch *rf)
{
- size_t want;
+ off_t want;
want = nmemb * size;
fill_buffer(file, want, rf->multi_handle);
@@ -560,14 +560,14 @@ static void buflwr(char *s) {
int range_fetch_read_http_headers(struct range_fetch *rf) {
char buf[512];
int status;
- int seen_location = 0;
+ uint seen_location = 0;
{ /* read status line */
char *p;
if (rfgets(buf, sizeof(buf), rf) == NULL){
/* most likely unexpected EOF from server */
- fprintf(stderr, "EOF from server");
+ fprintf(stderr, "EOF from server\n");
return -1;
}
if (buf[0] == 0)
@@ -622,7 +622,7 @@ int range_fetch_read_http_headers(struct range_fetch *rf) {
p += 2;
buflwr(buf);
{ /* Remove the trailing \r\n from the value */
- int len = strcspn(p, "\r\n");
+ uint len = strcspn(p, "\r\n");
p[len] = 0;
}
/* buf is the header name (lower-cased), p the value */
@@ -631,13 +631,14 @@ int range_fetch_read_http_headers(struct range_fetch *rf) {
if (status == 206 && !strcmp(buf, "content-range")) {
/* Okay, we're getting a non-MIME block from the remote. Get the
* range and set our state appropriately */
- int from, to;
+ off_t from, to;
sscanf(p, "bytes " OFF_T_PF "-" OFF_T_PF "/", &from, &to);
+ fprintf(stderr, "content-range from: " OFF_T_PF " to: " OFF_T_PF "\n", from, to);
if (from <= to) {
rf->block_left = to + 1 - from;
rf->offset = from;
} else {
- fprintf(stderr, "failed to parse content-range header");
+ fprintf(stderr, "failed to parse content-range header\n");
}
/* Can only have got one range. */
@@ -678,7 +679,7 @@ int range_fetch_read_http_headers(struct range_fetch *rf) {
*/
}
- fprintf(stderr, "Error while parsing headers");
+ fprintf(stderr, "Error while parsing headers\n");
return -1;
}
diff --git a/src/zsclient.cpp b/src/zsclient.cpp
index 06a993b..c5fd3f0 100644
--- a/src/zsclient.cpp
+++ b/src/zsclient.cpp
@@ -269,12 +269,14 @@ namespace zsync2 {
// if interested in headers only, download 1 kiB chunks until end of zsync header is found
if (headersOnly) {
- static const auto chunkSize = 1024;
- unsigned long currentChunk = 0;
+issueStatusMessage("headersOnly");
+ static const off_t chunkSize = 1024;
+ off_t currentChunk = 0;
// download a chunk at a time
while (true) {
std::ostringstream bytes;
+issueStatusMessage("headersOnly:" + std::to_string(currentChunk) + " " + std::to_string( chunkSize) + " " + std::to_string(currentChunk + chunkSize - 1) + "\n");
bytes << "bytes=" << currentChunk << "-" << currentChunk + chunkSize - 1;
session.SetHeader(cpr::Header{{"range", bytes.str()}});
It'll be much easier to review if you send a PR right away.
I think he is not sending a PR because, despite his changes, it is not working yet.
Hmm, applying this diff file (manually, thanks a lot git apply for never working) makes it work on the Garuda Linux ISO file I tested this on: https://builds.garudalinux.org/iso/garuda/dr460nized/210324/garuda-dr460nized-linux-zen-210324.iso.zsync
I've noticed while compiling this in cygwin that this is sometimes wrong and uses 32 bit stuff instead: https://github.com/AppImage/zsync2/blob/86cfd3a1d6a27483ec40edd62c1a6bd409cbbe5d/src/format_string.h#L24-L36
Forcing it to use 64 bit stuff fixed any issues I had on the cygwin compiled version.
This patch goes in the right direction, but it actually doesn't solve the issue. See my comments in #59. A fix must use fixed 64-bit types. size_t and off_t are compiler-dependent and typically just 32-bit in size on 32-bit machines.
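One way to follow this advice, sketched here under the assumption that C99's <inttypes.h> is available, is to parse the header into int64_t with the SCNd64 conversion macro, so the width does not depend on how off_t or size_t happen to be defined by the compiler. parse_content_range is a hypothetical name used for illustration. It is not the function in legacy_http.c.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not the zsync2 implementation): parses a header value
 * such as "bytes 0-3475369983/97964648288" into 64-bit offsets.
 * Returns 0 on success, -1 if the value is malformed or the range is inverted.
 * The outputs are only written on success. */
static int parse_content_range(const char *value, int64_t *from, int64_t *to) {
    int64_t f, t;

    assert(value != NULL && from != NULL && to != NULL);

    /* sscanf returns the number of converted fields; both must be present. */
    if (sscanf(value, "bytes %" SCNd64 "-%" SCNd64 "/", &f, &t) != 2)
        return -1;
    if (f < 0 || t < f)
        return -1;

    *from = f;
    *to = t;
    return 0;
}
```

With the header from the first log, parse_content_range("bytes 0-3475369983/97964648288", &from, &to) yields from = 0 and to = 3475369983, and the block length to + 1 - from = 3475369984 matches the Content-Length the server reported.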