httpdirfs
Mounted file incomplete
When I mount the following:
$ httpdirfs -f --single-file-mode --debug --no-range-check https://zenodo.org/records/17531362/files/goblint.zip goblint-svcomp26/
LinkTable_print: --------------------------------------------
LinkTable_print: LinkTable 0x65138d5ac5f0 for https://zenodo.org/records/17531362/files/goblint.zip
LinkTable_print: --------------------------------------------
LinkTable_print: 0 H 0 https://zenodo.org/records/17531362/files/goblint.zip
LinkTable_print: 1 F 22728953 goblint.zip https://zenodo.org/records/17531362/files/goblint.zip
LinkTable_print: --------------------------------------------
LinkTable_print: Invalid link count: 0
LinkTable_print: --------------------------------------------
and copy the mounted file:
cp goblint-svcomp26/goblint.zip goblint.zip
then the resulting file is clearly incomplete: the copy is 17446810 bytes, while httpdirfs itself reports 22728953 bytes for the file in the mount:
$ ll
total 78652
drwxrwxr-x 4 simmo simmo 4096 nov 8 13:16 ./
drwxrwxr-x 17 simmo simmo 307200 nov 8 12:36 ../
drwxr-xr-x 1 simmo simmo 0 jaan 1 1970 goblint-svcomp26/
-r--r--r-- 1 simmo simmo 17446810 nov 8 13:17 goblint.zip
$ ll goblint-svcomp26
total 22200
drwxr-xr-x 1 simmo simmo 0 jaan 1 1970 ./
drwxrwxr-x 4 simmo simmo 4096 nov 8 13:16 ../
-r--r--r-- 1 simmo simmo 22728953 nov 5 12:38 goblint.zip
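The mismatch can also be checked mechanically. Below is a minimal Python sketch; `missing_bytes` is a hypothetical helper name, not part of httpdirfs, and it only compares the sizes that `stat` reports for the two paths.

```python
import os


def missing_bytes(mounted_path: str, copy_path: str) -> int:
    """Return how many bytes the copy is short of the size reported for the mounted file."""
    expected = os.stat(mounted_path).st_size  # size httpdirfs reports in the mount
    actual = os.stat(copy_path).st_size       # size of the file cp produced
    return expected - actual


# Example call with the paths from this report:
#   missing_bytes("goblint-svcomp26/goblint.zip", "goblint.zip")
```

With the sizes in the listings above, this would come to 22728953 - 17446810 = 5282143 missing bytes.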
Note that the running httpdirfs process printed:
fs_open: /goblint.zip
Warning:src/network.c:145:curl_process_msgs: HTTP 429, sleeping for 5 sec
Error:src/link.c:1009:Link_download: req_size: 131072, recv: 14234
Error:src/link.c:1009:Link_download: req_size: 131072, recv: 14234
Error:src/link.c:1009:Link_download: req_size: 131072, recv: 14234
Error:src/link.c:1009:Link_download: req_size: 131072, recv: 14234
fs_release: /goblint.zip
It looks like when the server responds with HTTP 429 (Too Many Requests), httpdirfs receives less data than it requested (14234 of 131072 bytes in the log above), and the file read through the mount ends up incomplete. httpdirfs logs these errors, but they aren't propagated to cp, which appears to succeed without reporting anything.
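For illustration, the behaviour I would expect can be sketched like this: a response that is shorter than requested and does not reach the end of the file is retried, and if it stays short the reader gets an I/O error instead of truncated data. This is a Python sketch of the idea only, not httpdirfs code; `read_exact`, `fetch_range`, and the retry count are hypothetical names and values.

```python
import errno
from typing import Callable


def read_exact(fetch_range: Callable[[int, int], bytes],
               offset: int, size: int, file_size: int,
               max_attempts: int = 3) -> bytes:
    """Read `size` bytes at `offset`, or fail loudly instead of returning a short result.

    `fetch_range(offset, size)` is a hypothetical callable standing in for one
    HTTP range request; it may return fewer bytes than requested.
    """
    # Never expect bytes past the end of the file.
    expected = max(0, min(size, file_size - offset))
    if expected == 0:
        return b""
    received = 0
    for _ in range(max_attempts):
        data = fetch_range(offset, expected)
        received = len(data)
        if received == expected:
            return data
    # Still short after all attempts: surface an error the caller (e.g. cp) can see.
    raise OSError(errno.EIO,
                  f"short read: requested {expected}, received {received}")
```

With something along these lines, a request for 131072 bytes that only ever yields 14234 would end in `EIO` for the reading program rather than a silently shortened file.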
This affects not only copying the entire file but also reading any part of it:
$ unzip -l goblint-svcomp26/goblint.zip
Archive: goblint-svcomp26/goblint.zip
End-of-central-directory signature not found. Either this file is not
a zipfile, or it constitutes one disk of a multi-part archive. In the
latter case the central directory and zipfile comment will be found on
the last disk(s) of this archive.
This error occurs because the mounted file appears incomplete. If I download the entire file with wget instead, it unzips without issues.
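The unzip message refers to the end-of-central-directory record, which sits at the very end of a ZIP file and starts with the signature bytes `PK\x05\x06`. Below is a rough Python sketch that checks for it; `has_eocd` is a hypothetical helper name, and it only looks for the signature in the tail of the file rather than validating the archive.

```python
import os

EOCD_SIGNATURE = b"PK\x05\x06"   # end-of-central-directory signature
MAX_EOCD_SIZE = 22 + 0xFFFF      # fixed 22-byte record plus the longest possible comment


def has_eocd(path: str) -> bool:
    """Return True if the tail of the file contains the ZIP end-of-central-directory signature."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        # The record can only start within the last MAX_EOCD_SIZE bytes.
        f.seek(max(0, size - MAX_EOCD_SIZE))
        tail = f.read()
    return EOCD_SIGNATURE in tail
```

I would expect this to return `True` for the wget download and `False` for a copy that was cut off before the end of the archive.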