VQA_ReGAT

Interrupted feature/model downloads don't continue from the last checkpoint

Open ifmaq1 opened this issue 3 years ago • 5 comments

I am using Ubuntu 16.04 on Tesla servers and am trying to download the models and features using the links given in the download.sh file. However, after some time the server's connection keeps dropping, and when I try to resume the remaining file from the last checkpoint, it doesn't work and starts from scratch.

wget -c https://convaisharables.blob.core.windows.net/vqa-regat/pretrained_models.zip

Could you tell me why it isn't working? I have used the -c flag with other links in the past and it worked well. Those downloads create a wget-log that records the last checkpoint of the file being downloaded.

The disconnection error:

pretrained_models.zip 94%[=================> ] 2.63G 243KB/s in 2h 4m
2021-04-14 01:09:30 (368 KB/s) - Read error at byte 2822012928/2970938152 (Connection reset by peer). Retrying.
--2021-04-14 01:09:33-- (try: 4) https://convaisharables.blob.core.windows.net/vqa-regat/pretrained_models.zip
Connecting to convaisharables.blob.core.windows.net (convaisharables.blob.core.windows.net)|13.77.184.64|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2970938152 (2.8G) [application/x-zip-compressed]
Saving to: ‘pretrained_models.zip’
pretrained_models.zip 0%[ ] 13.16M 449KB/s eta 3h 11m ^

ifmaq1 Apr 13 '21 20:04
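
In the log above, the retried request was answered with 200 OK and the full Length, not 206 Partial Content. Resuming with wget -c depends on the server answering a ranged request with 206, so a 200 means the whole file is being sent again. The sketch below is one way to check how a URL responds to a ranged request. The function names resume_action and probe_range_support are hypothetical helpers for illustration and are not part of VQA_ReGAT; the sketch assumes curl is installed.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not part of VQA_ReGAT.

# Map the HTTP status returned for a ranged request to what a resume
# attempt can expect.
resume_action() {
  case "$1" in
    206) echo "append" ;;               # Partial Content: only the requested bytes were sent
    200) echo "restart" ;;              # OK: the range was ignored, the full file is sent
    416) echo "complete-or-invalid" ;;  # Range Not Satisfiable: offset at or past the end
    *)   echo "error" ;;
  esac
}

# Request only the first byte of the URL and print the final HTTP status code.
# This function performs a network request when called.
probe_range_support() {
  curl --silent --location --output /dev/null \
       --range 0-0 --write-out '%{http_code}' "$1"
}
```

For example, resume_action "$(probe_range_support "https://convaisharables.blob.core.windows.net/vqa-regat/pretrained_models.zip")" prints append when the server honours the range. What this particular server returns is not something the thread establishes.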

I just re-tested the connection and could not reproduce the error. I suspect it is due to an unstable internet connection. image

There is not much we can do from our side to debug this problem. Could you try again? We have faced similar issues when downloading data for other projects, but succeeded after multiple tries.

linjieli222 Apr 15 '21 17:04
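
The "multiple tries" suggested above can be automated with a small retry loop that resumes the partial file on each attempt. This is a minimal sketch, assuming curl is available and that the server honours range requests. The function name fetch_with_resume and the RETRY_DELAY variable are hypothetical. The option --continue-at - asks curl to work out the resume offset from the existing output file, and --fail makes HTTP errors count as failed attempts.

```shell
#!/usr/bin/env bash
# Sketch only: fetch_with_resume and RETRY_DELAY are hypothetical names.

# Usage: fetch_with_resume URL OUTPUT_FILE [MAX_ATTEMPTS]
# Retries the download, resuming from the bytes already in OUTPUT_FILE.
fetch_with_resume() {
  local url=$1 out=$2 max=${3:-20}
  local delay=${RETRY_DELAY:-5}   # seconds to wait between attempts
  local attempt=1
  while [ "$attempt" -le "$max" ]; do
    # Stop as soon as one attempt finishes successfully.
    curl --fail --location --continue-at - --output "$out" "$url" && return 0
    echo "attempt $attempt of $max failed, retrying in ${delay}s" >&2
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  return 1   # every attempt failed
}
```

A call such as fetch_with_resume "https://convaisharables.blob.core.windows.net/vqa-regat/pretrained_models.zip" pretrained_models.zip would keep retrying until the file completes or the attempt limit is reached. If the server answers resumed requests with a full 200 response, this loop cannot help and a different tool is needed.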

@linjieli222 thank you for your great code. I have also encountered the same issues, especially when downloading the big files (adaptive and fixed features). Could you please share your feature post-processing code for generating the feature files (train.hdf5, val.hdf5, ...) from bottom-up-attention features? Thank you!

dami23 May 11 '21 07:05

@linjieli222 I failed to download the big files (adaptive and fixed features) even with multiple attempts. The download speed is very low (~500 KB/sec average). There is no issue with my internet. The server is slow. Can you please suggest a solution to fix this?

nikhilbchilwant Jun 05 '21 14:06

@nikhilbchilwant @ifmaq1 @dami23 if you are still encountering issues with downloading, I suggest you try downloading with azcopy.

An example command to download the pretrained_models.zip is:

<path to azcopy> cp "https://convaisharables.blob.core.windows.net/vqa-regat/pretrained_models.zip" <dest_folder>

After running the command, you should be able to see something like this:

image

linjieli222 Jun 07 '21 22:06

Quoting @dami23: "@linjieli222 thank you for your great code. I have also encountered the same issues, especially when downloading the big files (adaptive and fixed features). Could you please share your feature post-processing code for generating the feature files (train.hdf5, val.hdf5, ...) from bottom-up-attention features? Thank you!"

This helped me to generate feature files. Also create your own dictionary.

nikhilbchilwant Jun 12 '21 10:06