
Dataset

Open jqsun98 opened this issue 4 years ago • 24 comments

Hello, do you have the RGB images and optical flow images extracted from UCF101 and HMDB51? I have downloaded the images processed by Feichtenhofer using Google Colab, but the zip file seems to be broken and I cannot get the images. Could you share your data with me through Google Drive? Thanks a lot!

jqsun98 avatar Apr 11 '20 10:04 jqsun98

Unfortunately, I don't have them on my drive, but I believe something went wrong while downloading them. They are most likely incomplete: Colab storage is now less than 70 GB, while the RGB image dataset is about 33 GB compressed, so you need at least 120 GB to decompress it. I suggest running wget on your mounted Drive space instead; if you have enough free space on your Drive, this may work.

Step 1: Mount your drive.
Step 2: Change directory to your drive.
Step 3: Run wget in this directory, if you have about 80 GB free on your drive.
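(A minimal sketch of these three steps; the Drive subfolder and the download URL are placeholders, not the real dataset links.)

```python
# Colab cell: mount Google Drive and download straight into it
from google.colab import drive
import os

drive.mount('/content/drive')         # Step 1: Drive appears under /content/drive
os.chdir('/content/drive/My Drive')   # Step 2: work inside your Drive space

# Step 3: download onto Drive (placeholder URL)
!wget LINK_TO_ZIP_FILE
```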

Good Luck.

mohammed-elkomy avatar Apr 11 '20 11:04 mohammed-elkomy

The problem comes when I extract them from the zip file. I actually used wget on my Colab to download the three ucf101_jpegs_256.zip.* files separately, then used cat to concatenate them into a single zip file of about 27G. However, "inflating: jpegs_256/v_Archery_g02_c07/frame000056.jpg error: zipfile read error" occurs when I unzip this 27G file.
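(An aside, not from the thread: `unzip -t` checks the archive's CRCs without extracting anything, so a truncated download shows up quickly and cheaply. The `.001`/`.002`/`.003` part suffixes are an assumption based on the `ucf101_jpegs_256.zip.*` wildcard above.)

```python
# Colab cell: rebuild the archive from its parts and test it before a full extraction
!cat ucf101_jpegs_256.zip.001 ucf101_jpegs_256.zip.002 ucf101_jpegs_256.zip.003 > ucf101_jpegs_256.zip
!unzip -t ucf101_jpegs_256.zip | tail -n 5   # the last lines summarize any CRC/read errors
```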

jqsun98 avatar Apr 11 '20 11:04 jqsun98

Yes, you can't really decompress them on Colab; the instance doesn't have enough storage. You need at least 100 GB free, because while decompressing a 1 GB file you actually need at least 2 GB: 1 GB for the zipped file and at least 1 GB for the decompressed form.
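(A quick check, not from the thread: compare the free space where you are extracting against the archive's size, since you need roughly that much again for the extracted files.)

```python
# Colab cell: free space on the current filesystem vs. the archive's size
!df -h .
!ls -lh ucf101_jpegs_256.zip
```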

mohammed-elkomy avatar Apr 11 '20 11:04 mohammed-elkomy

So you mean that I need more than 100 GB of storage on Colab, while my 115 GB Google Drive doesn't work here?

jqsun98 avatar Apr 11 '20 11:04 jqsun98

Your drive may work, but you need to:

Step 1: Mount your drive.

Step 2: Change the working directory to your drive:

```python
import os
os.chdir("/content/drive/My Drive/bla bla")   # "bla bla" is your own subfolder
```

Step 3: Run wget in this directory if you have about 80 GB free on your drive:

```python
# Colab cell: download the three parts, join them, then clean up
!wget LINK_TO_FIRST_ZIP_FILE
!wget LINK_TO_SECOND_ZIP_FILE
!wget LINK_TO_THIRD_ZIP_FILE
!cat FIRST_ZIP SECOND_ZIP THIRD_ZIP > all.zip
!rm FIRST_ZIP SECOND_ZIP THIRD_ZIP
!unzip all.zip
!rm all.zip
```

mohammed-elkomy avatar Apr 11 '20 11:04 mohammed-elkomy

Yep, I have done the above steps, but the error occurs when I run "!unzip all.zip" (my actual command is "!unzip ucf101_jpegs_256.zip") in Step 3.

jqsun98 avatar Apr 11 '20 12:04 jqsun98

Are you sure it's downloaded to your drive, and the Colab storage bar isn't full?

[screenshot of the Colab storage bar]

mohammed-elkomy avatar Apr 11 '20 12:04 mohammed-elkomy

Hmm, I'm confused. I suggest deleting the zipped files and then downloading, concatenating, and decompressing them again.

mohammed-elkomy avatar Apr 11 '20 12:04 mohammed-elkomy

Yeah, I'm sure it's not full. I don't use a GPU when preparing the dataset, and the available disk storage is about 77G.

jqsun98 avatar Apr 11 '20 12:04 jqsun98

Actually, I tried it more than 20 times.

jqsun98 avatar Apr 11 '20 12:04 jqsun98

:( Can you check the size of each downloaded file?
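(Not from the thread: on Colab this is a one-liner, assuming the ucf101_jpegs_256.zip.* names used earlier.)

```python
# Colab cell: human-readable size of each downloaded part
!ls -lh ucf101_jpegs_256.zip.*
```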

mohammed-elkomy avatar Apr 11 '20 12:04 mohammed-elkomy

They are 9.1G, 9.1G, and 8.9G respectively, and the all.zip is about 27.05G.

jqsun98 avatar Apr 11 '20 12:04 jqsun98

And you see errors when decompressing the (cat) concatenated file?

mohammed-elkomy avatar Apr 11 '20 12:04 mohammed-elkomy

Yes. And I also tried "!jar xvf" command but it showed "Input/Output error".

jqsun98 avatar Apr 11 '20 12:04 jqsun98

Hmm, it seems really broken. My last idea is to wait a bit for Colab to sync with Drive; it took a few seconds for me to see files synced, even MB-sized ones. I suggest cleaning your drive, downloading the 3 files, waiting a few minutes (I suggest 30 mins), then running the cat command, waiting again for the concatenated file, and then decompressing while you have a lunch break :D
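(A sketch, not from the thread: if waiting for sync is the concern, the google.colab drive helper can force pending writes to be flushed to Drive before the files are read back.)

```python
# Colab cell: flush buffered writes to Drive, then remount and continue
from google.colab import drive

drive.flush_and_unmount()        # blocks until pending writes have reached Drive
drive.mount('/content/drive')    # remount before reading the files back
```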

I say that because the links themselves are most likely not broken.

I really want to help you :(

mohammed-elkomy avatar Apr 11 '20 13:04 mohammed-elkomy

OK, thank you for your advice.

jqsun98 avatar Apr 11 '20 13:04 jqsun98

I didn't encounter those issues when working on it, since the Colab py3-GPU instance had about 300 GB!

mohammed-elkomy avatar Apr 11 '20 13:04 mohammed-elkomy

What do you mean by 300 GB? Is that your Google Drive storage?

jqsun98 avatar Apr 11 '20 13:04 jqsun98

No, the local storage of the Colab instance itself.

mohammed-elkomy avatar Apr 11 '20 13:04 mohammed-elkomy

How do you get 300 GB? Is it Colab Pro? Also, I've hit a new problem: "Buffered data was truncated after reaching the output size limit."

jqsun98 avatar Apr 11 '20 13:04 jqsun98

> How do you get 300 GB? Is it Colab Pro?

That was the norm before :(

> I've hit a new problem: "Buffered data was truncated after reaching the output size limit."

The Drive API has quota limits; I think you exceeded them :(
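(An aside, not from the thread: that message is also what Colab prints when a single cell's output grows past its buffer limit, and unzip emits one "inflating: ..." line per file. Either way, silencing unzip's per-file output keeps the cell's output small.)

```python
# Colab cell: '-q' suppresses the per-file "inflating: ..." lines during extraction
!unzip -q all.zip
```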

mohammed-elkomy avatar Apr 11 '20 13:04 mohammed-elkomy

I think it's a little difficult for me to finish my task with Colab; I may have to use a Linux machine instead.

jqsun98 avatar Apr 11 '20 13:04 jqsun98

Why not try HMDB51 on local Colab storage?
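(A sketch, not from the thread; the URL is a placeholder and the archive name is an assumption following the UCF101 naming above.)

```python
# Colab cell: download and extract onto the instance's local disk instead of Drive
import os
os.chdir('/content')              # local instance storage, not the mounted Drive

!wget LINK_TO_HMDB51_ZIP_FILE     # placeholder URL
!unzip -q hmdb51_jpegs_256.zip    # assumed archive name
```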

mohammed-elkomy avatar Apr 11 '20 13:04 mohammed-elkomy

The error also occurs.

jqsun98 avatar Apr 11 '20 13:04 jqsun98