YCB_Video_toolbox
Dataset too big!
This dataset is too big to download on my system. Is there any way you could provide a smaller version of the dataset for testing purposes?
Thanks.
I was able to download it nonetheless. Thanks.
Somebody uploaded the data to Baidu drive; you can spend 3 dollars on a premium membership and download it in 5 hours.
You can only register with Baidu using an Asian phone number.
No, you do not have to. If you do it on a computer, then yes. What you need to do is download the Baidu app from Google Play and register there; then you do not need to provide a Chinese phone number.
If anyone is looking to download the full YCB-V dataset, we have a hosted version on S3 and can give you access. Just reach out to [email protected] with the subject: YCB-V download
Hi @iandewancker, is the S3 bucket still active?
For anyone still struggling to access the dataset, here's how I managed to make it work with the shared Google Drive link.
- Have (or get) any Google Drive storage subscription so you won't get blocked from file access by the "high file traffic" warning
- Create a symlink (shortcut) to the dataset zipfile in your Drive by clicking the icon in the top-right corner of the shared file page
- Create a Google Colab instance and mount your Google Drive into its file system:
# Mount your Drive so the shared zipfile shows up under /content/gdrive
from google.colab import drive
drive.mount('/content/gdrive')
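Once the mount succeeds, it helps to point a variable at the zip and confirm it is reachable before doing anything else. A minimal sanity check, assuming the shortcut landed in the root of MyDrive and using a hypothetical filename (adjust both to your own Drive); the same zipfile variable is reused by the chunked-unzip cell further down:

import os

# Hypothetical path: adjust to wherever the shared YCB-Video zip shortcut lives in your Drive
zipfile = '/content/gdrive/MyDrive/YCB_Video_Dataset.zip'
assert os.path.exists(zipfile), 'Zip not found, check the shortcut location in Drive'
print(f'Zip size: {os.path.getsize(zipfile) / 1e9:.1f} GB')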
With a console in front of you and the zipfile in the file system, a bit of googling should get you the rest of the way.
In case you want to process the dataset in Colab anyway, here are a few more pitfalls to avoid.
- The dataset zip seems to contain the same data twice: unsorted files in /data_syn and organized ones in /data/{video_num}.
- Don't unzip the dataset into your Google Drive; Drive is extremely slow at accessing many small files. Either store it as a few uncompressed multi-GB zip chunks and retrieve/unzip them into the Colab instance as needed, or do it the proper way with memory-mapped files such as HDF5 (see the sketch after the chunked-unzip snippet below).
- A full extract will run out of RAM; consider unzipping in chunks:
%%capture
import math

# Get the list of all files contained in the zipfile
# (drop the header/footer lines of `unzip -l` and keep only the name column)
all_files = !unzip -l {zipfile}
all_files = [line[30:] for line in all_files[3:][:-2]]

# Extract 100 files per call so a single unzip invocation stays small;
# unzip_path is wherever you want the data inside the Colab instance
CHUNK = 100
n_chunks = math.ceil(len(all_files) / CHUNK)
for i in range(n_chunks):
    chunk = ' '.join(all_files[i * CHUNK:(i + 1) * CHUNK])
    !unzip -n {zipfile} {chunk} -d {unzip_path} 1>/dev/null
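Regarding the HDF5 suggestion above, here is a minimal sketch of the idea (not part of the toolbox). It assumes the extracted frames follow the usual NNNNNN-color.png naming inside data/{video_num} and packs one video's color frames into a single HDF5 file, so Drive and Colab only have to move one large file instead of thousands of small ones:

import glob
import h5py
import numpy as np
from PIL import Image

# Hypothetical example: pack the color frames of video 0000 into one HDF5 file.
# Paths and the frame-naming pattern are assumptions; adjust to your unzip_path and video.
frames = sorted(glob.glob(f'{unzip_path}/data/0000/*-color.png'))
with h5py.File('/content/video_0000_color.h5', 'w') as f:
    for i, path in enumerate(frames):
        img = np.asarray(Image.open(path))
        f.create_dataset(f'{i:06d}', data=img, compression='gzip')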