LPN
Some questions about prepare_cvact.py
Hello, I ran into some problems while preparing the CVACT dataset, as follows:
**dataset index: 218255 dataset unexist pair: ['G:/datasets/ANU_data_small/streetview/HkiKa_k0d5RXDxW14D_A1A_grdView.jpg', 'G:/datasets/ANU_data_small/satview_polish/HkiKa_k0d5RXDxW14D_A1A_satView_polish.jpg']**
This kind of message appeared many times. After processing, the 70 GB of raw data produced only 600 MB of output. Can you tell me how to solve this problem?
Hello. For the first problem, you can ignore the message; it is caused by the dataset itself. For the second problem, it is also normal for the raw data to become smaller after processing. I do not know whether the 600 MB belongs to the training set or the val set. Can you tell me how much disk space the training set and the val set each take up? @Allen-lz
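One way to collect the numbers asked for here is a short script that reports the file count and total size of each folder. This is only a minimal sketch: the helper names `folder_stats` and `report` are made up for illustration, and the subfolder names `streetview` and `satview_polish` are taken from the paths in the message above, so adjust them to wherever the processed training and val data actually live.

```python
import os


def folder_stats(folder):
    """Return (file_count, total_bytes) for all files under `folder`."""
    count, total = 0, 0
    for root, _dirs, files in os.walk(folder):
        for name in files:
            count += 1
            total += os.path.getsize(os.path.join(root, name))
    return count, total


def report(dataset_root, subfolders=("streetview", "satview_polish")):
    """Print file count and size for each subfolder (names are assumptions)."""
    for sub in subfolders:
        path = os.path.join(dataset_root, sub)
        if not os.path.isdir(path):
            print(f"{sub}: missing")
            continue
        count, total = folder_stats(path)
        print(f"{sub}: {count} files, {total / 1024 ** 2:.1f} MiB")
```

Calling `report("G:/datasets/ANU_data_small")` would print one line per folder, which is enough to see whether one side of the dataset is much smaller than expected.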
I re-ran the data processing and found that an error occurred while decompressing ANU_data_small.tar.gz. This is why the resulting dataset was too small.
`
tar: Skipping to next header
tar: Archive contains ‘}߅\317\332\363gf\006\017ۮ’ where numeric off_t value expected
tar: Skipping to next header
tar: Archive contains ‘φ\373\304\003\221\324\343\234\347\2478’ where numeric off_t value expected
tar: Archive contains ‘\357\342+K\254\217\335\333\337ë\254’ where numeric off_t value expected
tar: Archive contains ‘\336\317t\265\365?\224\351c\226\v\275’ where numeric off_t value expected
tar: Skipping to next header
tar: Archive contains ‘1\361\226\252u\313\373\237\0\374:\037’ where numeric off_t value expected
tar: Archive contains ‘\251a\247\262E&\373\333Ź\206_’ where numeric off_t value expected
tar: Archive contains ‘\373|\036\321\3760x\202?\020\334\374’ where numeric off_t value expected
tar: Archive contains ‘k\372\274Zwښ\347\354\207H\270’ where numeric off_t value expected
tar: Archive contains ‘\252\370SHK/\v\350\227(\0323’ where numeric off_t value expected
tar: Archive contains ‘Yd\374;\217l\363\374\271S\246^’ where numeric off_t value expected
tar: Archive contains ‘\274]\254XKw\341]:\v\254\225’ where numeric off_t value expected
tar: Archive contains ‘컾\276\233\276\237\257\336\025)\373’ where numeric off_t value expected
tar: Archive contains ‘\245\0336\256\222\373=?\256\306\027}’ where numeric off_t value expected
tar: Archive contains ‘8\253Rxv\240\333m\362\255o\246’ where numeric off_t value expected
tar: Archive contains ‘4\322xgH\267\325n\315ֱ\373’ where numeric off_t value expected
tar: Archive contains ‘?\361+\366\230\360\226\255c\017\2064’ where numeric off_t value expected
tar: Archive contains ‘?\356\366~\226\364ӥ\377\0\377\331’ where numeric off_t value expected
tar: Archive contains ‘V\217\354\262.P\264s0i<\246’ where numeric off_t value expected
tar: Archive contains obsolescent base-64 headers
tar: Archive contains ‘+\2229R\001洭\364\273ɐ’ where numeric off_t value expected
tar: Archive contains ‘\2142\301%\253\334\335ʒ\264q\262’ where numeric off_t value expected
tar: Archive contains ‘\216\313\304\032\234ڥ\335Ƈ\252\244’ where numeric off_t value expected
tar: Archive contains ‘\242\272[\305\354PiX\221R\021\366’ where numeric off_t value expected
tar: Archive contains ‘\226&\021\005\375ۯ\356\036\vĂ’ where numeric off_t value expected
tar: Skipping to next header
gzip: stdin: invalid compressed data--crc error
gzip: stdin: invalid compressed data--length error
tar: A lone zero block at 151070495
tar: Child returned status 1
tar: Error is not recoverable: exiting now
`
@wtyhub
I am sorry, but I do not know how to fix this error directly. I encountered it as well, and at that time I solved it by re-downloading the dataset. You can also try searching for the error on Google. When the dataset is extracted correctly, you will see two folders, one containing 44415 images and one containing 44416 images. @Allen-lz
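Before extracting again, it may save time to check that the downloaded archive is intact, and afterwards to count the images in each folder. The following is a rough sketch using only the Python standard library; the function names `gzip_is_intact` and `count_images` are hypothetical helpers, not part of this repo. Note that it only validates the gzip layer (the CRC and length errors in the log above), not the tar headers inside.

```python
import gzip
import os
import zlib


def gzip_is_intact(path, chunk_size=1 << 20):
    """Decompress the whole file and discard the output.

    Returns False if the gzip stream is truncated or fails its CRC/length check.
    """
    try:
        with gzip.open(path, "rb") as f:
            while f.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        # gzip.BadGzipFile is a subclass of OSError
        return False
    return True


def count_images(folder, extensions=(".jpg", ".jpeg", ".png")):
    """Count files in `folder` (non-recursive) whose name ends with an image extension."""
    return sum(
        1 for name in os.listdir(folder)
        if name.lower().endswith(extensions)
    )
```

If `gzip_is_intact("ANU_data_small.tar.gz")` returns `False`, the download is most likely corrupted and should be fetched again. After a clean extraction, `count_images` on each of the two folders can be compared with the 44415 and 44416 figures mentioned above.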
Oh, thank you very much for your prompt reply. I will try again.