FCIS
Error while training with coco dataset: h5py unable to open file
I'm trying to run: python experiments/fcis/fcis_end2end_train_test.py --cfg experiments/fcis/cfgs/resnet_v1_101_coco_fcis_end2end_ohem.yaml
It trains until batch [1000] and then I get the following error:
Epoch[0] Batch [1000] Speed: 2.92 samples/sec Train-RPNAcc=0.873100, RPNLogLoss=0.307023, RPNL1Loss=0.167374, FCISAcc=0.716729, FCISAccFG=0.000708, FCISLogLoss=2.082165, FCISL1Loss=0.089456, FCISMaskLoss=0.632843,
Exception in thread Thread-71:
Traceback (most recent call last):
File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
self.run()
File "/usr/lib/python2.7/threading.py", line 754, in run
self.__target(*self.__args, **self.__kwargs)
File "experiments/fcis/../../fcis/../lib/utils/PrefetchingIter.py", line 60, in prefetch_func
self.next_batch[i] = self.iters[i].next()
File "experiments/fcis/../../fcis/core/loader.py", line 99, in next
self.get_batch_parallel()
File "experiments/fcis/../../fcis/core/loader.py", line 161, in get_batch_parallel
rst = self.parfetch(roidb)
File "experiments/fcis/../../fcis/core/loader.py", line 183, in parfetch
gt_masks = get_gt_masks(roidb[0]['cache_seg_inst'], data['im_info'][0,:2].astype('int'))
File "experiments/fcis/../../fcis/../lib/mask/mask_transform.py", line 25, in get_gt_masks
gt_masks = hkl.load(gt_mask_file)
File "/usr/local/lib/python2.7/dist-packages/hickle.py", line 616, in load
h5f = file_opener(fileobj)
File "/usr/local/lib/python2.7/dist-packages/hickle.py", line 154, in file_opener
h5f = h5.File(filename, mode)
File "/usr/local/lib/python2.7/dist-packages/h5py/_hl/files.py", line 272, in __init__
fid = make_fid(name, mode, userblock_size, fapl, swmr=swmr)
File "/usr/local/lib/python2.7/dist-packages/h5py/_hl/files.py", line 92, in make_fid
fid = h5f.open(name, flags, fapl=fapl)
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/tmp/pip-4rPeHA-build/h5py/_objects.c:2684)
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/tmp/pip-4rPeHA-build/h5py/_objects.c:2642)
File "h5py/h5f.pyx", line 76, in h5py.h5f.open (/tmp/pip-4rPeHA-build/h5py/h5f.c:1930)
IOError: Unable to open file (File signature not found)
Can anyone help me with this?
Maybe you can try printing the path before reading the file, to see whether the file that triggers this error has a problem. The program appears to fail when reading this file, which may be because the file doesn't exist or you don't have permission to access it. Setting a fixed random seed in TrainDataLoader may help you locate that file.
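As a complement to printing the path, the cached files can also be checked offline. The message "File signature not found" is what HDF5 reports when a file does not carry the HDF5 signature, so a minimal sketch is to scan the cache for .hkl files whose first 8 bytes are not that signature. The helper names and the `cache_dir` argument below are hypothetical, and the check assumes the files have no HDF5 user block (signature at offset 0); if h5py is available, `h5py.is_hdf5(path)` is a more general test.

```python
import os

# First 8 bytes of an HDF5 file that has no user block (assumption: hickle
# writes its cache files this way).
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def has_hdf5_signature(path):
    """Return True if the file at `path` starts with the HDF5 signature."""
    try:
        with open(path, "rb") as f:
            return f.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE
    except (IOError, OSError):
        # Missing or unreadable files are reported as bad as well.
        return False


def find_bad_hkl_files(cache_dir):
    """Walk `cache_dir` and return the sorted .hkl paths that fail the check."""
    bad = []
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            if name.endswith(".hkl"):
                path = os.path.join(root, name)
                if not has_hdf5_signature(path):
                    bad.append(path)
    return sorted(bad)


# Example (the directory is a placeholder, not a path taken from the repo):
# for path in find_bad_hkl_files("path/to/cache"):
#     print(path)
```

A file can be larger than 0 bytes and still fail this check, for example if it was only partially written.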
A solution is given in #11, but it did not work for me: all my images and hkl files (stored as cache) had size > 0. My solution was deleting the cache hkl files and launching the training again so they are created again, hopefully without error.
"My solution was deleting the cache hkl files and launching the training again so they are created again hopefully without error". What does this mean?
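Read literally, the quoted fix amounts to removing the cached .hkl mask files so that the next training run has to write them again. A minimal sketch of that step follows; the function name and the `cache_dir` argument are hypothetical, since the thread does not say where the cache is stored, and `dry_run` defaults to True so that nothing is deleted until the listed paths have been reviewed.

```python
import os


def delete_hkl_cache(cache_dir, dry_run=True):
    """Delete every .hkl file under `cache_dir`.

    Returns the sorted list of paths that were removed, or that would be
    removed when `dry_run` is True.
    """
    removed = []
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            if name.endswith(".hkl"):
                path = os.path.join(root, name)
                if not dry_run:
                    os.remove(path)
                removed.append(path)
    return sorted(removed)


# Example (placeholder directory): list first, then delete.
# print(delete_hkl_cache("path/to/cache"))
# delete_hkl_cache("path/to/cache", dry_run=False)
```

After the files are gone, rerunning the training command from the original post should recreate them, as described in the comment above.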