
COCO dataset too large to extract

Open rodneytai opened this issue 3 years ago • 1 comments

I intended to train on the COCO dataset, but after I started the command, it showed the following message:

Traceback (most recent call last):
  File "train.py", line 98, in <module>
    train_dataset, train_examples = dataset.load_train_datasets()
  File "C:\Users\lab30\AppData\Roaming\Python\Python38\site-packages\xcenternet\datasets\coco_dataset.py", line 11, in load_train_datasets
    dataset_train, tinfo = self._load_dataset(name="coco/2017", split="train")
  File "C:\Users\lab30\AppData\Roaming\Python\Python38\site-packages\xcenternet\datasets\coco_dataset.py", line 23, in _load_dataset
    dataset, info = tfds.load(
  File "H:\DL\anaconda\envs\tfgpu\lib\site-packages\wrapt\wrappers.py", line 566, in __call__
    return self._self_wrapper(self.__wrapped__, self._self_instance,
  File "C:\Users\lab30\AppData\Roaming\Python\Python38\site-packages\tensorflow_datasets\core\api_utils.py", line 53, in disallow_positional_args_dec
    return fn(*args, **kwargs)
  File "C:\Users\lab30\AppData\Roaming\Python\Python38\site-packages\tensorflow_datasets\core\registered.py", line 339, in load
    dbuilder.download_and_prepare(**download_and_prepare_kwargs)
  File "H:\DL\anaconda\envs\tfgpu\lib\site-packages\wrapt\wrappers.py", line 605, in __call__
    return self._self_wrapper(self.__wrapped__, self._self_instance,
  File "C:\Users\lab30\AppData\Roaming\Python\Python38\site-packages\tensorflow_datasets\core\api_utils.py", line 53, in disallow_positional_args_dec
    return fn(*args, **kwargs)
  File "C:\Users\lab30\AppData\Roaming\Python\Python38\site-packages\tensorflow_datasets\core\dataset_builder.py", line 362, in download_and_prepare
    self._download_and_prepare(
  File "C:\Users\lab30\AppData\Roaming\Python\Python38\site-packages\tensorflow_datasets\core\dataset_builder.py", line 1070, in _download_and_prepare
    super(GeneratorBasedBuilder, self)._download_and_prepare(
  File "C:\Users\lab30\AppData\Roaming\Python\Python38\site-packages\tensorflow_datasets\core\dataset_builder.py", line 932, in _download_and_prepare
    for split_generator in self._split_generators(
  File "C:\Users\lab30\AppData\Roaming\Python\Python38\site-packages\tensorflow_datasets\object_detection\coco.py", line 245, in _split_generators
    extracted_paths = dl_manager.download_and_extract({
  File "C:\Users\lab30\AppData\Roaming\Python\Python38\site-packages\tensorflow_datasets\core\download\download_manager.py", line 419, in download_and_extract
    return _map_promise(self._download_extract, url_or_urls)
  File "C:\Users\lab30\AppData\Roaming\Python\Python38\site-packages\tensorflow_datasets\core\download\download_manager.py", line 462, in _map_promise
    res = utils.map_nested(_wait_on_promise, all_promises)
  File "C:\Users\lab30\AppData\Roaming\Python\Python38\site-packages\tensorflow_datasets\core\utils\py_utils.py", line 145, in map_nested
    return {
  File "C:\Users\lab30\AppData\Roaming\Python\Python38\site-packages\tensorflow_datasets\core\utils\py_utils.py", line 146, in <dictcomp>
    k: map_nested(function, v, dict_only, map_tuple)
  File "C:\Users\lab30\AppData\Roaming\Python\Python38\site-packages\tensorflow_datasets\core\utils\py_utils.py", line 161, in map_nested
    return function(data_struct)
  File "C:\Users\lab30\AppData\Roaming\Python\Python38\site-packages\tensorflow_datasets\core\download\download_manager.py", line 446, in _wait_on_promise
    return p.get()
  File "C:\Users\lab30\AppData\Roaming\Python\Python38\site-packages\promise\promise.py", line 512, in get
    return self._target_settled_value(_raise=True)
  File "C:\Users\lab30\AppData\Roaming\Python\Python38\site-packages\promise\promise.py", line 516, in _target_settled_value
    return self._target()._settled_value(_raise)
  File "C:\Users\lab30\AppData\Roaming\Python\Python38\site-packages\promise\promise.py", line 226, in _settled_value
    reraise(type(raise_val), raise_val, self._traceback)
  File "C:\Users\lab30\AppData\Roaming\Python\Python38\site-packages\six.py", line 703, in reraise
    raise value
  File "C:\Users\lab30\AppData\Roaming\Python\Python38\site-packages\promise\promise.py", line 844, in handle_future_result
    resolve(future.result())
  File "H:\DL\anaconda\envs\tfgpu\lib\concurrent\futures\_base.py", line 432, in result
    return self.__get_result()
  File "H:\DL\anaconda\envs\tfgpu\lib\concurrent\futures\_base.py", line 388, in __get_result
    raise self._exception
  File "H:\DL\anaconda\envs\tfgpu\lib\concurrent\futures\thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "C:\Users\lab30\AppData\Roaming\Python\Python38\site-packages\tensorflow_datasets\core\download\extractor.py", line 97, in _sync_extract
    raise ExtractError(msg)
tensorflow_datasets.core.download.extractor.ExtractError: Error while extracting /data/datasets/mscoco/downloads\images.cocodataset.org_zips_train2017aai7WOpfj5nSSHXyFBbeLp3tMXjpA_H3YD4oO54G2Sk.zip to /data/datasets/mscoco/downloads
\extracted\ZIP.images.cocodataset.org_zips_train2017aai7WOpfj5nSSHXyFBbeLp3tMXjpA_H3YD4oO54G2Sk.zip (file: None) : /data/datasets/mscoco/downloads\images.cocodataset.org_zips_train2017aai7WOpfj5nSSHXyFBbeLp3tMXjpA_H3YD4oO54G2Sk.zip;
 value too large

That is the whole error message and traceback. Is the dataset perhaps too large to extract? I don't know. Any solutions to this?

I also found that train.py still imports XimilarDataset, even though your last commit (about 2 months ago, I suppose?) removed that dependency.

Last question: in the command python train.py --dataset coco --model_type centernet --model_mode simple --log_dir results_coco &> coco.out & , what is log_dir used for, and how should I set it?

Thanks in advance!

rodneytai avatar Mar 17 '21 09:03 rodneytai

Hi and sorry for the late reply!

  1. I've never seen this exception before. However, it isn't specific to our package; it is raised by tensorflow_datasets itself while extracting the downloaded archive.
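Since the failure happens inside the tensorflow_datasets extractor, one thing worth checking is whether the downloaded zip itself is intact. A minimal sketch using only Python's stdlib (the function name and paths are placeholders, not part of xcenternet or tfds):

```python
import zipfile

def check_and_extract(archive: str, target: str) -> None:
    """Verify a zip's CRCs, then extract it; raise if any member is corrupt."""
    with zipfile.ZipFile(archive) as zf:
        bad = zf.testzip()  # name of the first corrupt member, or None
        if bad is not None:
            raise RuntimeError(f"corrupt member in archive: {bad}")
        zf.extractall(target)  # archive passed the CRC check, extract it all
```

If check_and_extract fails on the train2017 zip from the traceback, the download is likely truncated or corrupted and should be re-fetched; if it succeeds, the problem is more likely on the tfds/Windows side (long paths are a common suspect).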

  2. You're right about XimilarDataset. I did some cleanup, and I hope it's fixed now.

  3. log_dir is just the directory where checkpoints and TensorBoard files are saved during training. Maybe the &> confused you: it's bash output redirection. It shouldn't be part of the documented command; it's up to each user to handle the output. I'll remove it.
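For reference, `&>` is bash shorthand for `> file 2>&1` (send stdout to the file and point stderr at the same place), and the trailing `&` just backgrounds the process. A small sketch of the same redirection, using echo as a stand-in for train.py:

```shell
# stand-in for `python train.py ...`: one line to stdout, one to stderr;
# `> coco.out 2>&1` is the portable equivalent of bash's `&> coco.out`
sh -c 'echo "training step"; echo "a warning" 1>&2' > coco.out 2>&1

# both streams ended up in the same file
cat coco.out
```

So in the issue's command, --log_dir results_coco is the only part train.py sees; the &> coco.out & tail is shell syntax and can be dropped or changed freely.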

Hope it helped at least a bit! Libor

liborvaneksw avatar Mar 30 '21 11:03 liborvaneksw