mocking.mock_data broken by https://github.com/tensorflow/datasets/commit/e8d8966e888667091875fd885ab0803a6dc5a383
Short description: https://github.com/tensorflow/datasets/commit/e8d8966e888667091875fd885ab0803a6dc5a383 broke mock_data().
Environment information
- Operating System: Linux
- Python version: 3.7.7
- tensorflow-datasets/tfds-nightly version: 4.5.2.dev202203220044
- Does the issue still exist with the latest tfds-nightly package (pip install --upgrade tfds-nightly)?
Reproduction instructions
with mock_data(num_examples=40):
  builder = tfds.builder("imagenet2012", "...")
  ...
Link to logs
input_tfds.py:74: in _build_dataset
    builder = tfds.builder(cfg.dataset_name, data_dir=cfg.data_dir)
/miniconda/envs/py377/lib/python3.7/site-packages/tensorflow_datasets/core/load.py:149: in builder
    community.community_register.has_namespace(name.namespace)):
/miniconda/envs/py377/lib/python3.7/site-packages/tensorflow_datasets/core/community/registry.py:124: in has_namespace
    return namespace in self.registers_per_namespace
/miniconda/envs/py377/lib/python3.7/site-packages/tensorflow_datasets/core/community/registry.py:121: in registers_per_namespace
    return self.namespace_config.registers_per_namespace()
/miniconda/envs/py377/lib/python3.7/site-packages/tensorflow_datasets/core/community/registry.py:93: in registers_per_namespace
    config = toml.loads(self.config_path.read_text())
/miniconda/envs/py377/lib/python3.7/site-packages/etils/epath/abstract_path.py:141: in read_text
    return f.read()
/miniconda/envs/py377/lib/python3.7/site-packages/tensorflow/python/lib/io/file_io.py:114: in read
    self._preread_check()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <tensorflow.python.platform.gfile.GFile object at 0x7f4e18542450>
    def _preread_check(self):
      if not self._read_buf:
        if not self._read_check_passed:
          raise errors.PermissionDeniedError(None, None,
                                             "File isn't open for reading")
        self._read_buf = _pywrap_file_io.BufferedInputStream(
>           compat.path_to_str(self.__name), 1024 * 512)
E       tensorflow.python.framework.errors_impl.NotFoundError: /miniconda/envs/py377/lib/python3.7/site-packages/tensorflow_datasets/community-datasets.toml; No such file or directory
/miniconda/envs/py377/lib/python3.7/site-packages/tensorflow/python/lib/io/file_io.py:77: NotFoundError
Expected behavior: tfds should not try to read community-datasets.toml.
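For illustration only, a loader that tolerates a missing config file could look like the sketch below. This is not TFDS code: `read_optional_config` is a hypothetical helper, it uses `pathlib` rather than the `epath`/`GFile` code path shown in the traceback, and the parser is supplied by the caller (for example `toml.loads`, as in the traceback).

```python
import pathlib
from typing import Any, Callable, Dict


def read_optional_config(
    path: str,
    parse: Callable[[str], Any],
) -> Any:
  """Parses the config at `path`, or returns {} if the file does not exist.

  Hypothetical helper: a missing file is treated as "no namespaces registered"
  instead of raising.
  """
  try:
    text = pathlib.Path(path).read_text(encoding="utf-8")
  except FileNotFoundError:
    # Assumption: an absent config simply means there is nothing to register.
    empty: Dict[str, Any] = {}
    return empty
  return parse(text)
```

With a loader like this, a missing community-datasets.toml would yield an empty mapping rather than a NotFoundError.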
I am seeing similar issues with the load and list_builders methods when running on Colab with the tfds-nightly package. The output of tfds.list_builders() is below:
```
<ipython-input-5-89a978348cb8> in <module>()
----> 1 tfds.list_builders()

6 frames
/usr/local/lib/python3.7/dist-packages/tensorflow_datasets/core/load.py in list_builders(with_community_datasets)
     64   if with_community_datasets:
     65     if visibility.DatasetType.COMMUNITY_PUBLIC.is_available():
---> 66       datasets += community.community_register.list_builders()
     67   return datasets
     68

/usr/local/lib/python3.7/dist-packages/tensorflow_datasets/core/community/registry.py in list_builders(self)
    129   def list_builders(self) -> List[str]:
    130     builders = []
--> 131     for registers in self.registers_per_namespace.values():
    132       for register in registers:
    133         builders.extend(register.list_builders())

/usr/local/lib/python3.7/dist-packages/tensorflow_datasets/core/community/registry.py in registers_per_namespace(self)
    119   def registers_per_namespace(
    120       self) -> Mapping[str, List[register_base.BaseRegister]]:
--> 121     return self.namespace_config.registers_per_namespace()
    122
    123   def has_namespace(self, namespace: str) -> bool:

/usr/local/lib/python3.7/dist-packages/tensorflow_datasets/core/community/registry.py in registers_per_namespace(self)
     91       RuntimeError: when the config contains errors.
     92     """
---> 93     config = toml.loads(self.config_path.read_text())
     94     registers_per_namespace = {}
     95     for namespace, path_or_paths in config['Namespaces'].items():

/usr/local/lib/python3.7/dist-packages/etils/epath/abstract_path.py in read_text(self, encoding)
    139     """Reads contents of self as bytes."""
    140     with self.open('r', encoding=encoding) as f:
--> 141       return f.read()
    142
    143   # ====== Write methods ======

/usr/local/lib/python3.7/dist-packages/tensorflow/python/lib/io/file_io.py in read(self, n)
    112       string if in string (regular) mode.
    113     """
--> 114     self._preread_check()
    115     if n == -1:
    116       length = self.size() - self.tell()

/usr/local/lib/python3.7/dist-packages/tensorflow/python/lib/io/file_io.py in _preread_check(self)
     75                                            "File isn't open for reading")
     76       self._read_buf = _pywrap_file_io.BufferedInputStream(
---> 77           compat.path_to_str(self.__name), 1024 * 512)
     78
     79   def _prewrite_check(self):

NotFoundError: /usr/local/lib/python3.7/dist-packages/tensorflow_datasets/community-datasets.toml; No such file or directory
```
Hi! Thanks for reporting this. This should now be fixed (that file wasn't included in the package, so we added it). Could you retry? Kind regards, Tom
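To check whether the file is present after upgrading, something like the following could be run. It is a minimal sketch using only the standard library: `package_has_file` is a hypothetical helper, not part of TFDS, and it assumes the file sits at the top level of the installed package, as the path in the error message suggests.

```python
import importlib.util
import pathlib


def package_has_file(package_name: str, filename: str) -> bool:
  """Returns True if `filename` exists at the top level of an installed package."""
  try:
    # Locates the package on disk without importing it.
    spec = importlib.util.find_spec(package_name)
  except (ImportError, ValueError):
    return False
  if spec is None or not spec.submodule_search_locations:
    return False
  return any(
      (pathlib.Path(location) / filename).is_file()
      for location in spec.submodule_search_locations
  )


if __name__ == "__main__":
  # Prints True once the installed package ships the config file.
  print(package_has_file("tensorflow_datasets", "community-datasets.toml"))
```

If this prints False after `pip install --upgrade tfds-nightly`, the installed build probably predates the fix.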