tfds.load() does not load datasets with a capital letter
Short description
Running tfds build Mk0_datasets_builder.py saves the dataset to ~/tensorflow_datasets/Mk0.
Running tfds.load('Mk0', split='train', shuffle_files=True) to import it then gives the following error:
No registered data_dirs were found in:
- /home/user/tensorflow_datasets
Renaming the file from Mk0 to mk0 allows it to load, however.
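To check whether other builds in the same data directory are affected, a small standard-library sketch like the following could list dataset folders whose names contain uppercase letters. The helper name find_mixed_case_dataset_dirs is hypothetical and not part of TFDS; it only inspects folder names and assumes the default ~/tensorflow_datasets layout described above.

```python
from pathlib import Path


def find_mixed_case_dataset_dirs(data_dir):
    """Return sorted names of immediate subdirectories that are not all lowercase.

    Hypothetical helper (not a TFDS API): it only looks at folder names.
    """
    root = Path(data_dir).expanduser()
    if not root.is_dir():
        # Nothing to report if the data directory does not exist.
        return []
    return sorted(
        entry.name
        for entry in root.iterdir()
        if entry.is_dir() and entry.name != entry.name.lower()
    )


if __name__ == "__main__":
    # Assumed default location, taken from the error message above.
    print(find_mixed_case_dataset_dirs("~/tensorflow_datasets"))
```

Any folder it reports is a candidate for rebuilding or renaming under a lowercase name.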
Environment information
- Operating System: Arch Linux
- Python version: 3.11.5
- tensorflow-datasets version: 4.9.4
- tensorflow version: 2.14.0
- Does the issue still exist with the latest tfds-nightly package (pip install --upgrade tfds-nightly)? Yes
Reproduction instructions
Build a dataset with a capital letter in its name, then attempt to load it with tfds:
tfds.load('Mk0', split='train', shuffle_files=True)
Expected behavior
Either tfds build should automatically lowercase the name, or tfds.load() should be able to handle uppercase letters.
Thanks for reporting this issue!
This is indeed a real problem; we'll need to think about whether supporting uppercase in tfds.load is possible. In the meantime, you should stick with lowercase for your dataset names. Sorry for the inconvenience.
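As a stopgap on the user side, one option is to lowercase the name before it reaches tfds.load. The sketch below is only an illustration: normalize_dataset_name is a hypothetical helper, not a TFDS function, and it only helps if the dataset was also built and saved under the lowercase name.

```python
import warnings


def normalize_dataset_name(name):
    """Return the dataset name in lowercase, warning if it had to be changed.

    Hypothetical helper (not a TFDS API).
    """
    lowered = name.lower()
    if lowered != name:
        warnings.warn(
            f"Dataset name {name!r} contains uppercase letters; "
            f"using {lowered!r} instead."
        )
    return lowered


# Usage, assuming tensorflow-datasets is installed and the dataset
# was built under the lowercase name:
#   import tensorflow_datasets as tfds
#   ds = tfds.load(normalize_dataset_name("Mk0"), split="train", shuffle_files=True)
```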
Awesome. Thank you for looking into this.