Invalid Tensor Index 'ragged_flat_values' when using Sequence as top-level feature
Short description
I'm trying to add a new dataset. Serialization succeeded, but when loading the data, the error below occurs.
TypeError: Only integers, slices (`:`), ellipsis (`...`), tf.newaxis (`None`) and scalar tf.int32/tf.int64 tensors are valid indices, got 'ragged_flat_values'
Detailed stack trace below; the error happens during deserialization.
Environment information
- Operating System: ubuntu 18.04
- Python version: 3.6
- `tensorflow-datasets`/`tfds-nightly` version: tfds-nightly
- `tensorflow`/`tensorflow-gpu`/`tf-nightly`/`tf-nightly-gpu` version: tested on tensorflow 2.2.0 and 2.3.0
Reproduction instructions
https://github.com/hermannsblum/tf_datasets/blob/3dpw/tensorflow_datasets/human_pose/pose_3dpw.py
Link to logs
Expected behavior
The dataset loads, or an error is raised when the dataset is built.
The cause of the bug turns out to be passing a `Sequence` instead of a `FeaturesDict` as the top-level feature to `DatasetInfo`. Therefore:
- if features should always be encapsulated in a dict, this should be mentioned in the guide.
- otherwise, this is a bug that should be fixed.
Can a maintainer shed light on which of these options is correct?
`Sequence({})` should be allowed as a top-level feature, so this is likely a bug. Thank you for reporting.
`Sequence(Scalar())` works, but `Sequence(Sequence(Scalar(), length=4))` fails.