
Invalid Tensor Index 'ragged_flat_values' when using Sequence as top-level feature

Open hermannsblum opened this issue 4 years ago • 3 comments

Short description

I'm trying to add a new dataset. Serialization succeeded, but when loading the data, the error below occurs.

TypeError: Only integers, slices (`:`), ellipsis (`...`), tf.newaxis (`None`) and scalar tf.int32/tf.int64 tensors are valid indices, got 'ragged_flat_values'  

A detailed stack trace is linked below; the error happens during deserialization.

Environment information

  • Operating System: ubuntu 18.04
  • Python version: 3.6
  • tensorflow-datasets/tfds-nightly version: tfds-nightly
  • tensorflow/tensorflow-gpu/tf-nightly/tf-nightly-gpu version: tested on tensorflow 2.2.0 and 2.3.0

Reproduction instructions

https://github.com/hermannsblum/tf_datasets/blob/3dpw/tensorflow_datasets/human_pose/pose_3dpw.py

Link to logs

stack trace

Expected behavior

The dataset should load, or an error should be raised when the dataset is built.


hermannsblum avatar Jul 29 '20 14:07 hermannsblum

The cause of the bug turns out to be passing a Sequence instead of a FeaturesDict as the top-level feature to DatasetInfo. Therefore:

  • if features should always be encapsulated in a dict, this should be mentioned in the guide.
  • otherwise, this is a bug that should be fixed.

Can a maintainer shed light on which of these options is correct?
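
For anyone hitting the same error in the meantime, here is a minimal sketch of the workaround this diagnosis suggests: keep the Sequence, but nest it under a single key of a top-level FeaturesDict, and nest every generated example the same way. The key name `frames`, the `pose` field with its shape, and the helpers `build_features` and `wrap_examples` are all hypothetical and are not taken from the linked dataset code.

```python
def build_features(key="frames"):
    """Return a top-level FeaturesDict that wraps the Sequence.

    The key and the 'pose' field (name and shape) are illustrative assumptions.
    """
    # Imported lazily so wrap_examples below stays usable without TensorFlow.
    import tensorflow as tf
    import tensorflow_datasets as tfds

    return tfds.features.FeaturesDict({
        key: tfds.features.Sequence({
            "pose": tfds.features.Tensor(shape=(72,), dtype=tf.float32),
        }),
    })


def wrap_examples(examples, key="frames"):
    """Nest each (example_id, sequence) pair under `key`.

    `examples` is any iterable of pairs as they would be yielded for a
    top-level Sequence; the output matches the spec from build_features.
    """
    for example_id, sequence in examples:
        yield example_id, {key: sequence}
```

The result of `build_features()` would then be passed as `features=` to `tfds.core.DatasetInfo`, and the example generator would yield from `wrap_examples(...)`. This only sidesteps the problem; it does not answer whether a top-level Sequence is meant to be supported.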

hermannsblum avatar Aug 03 '20 16:08 hermannsblum

Sequence({}) should be allowed as the top-level feature, so this is likely a bug. Thank you for reporting.

Conchylicultor avatar Aug 03 '20 16:08 Conchylicultor

Sequence(Scalar()) works, but Sequence(Sequence(Scalar(), length=4)) fails.

kyamagu avatar Feb 17 '23 15:02 kyamagu