Enforce the version of Protobuf via the optional dependency
🐛 Describe the bug
TypeError: Descriptors cannot not be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:
1. Downgrade the protobuf package to 3.20.x or lower.
2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).
https://github.com/pytorch/data/blob/d9bbbecf64d0149795dc65ba390b50bc9e176e95/torchdata/datapipes/iter/util/protobuf_template/_tfrecord_example_pb2.py#L39
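As a stopgap, workaround 2 from the error message can be applied from Python itself. This is only a minimal sketch: the helper name `force_pure_python_protobuf` is made up for illustration, and it assumes the variable is set before anything imports `google.protobuf` (for example, before the generated `_tfrecord_example_pb2` module is loaded). As the message warns, pure-Python parsing will be much slower.

```python
import os


def force_pure_python_protobuf() -> None:
    """Select protobuf's pure-Python backend (hypothetical helper).

    Must run before google.protobuf is imported for the first time;
    setting the variable afterwards has no effect on an already loaded backend.
    """
    os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"] = "python"


force_pure_python_protobuf()
# Only import code that loads the generated *_pb2 module after this point.
```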
Versions
Latest version.
IIRC, the last time I checked, TensorFlow still required protobuf < 3.20.
I think you are right, but it might be better to add a version constraint and restrict protobuf to 3.20.
By the way, can I get the number of records in a TFRecord?
It makes sense to add an optional dependency specification to TorchData. https://setuptools.pypa.io/en/latest/userguide/dependency_management.html#optional-dependencies
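A rough sketch of what such a specification might look like. The extra name `tfrecord`, the `protobuf<4` bound, and the `protobuf_is_compatible` helper are all assumptions for illustration, not what TorchData actually ships. The bound follows the "3.20.x or lower" workaround in the error message, since the protobuf Python releases jumped from 3.20.x to 4.21.

```python
# Hypothetical extras table; it could be passed as
# setuptools.setup(..., extras_require=EXTRAS_REQUIRE) so that
# `pip install torchdata[tfrecord]` would pull in a compatible protobuf.
EXTRAS_REQUIRE = {
    "tfrecord": ["protobuf<4"],  # assumed bound: 3.20.x or lower
}


def protobuf_is_compatible(version: str, max_major: int = 3) -> bool:
    """Return True if a protobuf version string falls under the assumed bound.

    Only the major component is inspected, mirroring the `<4` specifier above.
    """
    major = int(version.split(".")[0])
    return major <= max_major
```

A check like `protobuf_is_compatible("3.20.1")` would pass and `protobuf_is_compatible("4.21.0")` would not, which could back a clearer error message than the raw `TypeError` when the extra is not installed through pip.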