safetensors
Fix incorrect serialization given only metadata
What does this PR do?
When saving a safetensors file with some metadata but no tensors, the JSON header is malformed.
from safetensors import safe_open
from safetensors.torch import save_file
save_file({}, 'example.safetensors', {'key': 'value'})
safe_open('example.safetensors', framework='pt')
# Traceback (most recent call last):
# File "<stdin>", line 1, in <module>
# safetensors_rust.SafetensorError: Error while deserializing header: InvalidHeaderDeserialization
The serialized data:
$ xxd example.safetensors
00000000: 2800 0000 0000 0000 7b7d 2c22 5f5f 6d65 (.......{},"__me
00000010: 7461 6461 7461 5f5f 223a 7b22 6b65 7922 tadata__":{"key"
00000020: 3a22 7661 6c75 6522 7d7d 2020 2020 2020 :"value"}}
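To make the dump easier to read, here is a minimal sketch that decodes the same bytes by hand. It assumes the layout visible in the dump: an 8-byte little-endian header length (0x28 = 40) followed by that many bytes of JSON, padded with spaces. The helper name `read_header` is made up for illustration and is not part of safetensors.

```python
import json
import struct


def read_header(raw: bytes) -> str:
    """Return the JSON header text from raw safetensors bytes (hypothetical helper)."""
    # The first 8 bytes hold the header length as a little-endian u64.
    (header_len,) = struct.unpack("<Q", raw[:8])
    return raw[8:8 + header_len].decode("utf-8")


# The bytes from the xxd dump: length prefix, header text, six padding spaces.
malformed = struct.pack("<Q", 40) + b'{},"__metadata__":{"key":"value"}}' + b" " * 6

header = read_header(malformed)

try:
    json.loads(header)
    parsed_ok = True
except json.JSONDecodeError:
    # The object is closed by "{}" before "__metadata__" is written,
    # so everything after it is trailing data and the parse fails.
    parsed_ok = False

print(repr(header), parsed_ok)
```

The header starts with a complete `{}` and then continues with `,"__metadata__":...`, which is why deserialization rejects it.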
The issue comes from calling serde serialize_map with an incorrect number of expected map entries (missing the count for __metadata__). Strange that it serializes correctly if we have both tensors and metadata...
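One possible explanation, sketched as a toy model in Python rather than the real Rust code: assume the JSON serializer writes the closing brace immediately when it is told to expect zero entries, and otherwise ignores the declared length. That behavior is an assumption about the serializer, not something stated in this PR, and `ToyMapSerializer` is a hypothetical class written only for this illustration.

```python
import json


class ToyMapSerializer:
    """Toy model of a streaming JSON map writer (hypothetical, for illustration)."""

    def __init__(self, expected_len: int):
        self.out = "{"
        if expected_len == 0:
            # Assumed shortcut: an "empty" map is closed right away.
            self.out += "}"
            self.state = "empty"
        else:
            self.state = "first"

    def entry(self, key: str, value) -> None:
        # Every entry except the very first one is preceded by a comma.
        if self.state != "first":
            self.out += ","
        self.out += json.dumps(key) + ":" + json.dumps(value, separators=(",", ":"))
        self.state = "rest"

    def end(self) -> str:
        # The closing brace is only skipped if no entry was ever written.
        if self.state != "empty":
            self.out += "}"
        return self.out


def serialize(expected_len: int, entries: dict) -> str:
    ser = ToyMapSerializer(expected_len)
    for key, value in entries.items():
        ser.entry(key, value)
    return ser.end()


# Metadata only: the declared length is 0 (no tensors), but one entry is written.
only_metadata = serialize(0, {"__metadata__": {"key": "value"}})

# Tensors plus metadata: the declared length is 1, which is still wrong, but nonzero.
with_tensor = serialize(1, {"__metadata__": {"key": "value"}, "t": {"dtype": "F32"}})

print(only_metadata)
print(with_tensor)
```

Under this assumption, the metadata-only case reproduces the exact malformed header from the dump, while a wrong but nonzero count still yields valid JSON. That would account for serialization working when both tensors and metadata are present.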
Fixes #466