
Incorrect serialization given only metadata

Open tommyip opened this issue 10 months ago • 0 comments

System Info

  • transformers version: 4.39.3
  • Platform: macOS-14.3.1-arm64-arm-64bit
  • Python version: 3.11.8
  • Huggingface_hub version: 0.22.2
  • Safetensors version: 0.4.3-dev.0
  • Accelerate version: not installed
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.2.2 (False)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?: N/A
  • Using distributed or parallel set-up in script?: N/A

Information

  • [ ] The official example scripts
  • [X] My own modified scripts

Reproduction

Serialize a safetensors file that contains metadata but no tensors, then try to open it:

from safetensors import safe_open
from safetensors.torch import save_file

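# Empty tensor dict; the third positional argument is the metadata dict
save_file({}, 'example.safetensors', {'key': 'value'})

# Opening the file that was just written raises the error below
safe_open('example.safetensors', framework='pt')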

# Traceback (most recent call last):
#   File "<stdin>", line 1, in <module>
# safetensors_rust.SafetensorError: Error while deserializing header: InvalidHeaderDeserialization

Expected behavior

safe_open should open the file without raising an error.

Actual file data:

> xxd example.safetensors 
00000000: 2800 0000 0000 0000 7b7d 2c22 5f5f 6d65  (.......{},"__me
00000010: 7461 6461 7461 5f5f 223a 7b22 6b65 7922  tadata__":{"key"
00000020: 3a22 7661 6c75 6522 7d7d 2020 2020 2020  :"value"}}      

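Note the unexpected }, immediately after the opening brace. The 40-byte (0x28) header is written as {},"__metadata__":{"key":"value"}} followed by padding spaces, which is not valid JSON and would explain the InvalidHeaderDeserialization error.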
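To make the dump easier to check, here is a minimal sketch that decodes the bytes from the xxd output by hand, using only the Python standard library. It assumes the usual safetensors layout (an 8-byte little-endian header length followed by that many bytes of JSON). The `expected` string is only my guess at what a metadata-only header should look like, not output from the library.

```python
import json
import struct

# Bytes copied from the xxd dump above (48 bytes in total).
raw = bytes.fromhex(
    "2800000000000000" "7b7d2c225f5f6d65"
    "7461646174615f5f" "223a7b226b657922"
    "3a2276616c756522" "7d7d202020202020"
)

# First 8 bytes: header length as a little-endian unsigned 64-bit integer.
(header_len,) = struct.unpack("<Q", raw[:8])
header = raw[8 : 8 + header_len].decode("utf-8")
print(header_len, repr(header))

# Try to parse the header as JSON, as a reader of the file would have to.
try:
    json.loads(header)
    header_is_valid = True
except json.JSONDecodeError as exc:
    header_is_valid = False
    print("header is not valid JSON:", exc)

# Assumption: a well-formed metadata-only header would look like this.
expected = '{"__metadata__":{"key":"value"}}'
print(json.loads(expected))
```

The stray }, closes the top-level object before "__metadata__" is written, so a JSON parser sees trailing data after an empty object.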

tommyip, Apr 14 '24 16:04