
Failed to load `.safetensors` as state dict with error from `torch.frombuffer` in `safetensors.torch.load`

RunDevelopment opened this issue on Feb 16, 2024 · 9 comments

System Info

OS: Windows 10, 64-bit
Python: 3.9.13
safetensors: 0.4.2

Information

  • [ ] The official example scripts
  • [X] My own modified scripts

Reproduction

Loading the attached `failed.safetensors` file directly with `safetensors.torch.load_file` works, but reading the file into a `bytes` object first and then loading it with `safetensors.torch.load` fails.

I get the following error:

Traceback (most recent call last):
  File "C:\Users\micha\Git\spandrel\test.py", line 10, in <module>
    state_dict = safetensors.torch.load(b)
  File "C:\Users\micha\AppData\Local\Programs\Python\Python39\lib\site-packages\safetensors\torch.py", line 338, in load
    return _view2torch(flat)
  File "C:\Users\micha\AppData\Local\Programs\Python\Python39\lib\site-packages\safetensors\torch.py", line 386, in _view2torch
    arr = torch.frombuffer(v["data"], dtype=dtype).reshape(v["shape"])
ValueError: both buffer length (0) and count (-1) must not be 0

Same error with pytest formatting:

    def _view2torch(safeview) -> Dict[str, torch.Tensor]:
        result = {}
        for k, v in safeview:
            dtype = _getdtype(v["dtype"])
>           arr = torch.frombuffer(v["data"], dtype=dtype).reshape(v["shape"])
E           ValueError: both buffer length (0) and count (-1) must not be 0

Steps to reproduce:

  1. Download `failed.safetensors`.
  2. Read `failed.safetensors` into a `bytes` object.
  3. Call `safetensors.torch.load` on it.

I used the following script to get the above error:

import safetensors.torch

file_path = "./failed.safetensors"
with open(file_path, "rb") as f:
    b = f.read()
state_dict = safetensors.torch.load(b)
print(state_dict.keys())
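
For comparison, a minimal sketch of the direct path on the same file, which per the report loads successfully:

```python
import safetensors.torch

# load_file takes a different code path than load(bytes),
# so it succeeds on the same file.
state_dict = safetensors.torch.load_file("./failed.safetensors")
print(state_dict.keys())
```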

Expected behavior

safetensors.torch.load_file and safetensors.torch.load should produce the same result and load the state dict correctly.
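
A sketch reproducing this without the attached file, under the assumption (suggested by the zero-length buffer in the traceback) that the failing entry is a zero-element tensor:

```python
import torch
import safetensors.torch

# Assumption: a tensor with zero elements (e.g. shape (2, 0)) serializes to a
# zero-length data buffer, which torch.frombuffer rejects when load() is used.
safetensors.torch.save_file({"empty": torch.zeros((2, 0))}, "repro.safetensors")

with open("repro.safetensors", "rb") as f:
    b = f.read()

print(safetensors.torch.load_file("repro.safetensors").keys())  # works
safetensors.torch.load(b)  # ValueError on safetensors 0.4.2
```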

RunDevelopment · Feb 16, 2024

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions[bot] · Mar 18, 2024

Still a problem.

RunDevelopment · Mar 18, 2024

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions[bot] · Apr 18, 2024

Still a problem.

RunDevelopment · Apr 18, 2024

Ran into this issue as well. It seems that adding a check before the `torch.frombuffer` call (https://github.com/huggingface/safetensors/blob/079781fd0dc455ba0fe851e2b4507c33d0c0d407/bindings/python/py_src/safetensors/torch.py#L389) would work around it: if the buffer has length 0, create an empty tensor instead of calling `frombuffer`.

Would the maintainers like a pull request to that effect?

EDIT:

Here's my patched version of that function (works for me, but not fully tested):

import sys

import torch
import safetensors.torch


def _view2torch(safeview) -> dict[str, torch.Tensor]:
    result = {}
    for k, v in safeview:
        dtype = safetensors.torch._getdtype(v["dtype"])
        if len(v["data"]) == 0:
            # A zero-length buffer means the tensor has zero elements, i.e.
            # at least one dimension is 0 (e.g. shape (2, 0)).
            assert any(x == 0 for x in v["shape"])
            arr = torch.empty(v["shape"], dtype=dtype)
        else:
            arr = torch.frombuffer(v["data"], dtype=dtype).reshape(v["shape"])
        if sys.byteorder == "big":
            arr = torch.from_numpy(arr.numpy().byteswap(inplace=False))
        result[k] = arr

    return result
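
One possible way to use this without forking the package is to monkey-patch the private helper. A sketch, assuming the patched function above is in scope and `b` holds the file bytes as in the reproduction script (note that `_view2torch` is internal API and may change between safetensors versions):

```python
import safetensors.torch

# Swap in the patched version defined above. load() resolves the helper
# through the module's globals, so reassigning the attribute is enough.
safetensors.torch._view2torch = _view2torch

state_dict = safetensors.torch.load(b)  # no longer raises on zero-element tensors
```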

fpgaminer · May 3, 2024

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions[bot] · Jun 3, 2024

Bump

fpgaminer · Jun 3, 2024

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions[bot] · Jul 4, 2024

Bump

fpgaminer · Jul 4, 2024

Hi @fpgaminer,

Thanks for the workaround. I think this bug should be filed upstream in PyTorch: since `torch.zeros((2, 0))` is valid, there is no reason for `torch.frombuffer` not to accept zero-length buffers.

That being said, we can work around it here in the meantime.
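
For reference, a minimal illustration of the asymmetry (the error message matches the traceback above):

```python
import torch

torch.zeros((2, 0))  # OK: constructing a zero-element tensor is valid

# ...but constructing one from an empty buffer is rejected:
torch.frombuffer(b"", dtype=torch.float32)
# ValueError: both buffer length (0) and count (-1) must not be 0
```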

Narsil · Jul 31, 2024