Incorrect element values when loading a bfloat16 tensor on big-endian
System Info
>>> safetensors.__version__
'0.4.2'
Information
- [ ] The official example scripts
- [X] My own modified scripts
Reproduction
When I ran Hugging Face transformer models in bfloat16 on s390x, I got incorrect results. I found that the weight values differ between x86 and s390x. The following is a small reproduction.
Execute the following program on x86:
import torch
from safetensors import safe_open
from safetensors.torch import save_file

tensors = {
    "weight1": torch.ones((8, 8), dtype=torch.bfloat16),
}
save_file(tensors, "bf16.safetensors")

read_tensors = {}
with safe_open("bf16.safetensors", framework="pt", device="cpu") as f:
    for key in f.keys():
        read_tensors[key] = f.get_tensor(key)
print(read_tensors)
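For context, the file written above can be inspected directly. This is a small sketch assuming the documented safetensors layout (an 8-byte little-endian header length, a JSON header, then the raw tensor bytes stored little-endian); it is not part of the original report:

import json
import struct

with open("bf16.safetensors", "rb") as f:
    # 8-byte little-endian length of the JSON header, then the header itself.
    header_len = struct.unpack("<Q", f.read(8))[0]
    header = json.loads(f.read(header_len))
    print(header["weight1"])  # expected: dtype "BF16", shape [8, 8]
    # First element of the payload: bfloat16 1.0 is 0x3F80, stored little-endian as bytes 80 3F.
    print(f.read(2).hex())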
Copy bf16.safetensors to an s390x machine. Then, execute the following program:
import torch
from safetensors import safe_open

read_tensors = {}
with safe_open("bf16.safetensors", framework="pt", device="cpu") as f:
    for key in f.keys():
        read_tensors[key] = f.get_tensor(key)
print(read_tensors)
The result is as follows:
{'weight1': tensor([[7.6294e-06, 7.6294e-06, 7.6294e-06, 7.6294e-06, 7.6294e-06, 7.6294e-06,
7.6294e-06, 7.6294e-06],
[7.6294e-06, 7.6294e-06, 7.6294e-06, 7.6294e-06, 7.6294e-06, 7.6294e-06,
7.6294e-06, 7.6294e-06],
[7.6294e-06, 7.6294e-06, 7.6294e-06, 7.6294e-06, 7.6294e-06, 7.6294e-06,
7.6294e-06, 7.6294e-06],
[7.6294e-06, 7.6294e-06, 7.6294e-06, 7.6294e-06, 7.6294e-06, 7.6294e-06,
7.6294e-06, 7.6294e-06],
[7.6294e-06, 7.6294e-06, 7.6294e-06, 7.6294e-06, 7.6294e-06, 7.6294e-06,
7.6294e-06, 7.6294e-06],
[7.6294e-06, 7.6294e-06, 7.6294e-06, 7.6294e-06, 7.6294e-06, 7.6294e-06,
7.6294e-06, 7.6294e-06],
[7.6294e-06, 7.6294e-06, 7.6294e-06, 7.6294e-06, 7.6294e-06, 7.6294e-06,
7.6294e-06, 7.6294e-06],
[7.6294e-06, 7.6294e-06, 7.6294e-06, 7.6294e-06, 7.6294e-06, 7.6294e-06,
7.6294e-06, 7.6294e-06]], dtype=torch.bfloat16)}
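As a diagnostic (not part of the original report), the loaded bfloat16 payload can be reinterpreted as raw 16-bit integers to compare bit patterns rather than decoded floats; read_tensors here is the dictionary produced by the script above:

import torch

# bfloat16 1.0 is 0x3f80; any other pattern means the bytes were mangled on load.
bits = read_tensors["weight1"].view(torch.int16).to(torch.int32) & 0xFFFF
print([hex(b) for b in bits.flatten()[:4].tolist()])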
Expected behavior
The result on s390x should be as follows:
{'weight1': tensor([[1., 1., 1., 1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1., 1., 1., 1.]], dtype=torch.bfloat16)}
My colleague is curious whether this code works correctly.
Since PyTorch 2.1, a byteswap() function has been implemented. Could this function help with big-endian support?
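For illustration, here is a minimal sketch of what byteswapping a loaded bfloat16 tensor could look like. The helper name byteswap_bf16 is hypothetical, and this assumes the payload only needs a plain 16-bit byte swap, which may not match what the loader actually does internally:

import torch

def byteswap_bf16(t: torch.Tensor) -> torch.Tensor:
    """Swap the byte order of each bfloat16 element (illustrative sketch only)."""
    # Reinterpret as int16, byteswap through numpy, then view the result back as bfloat16.
    swapped = t.view(torch.int16).numpy().byteswap()
    return torch.from_numpy(swapped.copy()).view(torch.bfloat16)

# Example: fixed = byteswap_bf16(read_tensors["weight1"])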
Fixed in https://github.com/huggingface/safetensors/pull/507