ComfyUI
Update utils.py: fix very slow loading speed of .safetensors files
I'm not sure if this could have any downsides, but in my case .safetensors files now load twice as fast. This will be especially useful for those who store hundreds of gigabytes of models on hard drives.
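For reference, the change being discussed swaps one call inside `load_torch_file` in `comfy/utils.py`; a minimal sketch of the before and after, reconstructed from the lines quoted in the benchmark comment below:

```python
# Before: safetensors opens and parses the file from disk itself.
sd = safetensors.torch.load_file(ckpt, device=device.type)

# After: the whole file is read into a bytes object first,
# then deserialized from memory.
sd = safetensors.torch.load(open(ckpt, 'rb').read())
```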
Isn't this going to take 2x the memory to load them?
@comfyanonymous I don't know, I didn't notice an increase in memory usage 🤔
This does not seem like a great idea, at least not in general for all systems. Testing only the `load_torch_file` function with

```python
import sys
import resource
import time

from comfy.utils import load_torch_file  # assumes this runs from the ComfyUI repo

memory_before = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss  # maximum resident set size used (in kilobytes)
time_before = time.perf_counter()
load_torch_file(sys.argv[1],  # sd_xl_base_1.0.safetensors
                safe_load=False, device=None)
time_after = time.perf_counter()
memory_after = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
print('memory after-before {:.0f}-{:.0f}MB = {:.0f}MB'.format(memory_before/1e3,
                                                              memory_after/1e3,
                                                              (memory_after-memory_before)/1e3))
print('time {:.3g}'.format(time_after-time_before))
```
With the original code `sd = safetensors.torch.load_file(ckpt, device=device.type)` I get:

```
memory after-before 363-444MB = 81MB
time 0.773
```

and far worse results with the modified code `sd = safetensors.torch.load(open(ckpt, 'rb').read())`:

```
memory after-before 363-13923MB = 13560MB
time 4.44
```
This is on an AMD Ryzen 7 3700X with an RTX 3060, Ubuntu 22.04. For what it's worth, sd_xl_base_1.0.safetensors is about 6.9 GB, so the ~13.6 GB increase is consistent with holding both the raw file bytes and the deserialized tensors in memory at once, which is roughly the 2x asked about above.
This seems like it would be more appropriate to discuss upstream with the safetensors library devs. What this change does is read the file fully into memory and then deserialize directly from that buffer; I'm assuming `load_file` is more memory-efficient and parses the file as it is being read from disk. That should usually be about as fast as, or even faster than, reading everything into memory first, but depending on what `load_file` does internally, it could have a performance issue on some platforms.
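For anyone who wants to probe this on their own machine, a minimal sketch that times the two `safetensors.torch` entry points side by side; both functions are the library's public API, and the path comes from the command line:

```python
import sys
import time

import safetensors.torch

path = sys.argv[1]  # path to any local .safetensors file

# load_file lets safetensors open the file itself (typically memory-mapped
# on the Rust side), so tensor data can be paged in by the OS rather than
# copied into a Python bytes object up front.
t0 = time.perf_counter()
sd = safetensors.torch.load_file(path, device="cpu")
print("load_file:   {:.3g}s".format(time.perf_counter() - t0))

# load receives one big bytes object, which forces the entire file into
# process memory before deserialization even begins.
t0 = time.perf_counter()
with open(path, "rb") as f:
    sd = safetensors.torch.load(f.read())
print("load(bytes): {:.3g}s".format(time.perf_counter() - t0))
```

Note that whichever call runs second benefits from the OS page cache, so for a fair comparison run each variant in a fresh process, and drop caches between runs if you want cold-cache numbers. That caching effect may also explain why the speedup reported in this PR differs so much between systems and drive types.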