Out of memory for magnitude sparsity algo with stable diffusion model
My NNCF version is: 2.5.0.dev0+444ca7da
I am using the optimum-intel Stable Diffusion example (https://github.com/huggingface/optimum-intel/tree/main/examples/openvino/stable-diffusion), but instead of quantization I switched the compression config to magnitude sparsity, as shown below:
"compression": [
{
"algorithm": "magnitude_sparsity",
"sparsity_init": 0.1,
"params": {
"schedule": "multistep",
"multistep_steps": [
5,
10,
20,
30,
40
],
"multistep_sparsity_levels": [
0.1,
0.2,
0.3,
0.4,
0.5,
0.6
],
"sparsity_freeze_epoch": 50
},
},
],
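For reference, this is how I read the multistep schedule (a sketch of the epoch-to-level mapping as I understand the config, not NNCF's actual scheduler code): there are six levels for five steps because the first level applies before the first step epoch.

import bisect

# Sketch of the epoch -> sparsity-level mapping implied by the config
# above (my interpretation, not NNCF's scheduler implementation).
steps = [5, 10, 20, 30, 40]
levels = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]  # one more level than steps

def sparsity_at(epoch: int) -> float:
    # levels[0] is active before the first step epoch; each step
    # passed advances the schedule to the next level.
    return levels[bisect.bisect_right(steps, epoch)]

assert sparsity_at(0) == 0.1
assert sparsity_at(5) == 0.2
assert sparsity_at(40) == 0.6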
However, I am getting the error below. It looks like the sort over all weights in nncf/torch/sparsity/magnitude/algo.py is what causes the out-of-memory issue. Is there any way to change this so it can run on multiple GPUs? Thank you!
Traceback (most recent call last):
File "../stable_diffusion/optimum-intel/examples/openvino/stable-diffusion/train_text_to_image_qat.py", line 1185, in <module>
main()
File "../stable_diffusion/optimum-intel/examples/openvino/stable-diffusion/train_text_to_image_qat.py", line 1003, in main
compression_controller, unet = create_compressed_model(unet, nncf_config)
File "/python3.9/site-packages/nncf/telemetry/decorator.py", line 71, in wrapped
retval = fn(*args, **kwargs)
File "/python3.9/site-packages/nncf/torch/model_creation.py", line 129, in create_compressed_model
compression_ctrl = builder.build_controller(compressed_model)
File "/python3.9/site-packages/nncf/torch/compression_method_api.py", line 165, in build_controller
ctrl = self._build_controller(model)
File "/python3.9/site-packages/nncf/torch/sparsity/magnitude/algo.py", line 50, in _build_controller
return MagnitudeSparsityController(model, self._sparsified_module_info, self.config)
File "/python3.9/site-packages/nncf/torch/sparsity/magnitude/algo.py", line 83, in __init__
self.set_sparsity_level(sparsity_init)
File "/python3.9/site-packages/nncf/torch/sparsity/magnitude/algo.py", line 143, in set_sparsity_level
threshold = self._select_threshold(sparsity_level, target_sparsified_module_info_list)
File "/python3.9/site-packages/nncf/torch/sparsity/magnitude/algo.py", line 153, in _select_threshold
all_weights_tensor, _ = torch.cat(all_weights).sort()
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 9.67 GiB (GPU 0; 39.41 GiB total capacity; 36.06 GiB already allocated; 775.56 MiB free; 36.29 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
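The allocation at torch.cat(all_weights).sort() is roughly the full set of sparsifiable weights duplicated: cat() materializes one flat tensor holding every weight, and sort() allocates a similar amount again for its output. Below is a minimal sketch of the failing pattern and of a lower-memory variant that does the one-time global threshold selection on the CPU (an illustration only, not NNCF's actual code; the tensor sizes are made up). Splitting the sort across multiple GPUs would likely need changes inside NNCF itself, so offloading to host memory seems like the simpler route.

import torch

# Stand-ins for the flattened weights of all sparsifiable layers
# (sizes are arbitrary; the real UNet weights are far larger).
all_weights = [torch.randn(4096 * 4096) for _ in range(4)]

# The failing pattern: one giant temporary for cat(), then sort()
# allocates roughly the same amount again for the sorted copy.
# threshold = torch.cat(all_weights).sort()[0][k]

# Lower-memory alternative: concatenate in host memory and pick the
# k-th smallest magnitude there instead of fully sorting on the GPU.
sparsity_level = 0.1
flat = torch.cat([w.abs().cpu() for w in all_weights])
k = max(int(sparsity_level * flat.numel()), 1)
threshold = torch.kthvalue(flat, k).values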
@vshampor, @ljaljushkin can you please take a look?
"compression": [{
"algorithm": "magnitude_sparsity",
...
"params": {
"sparsity_level_setting_mode": "local",
...
}}]
As a quick workaround, you could try to set sparsity rate "locally" for each layer. IMO, it wasn't properly implemented, I've made a fix for this mode: https://github.com/openvinotoolkit/nncf/pull/1933
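A minimal sketch of wiring this into the script's compression setup (the model and the sample input shape are placeholders, not the exact values from the QAT example):

import torch
from nncf import NNCFConfig
from nncf.torch import create_compressed_model

# Placeholder model; the QAT example passes its UNet here instead.
unet = torch.nn.Conv2d(4, 4, 3, padding=1)

nncf_config = NNCFConfig.from_dict({
    "input_info": {"sample_size": [1, 4, 64, 64]},  # placeholder shape
    "compression": [{
        "algorithm": "magnitude_sparsity",
        "sparsity_init": 0.1,
        "params": {"sparsity_level_setting_mode": "local"},
    }],
})

# Same call that raised the OOM before; with "local" mode the controller
# picks a threshold per layer instead of sorting all weights at once.
compression_ctrl, unet = create_compressed_model(unet, nncf_config)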
Ref. 138685
@xiao1228, do you have any follow-up? Has the suggestion from @ljaljushkin helped?