
Out of memory for magnitude sparsity algo with stable diffusion model

xiao1228 opened this issue on Jun 22, 2023

My NNCF version is 2.5.0.dev0+444ca7da. I am using optimum for stable diffusion, based on https://github.com/huggingface/optimum-intel/tree/main/examples/openvino/stable-diffusion. Instead of quantization, I changed the config to magnitude sparsity, as shown below.

        "compression": [
            {

                "algorithm": "magnitude_sparsity",
                "sparsity_init": 0.1,
                "params": {
                    "schedule": "multistep",
                    "multistep_steps": [
                        5,
                        10,
                        20,
                        30,
                        40
                    ],
                    "multistep_sparsity_levels": [
                        0.1,
                        0.2,
                        0.3,
                        0.4,
                        0.5,
                        0.6
                    ],
                    "sparsity_freeze_epoch": 50
                },
                
            },
        ],

However, I am getting the error below. It looks like the sort over all concatenated weights in nncf/torch/sparsity/magnitude/algo.py is causing the out-of-memory issue. Is there any way I can change this to make it run on multiple GPUs? Thank you!

Traceback (most recent call last):
  File "../stable_diffusion/optimum-intel/examples/openvino/stable-diffusion/train_text_to_image_qat.py", line 1185, in <module>
    main()
  File "../stable_diffusion/optimum-intel/examples/openvino/stable-diffusion/train_text_to_image_qat.py", line 1003, in main
    compression_controller, unet = create_compressed_model(unet, nncf_config)
  File "/python3.9/site-packages/nncf/telemetry/decorator.py", line 71, in wrapped
    retval = fn(*args, **kwargs)
  File "/python3.9/site-packages/nncf/torch/model_creation.py", line 129, in create_compressed_model
    compression_ctrl = builder.build_controller(compressed_model)
  File "/python3.9/site-packages/nncf/torch/compression_method_api.py", line 165, in build_controller
    ctrl = self._build_controller(model)
  File "/python3.9/site-packages/nncf/torch/sparsity/magnitude/algo.py", line 50, in _build_controller
    return MagnitudeSparsityController(model, self._sparsified_module_info, self.config)
  File "/python3.9/site-packages/nncf/torch/sparsity/magnitude/algo.py", line 83, in __init__
    self.set_sparsity_level(sparsity_init)
  File "/python3.9/site-packages/nncf/torch/sparsity/magnitude/algo.py", line 143, in set_sparsity_level
    threshold = self._select_threshold(sparsity_level, target_sparsified_module_info_list)
  File "/python3.9/site-packages/nncf/torch/sparsity/magnitude/algo.py", line 153, in _select_threshold
    all_weights_tensor, _ = torch.cat(all_weights).sort()
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 9.67 GiB (GPU 0; 39.41 GiB total capacity; 36.06 GiB already allocated; 775.56 MiB free; 36.29 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
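
For reference, the failing line in _select_threshold concatenates every sparsified weight tensor on the GPU and sorts the result to pick a global magnitude threshold. Below is a minimal sketch (a hypothetical standalone helper, not the NNCF API) of a less memory-hungry equivalent that gathers magnitudes on the CPU and uses torch.kthvalue instead of a full GPU sort:

    import torch

    def select_threshold_cpu(all_weights, sparsity_level):
        # Hypothetical replacement for the torch.cat(all_weights).sort()
        # pattern: magnitudes are moved to the CPU one tensor at a time,
        # so the GPU never holds the full concatenated copy.
        flat = torch.cat([w.detach().abs().flatten().cpu() for w in all_weights])
        k = int(sparsity_level * flat.numel())
        if k == 0:
            return 0.0
        # kthvalue is 1-indexed and cheaper than a full sort; the k-th
        # smallest magnitude is the threshold below which weights are zeroed.
        return flat.kthvalue(k).values.item()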

xiao1228 commented Jun 22 '23 12:06

@vshampor, @ljaljushkin can you please take a look?

AlexKoff88 commented Jun 22 '23 12:06

"compression": [{
"algorithm": "magnitude_sparsity",
...
"params": {
    "sparsity_level_setting_mode": "local",
    ...
}}]

As a quick workaround, you could try setting the sparsity level "locally" for each layer, as in the snippet above: in that mode the threshold is selected per layer, so all weights are never concatenated and sorted globally. IMO this mode wasn't properly implemented before; I've made a fix for it: https://github.com/openvinotoolkit/nncf/pull/1933
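
Applied to the config above, that would look roughly like this (a sketch: the only change is the added "sparsity_level_setting_mode" key; I haven't verified how it interacts with the multistep schedule):

    "compression": [
        {
            "algorithm": "magnitude_sparsity",
            "sparsity_init": 0.1,
            "params": {
                "sparsity_level_setting_mode": "local",
                "schedule": "multistep",
                "multistep_steps": [5, 10, 20, 30, 40],
                "multistep_sparsity_levels": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
                "sparsity_freeze_epoch": 50
            }
        }
    ]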

ljaljushkin commented Jun 22 '23 15:06

Ref. 138685

avitial commented Apr 16 '24 17:04

@xiao1228, do you have any follow-up? Has the suggestion from @ljaljushkin helped?

MaximProshin commented Apr 17 '24 06:04