[GPU] Weightless caching

Open tkrupa-intel opened this issue 1 year ago • 14 comments

tkrupa-intel avatar Jul 25 '24 14:07 tkrupa-intel

build_jenkins

p-durandin avatar Sep 04 '24 14:09 p-durandin

build_jenkins

p-durandin avatar Sep 12 '24 11:09 p-durandin

build_jenkins

p-durandin avatar Sep 17 '24 05:09 p-durandin

Did you check accuracy? When I tested this PR with the resnet-18 static model, the outputs differed between the non-caching and weightless-caching runs. Additionally, I temporarily commented out the two lines below so that the weightless cache blob could be loaded: https://github.com/openvinotoolkit/openvino/blob/7cf05641b6b1b249904c96e02ac07ee384219bb4/src/plugins/intel_gpu/src/plugin/plugin.cpp#L311-L312

e-ddykim avatar Sep 20 '24 12:09 e-ddykim

> Did you check accuracy? When I tested this PR with the resnet-18 static model, the outputs differed between the non-caching and weightless-caching runs. Additionally, I temporarily commented out the two lines below so that the weightless cache blob could be loaded:
>
> https://github.com/openvinotoolkit/openvino/blob/7cf05641b6b1b249904c96e02ac07ee384219bb4/src/plugins/intel_gpu/src/plugin/plugin.cpp#L311-L312

Hi, thanks for letting me know about issues with this topology! I checked accuracy only for Stable Diffusion v1.5 and Llama-3-8b. I'm aware that there may be mismatches in other topologies (see discussion here: https://github.com/openvinotoolkit/openvino/pull/25731#discussion_r1756722874).

I'm aware that this check prevents correct import; I'll push the fix soon.

tkrupa-intel avatar Sep 20 '24 12:09 tkrupa-intel

> Did you check accuracy? When I tested this PR with the resnet-18 static model, the outputs differed between the non-caching and weightless-caching runs. Additionally, I temporarily commented out the two lines below so that the weightless cache blob could be loaded:
>
> https://github.com/openvinotoolkit/openvino/blob/7cf05641b6b1b249904c96e02ac07ee384219bb4/src/plugins/intel_gpu/src/plugin/plugin.cpp#L311-L312

Hi @e-ddykim, I get no mismatches with the current commit and a sample image. I tried to reproduce your issue with an old commit but got an exception instead of the mismatches you saw, so our setups evidently differ. Could you please recheck to make sure it's also fixed on your setup?

tkrupa-intel avatar Oct 07 '24 15:10 tkrupa-intel

build_jenkins

p-durandin avatar Oct 08 '24 05:10 p-durandin

> > Did you check accuracy? When I tested this PR with the resnet-18 static model, the outputs differed between the non-caching and weightless-caching runs. Additionally, I temporarily commented out the two lines below so that the weightless cache blob could be loaded: https://github.com/openvinotoolkit/openvino/blob/7cf05641b6b1b249904c96e02ac07ee384219bb4/src/plugins/intel_gpu/src/plugin/plugin.cpp#L311-L312
>
> Hi @e-ddykim, I get no mismatches with the current commit and a sample image. I tried to reproduce your issue with an old commit but got an exception instead of the mismatches you saw, so our setups evidently differ. Could you please recheck to make sure it's also fixed on your setup?

Now, I can get correct results from resnet-18 with this PR.

e-ddykim avatar Oct 10 '24 09:10 e-ddykim

build_jenkins

p-durandin avatar Oct 10 '24 10:10 p-durandin

build_jenkins

p-durandin avatar Oct 15 '24 06:10 p-durandin

build_jenkins

p-durandin avatar Oct 15 '24 14:10 p-durandin

build_jenkins

p-durandin avatar Oct 16 '24 08:10 p-durandin

build_jenkins

p-durandin avatar Oct 16 '24 11:10 p-durandin

build_jenkins

p-durandin avatar Oct 18 '24 14:10 p-durandin

build_jenkins

p-durandin avatar Oct 21 '24 05:10 p-durandin
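
For anyone repeating the accuracy check discussed in this thread, here is a minimal sketch of how outputs from a non-caching run and a weightless-caching run could be compared. It is not taken from the PR: the helper names `max_abs_diff` and `outputs_match` and the tolerance `atol=1e-4` are assumptions chosen for illustration, and the outputs are assumed to be already flattened into plain lists of floats.

```python
from typing import Sequence


def max_abs_diff(reference: Sequence[float], candidate: Sequence[float]) -> float:
    """Largest element-wise absolute difference between two flattened outputs."""
    if len(reference) != len(candidate):
        raise ValueError("outputs have different lengths")
    return max((abs(r - c) for r, c in zip(reference, candidate)), default=0.0)


def outputs_match(
    reference: Sequence[float],
    candidate: Sequence[float],
    atol: float = 1e-4,  # hypothetical tolerance; pick one suited to the model
) -> bool:
    """True when every element of candidate is within atol of reference."""
    return max_abs_diff(reference, candidate) <= atol


if __name__ == "__main__":
    # Placeholder values standing in for real inference outputs.
    non_caching_output = [0.1, 0.7, 0.2]
    weightless_caching_output = [0.1, 0.7, 0.2]
    print(outputs_match(non_caching_output, weightless_caching_output))
```

Reporting the maximum absolute difference alongside the pass/fail result helps distinguish small numerical noise from the kind of real mismatch reported for resnet-18.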