[WIP] Fix for llm examples with openvino-nightly

Open ljaljushkin opened this issue 1 year ago • 0 comments

Changes

Explicitly enable dynamic quantization of activations and update the reference metrics in the weight compression examples.

The compression examples can be adapted to the new openvino-nightly by explicitly turning off the dynamic quantization of activations via:

OVModelForCausalLM.from_pretrained(OUTPUT_DIR, ov_config={"DYNAMIC_QUANTIZATION_GROUP_SIZE": "0"})

The successful run 45 of the test examples confirms this. However, dynamic quantization should make validation faster, which is why this PR enables it and updates the metrics instead.
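For reference, here is a minimal sketch of how the opt-out above could be wrapped so the same example code can toggle dynamic quantization. The helper names `dynamic_quantization_config` and `load_compressed_model` are hypothetical and not part of NNCF or optimum-intel. The sketch assumes that a group size of `0` disables the feature, as in the snippet above, and that a positive value sets the group size.

```python
from typing import Dict


def dynamic_quantization_config(group_size: int) -> Dict[str, str]:
    """Hypothetical helper: build the ov_config entry for dynamic quantization.

    A group size of 0 turns dynamic quantization of activations off,
    as in the snippet above; treating a positive value as the group
    size to enable is an assumption of this sketch.
    """
    if group_size < 0:
        raise ValueError("group_size must be non-negative")
    # OpenVINO config values are passed as strings in ov_config.
    return {"DYNAMIC_QUANTIZATION_GROUP_SIZE": str(group_size)}


def load_compressed_model(model_dir: str, group_size: int = 0):
    """Hypothetical helper: load a compressed model with the given setting."""
    # Imported lazily so the config helper works without optimum-intel installed.
    from optimum.intel.openvino import OVModelForCausalLM

    return OVModelForCausalLM.from_pretrained(
        model_dir, ov_config=dynamic_quantization_config(group_size)
    )
```

Calling `load_compressed_model(OUTPUT_DIR)` would then reproduce the explicit opt-out, while a non-zero `group_size` would leave dynamic quantization on; the value to use in that case is not specified here.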

Reason for changes

OpenVINO 2023.2 enables dynamic quantization of activations by default (https://github.com/openvinotoolkit/openvino/pull/25054), which affects accuracy and metrics in the weight compression examples and tests, since it was previously disabled by default.

Related tickets

CVS-134931 CVS-145701

Tests

tests/cross_fw/examples/test_examples.py -k llm

Runs of test examples:

[x] 46 - with openvino-nightly
[ ] 48 - with current openvino

Jul 03 '24 15:07 ljaljushkin