Activation Sparsity OV backend
Changes
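Extends #2683 with an implementation of activation sparsity for the OpenVINO (OV) backend. The table below compares Wikitext word perplexity for the OV and PyTorch (PT) backends.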
Model | Backend | Compression | Sparsity Parameters | Wikitext word perplexity |
---|---|---|---|---|
meta-llama/Llama-2-7b-hf | OV | - | - | 8.71 |
meta-llama/Llama-2-7b-hf | OV | - | 25% (up/gate 32% + down 52%) | 9.06 |
meta-llama/Llama-2-7b-hf | PT | - | 25% (up/gate 32% + down 52%) | 9.06 |
meta-llama/Llama-2-7b-hf | OV | INT8_asym | - | 8.71 |
meta-llama/Llama-2-7b-hf | OV | INT8_asym | 25% (up/gate 32% + down 52%) | 9.07 |
meta-llama/Llama-2-7b-hf | PT | INT8_asym | 25% (up/gate 32% + down 52%) | 9.07 |
meta-llama/Llama-2-7b-hf | OV | INT4_default (sym=True, ratio=0.6) | - | 9.08 |
meta-llama/Llama-2-7b-hf | OV | INT4_default (sym=True, ratio=0.6) | 25% (up/gate 32% + down 52%) | 9.39 |
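
To make the cost of sparsity easier to read off the table above, here is a small sketch that computes the perplexity increase for each compression mode on the OV backend. The numbers are copied from the table. The dictionary layout and the helper name `relative_increase` are illustrative only and are not part of NNCF.

```python
# Perplexity values copied from the table above (Llama-2-7b-hf, OV backend).
# Keys are compression modes; each value is (dense, 25% sparse) perplexity.
# The structure and names here are illustrative, not an NNCF API.
results = {
    "none": (8.71, 9.06),
    "INT8_asym": (8.71, 9.07),
    "INT4_default": (9.08, 9.39),
}


def relative_increase(dense: float, sparse: float) -> float:
    """Return the perplexity increase of `sparse` over `dense`, in percent."""
    return (sparse - dense) / dense * 100.0


for mode, (dense, sparse) in results.items():
    delta = sparse - dense
    print(f"{mode}: +{delta:.2f} ({relative_increase(dense, sparse):.1f}%)")
```

On these numbers, sparsity adds roughly 3.4% to 4.1% to word perplexity in every compression mode, and the OV results match the PT results in the rows where both are reported.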
Reason for changes
Related tickets
147840