Alexander Suslov
### Changes

- Added support for compression of f8e4m3 and f8e5m2 weights into 4-bit types.
- Added skipping compression of f8e4m3 and f8e5m2 weights into 8-bit types.
- Added support...
### Context

A “stateful model” is a model that implicitly preserves data between two consecutive inference calls, such as the KV cache for LLMs ([more details](https://docs.openvino.ai/nightly/openvino-workflow/running-inference/inference-request/stateful-models.html#)). Using a stateful model in...
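
To make the definition concrete, here is a minimal sketch of what "preserving data between consecutive inference calls" means. `ToyStatefulModel`, `infer`, and `reset_state` are hypothetical names used for illustration only; they are not the OpenVINO or NNCF API, and the list stands in for a real KV cache.

```python
class ToyStatefulModel:
    """Hypothetical stand-in for a stateful model (not a real OpenVINO class)."""

    def __init__(self):
        # Internal state that survives between calls; plays the role of a KV cache.
        self.state = []

    def infer(self, token):
        # Each call appends to the state left behind by earlier calls,
        # so the result depends on the call history, not only on the input.
        self.state.append(token)
        return len(self.state)

    def reset_state(self):
        # Clearing the state starts a new, independent sequence.
        self.state.clear()


model = ToyStatefulModel()
first = model.infer("a")        # state holds one token
second = model.infer("b")       # state holds two tokens
model.reset_state()
after_reset = model.infer("c")  # state holds one token again
```

The point of the sketch is that two calls with the same input can return different results unless the state is reset in between, which is why any code that runs such a model repeatedly has to manage the state explicitly.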
### Changes ### Reason for changes ### Related tickets ### Tests