optimum-intel
🤗 Optimum Intel: Accelerate inference with Intel optimization tools
@helena-intel This works:

```python
from transformers import AutoProcessor, AutoModelForSpeechSeq2Seq, GenerationConfig, pipeline
from pathlib import Path
from optimum.intel.openvino import OVModelForSpeechSeq2Seq
import openvino as ov
import json

model_id = "openai/whisper-small"
model =...
```
# What does this PR do?

Fixes # (issue)

## Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks...
### System Info

```shell
optimum==1.17.1
openvino==2024.0.0
PyTorch==2.2.1+cu121
python-3.10.12
```

### Who can help?

@echarlaix @michaelbenayoun

### Information

- [ ] The official example scripts
- [X] My own modified scripts...
# Support for execution with multi concurrency

- Adds a clone operation to model objects that creates a new execution context without duplicating the compiled_model or its memory usage. This enables multi-concurrency in multithreaded...
This is a draft of the PR; it should be merged after the OpenVINO 2023.1 release.
### This is just a draft PR for now, to start a discussion. It modifies the `forward` calls to create a new inference request on every call, instead of reusing the single one created along...
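The pattern described above — one heavy, shared compiled model plus a fresh, lightweight request per `forward` call — can be sketched in plain Python. The `CompiledModel` and `InferRequest` classes below are illustrative stand-ins, not the actual OpenVINO runtime classes:

```python
from concurrent.futures import ThreadPoolExecutor

class CompiledModel:
    """Stand-in for a compiled model: expensive to build, read-only, shared."""
    def __init__(self, weights):
        self.weights = weights  # loaded once, never mutated after compilation

    def create_request(self):
        # Each request owns its mutable buffers, so concurrent
        # forward() calls never race on shared state.
        return InferRequest(self)

class InferRequest:
    """Stand-in for a per-call request holding private I/O buffers."""
    def __init__(self, compiled):
        self.compiled = compiled
        self.output = None  # private to this request

    def infer(self, x):
        self.output = x * self.compiled.weights
        return self.output

model = CompiledModel(weights=3)

def forward(x):
    # Creating a fresh request per call (instead of caching one on the
    # model) is what makes concurrent forward() calls thread-safe.
    return model.create_request().infer(x)

with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(forward, range(100)))

assert results == [i * 3 for i in range(100)]
```

The trade-off is the per-call cost of allocating a request versus the thread-safety of not sharing one mutable request across callers.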
I'm having trouble exporting the `Helsinki-NLP/opus-mt-es-en` model for language translation into the optimised OpenVINO IR format. Reading through other issues in this repository led me to https://github.com/huggingface/optimum-intel/issues/188, which seems...
I have applied dynamic quantization to a `flan-t5-large` model. However, when I try to evaluate the generated summaries I get this error: `RuntimeError: empty_strided not supported on quantized tensors...`
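For context, PyTorch dynamic quantization itself works on the `nn.Linear` layers of a model; the error above typically surfaces later, when an operation that has no quantized implementation is hit. A minimal sketch of the quantization step, using a tiny stand-in model rather than `flan-t5-large`:

```python
import torch
import torch.nn as nn

# A tiny stand-in model; the report above used flan-t5-large.
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))
model.eval()

# Dynamic quantization swaps nn.Linear for a quantized equivalent:
# weights are stored as int8, activations are quantized on the fly.
qmodel = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(2, 8)
with torch.no_grad():
    out = qmodel(x)
assert out.shape == (2, 4)
```

Plain forward passes through the quantized linear layers succeed; it is the downstream ops on quantized tensors (e.g. during generation or evaluation) that can raise `empty_strided not supported`.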