optimum-intel
🤗 Optimum Intel: Accelerate inference with Intel optimization tools
@helena-intel This works:

```python
from transformers import AutoProcessor, AutoModelForSpeechSeq2Seq, GenerationConfig, pipeline
from pathlib import Path
from optimum.intel.openvino import OVModelForSpeechSeq2Seq
import openvino as ov
import json

model_id = "openai/whisper-small"
model =...
```
# What does this PR do?

Fixes # (issue)

## Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks...
### System Info

```shell
optimum==1.17.1
openvino==2024.0.0
PyTorch==2.2.1+cu121
python-3.10.12
```

### Who can help?

@echarlaix @michaelbenayoun

### Information

- [ ] The official example scripts
- [X] My own modified scripts...
# Support for execution with multi concurrency

- Adds a clone operation to model objects that creates a new execution context without duplicating the compiled_model or its memory usage. This enables multi-concurrency in multithreaded...
This is a draft of the PR; it should be merged after the OpenVINO 2023.1 release.
### This is just a draft PR for now, to start a discussion. It modifies the `forward` calls to create a new inference request on every call, instead of reusing the single one created along...
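The pattern described above — one heavy, shared compiled model plus a fresh, lightweight request per `forward` call — can be sketched in plain Python. The `CompiledModel` and `InferRequest` classes below are illustrative stand-ins, not the actual OpenVINO runtime classes:

```python
from concurrent.futures import ThreadPoolExecutor

class CompiledModel:
    """Stand-in for a compiled model: expensive to build, read-only, shared."""
    def __init__(self, weights):
        self.weights = weights  # loaded once, never mutated after compilation

    def create_request(self):
        # Each request owns its mutable buffers, so concurrent
        # forward() calls never race on shared state.
        return InferRequest(self)

class InferRequest:
    """Stand-in for a per-call request holding private I/O buffers."""
    def __init__(self, compiled):
        self.compiled = compiled
        self.output = None  # private to this request

    def infer(self, x):
        self.output = x * self.compiled.weights
        return self.output

model = CompiledModel(weights=3)

def forward(x):
    # Creating a fresh request per call (instead of caching one on the
    # model) is what makes concurrent forward() calls thread-safe.
    return model.create_request().infer(x)

with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(forward, range(100)))

assert results == [i * 3 for i in range(100)]
```

The trade-off is the per-call cost of allocating a request versus the thread-safety of not sharing one mutable request across callers.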
I'm having trouble exporting the `Helsinki-NLP/opus-mt-es-en` model for language translation into the optimised OpenVINO IR format. Reading through other issues in this repository led me to https://github.com/huggingface/optimum-intel/issues/188, which seems...
I have applied dynamic quantization to a `flan-t5-large` model. However, when I try to evaluate the generated summaries I get this error: `RuntimeError: empty_strided not supported on quantized tensors...`
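For context, PyTorch dynamic quantization itself works on the `nn.Linear` layers of a model; the error above typically surfaces later, when an operation that has no quantized implementation is hit. A minimal sketch of the quantization step, using a tiny stand-in model rather than `flan-t5-large`:

```python
import torch
import torch.nn as nn

# A tiny stand-in model; the report above used flan-t5-large.
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))
model.eval()

# Dynamic quantization swaps nn.Linear for a quantized equivalent:
# weights are stored as int8, activations are quantized on the fly.
qmodel = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(2, 8)
with torch.no_grad():
    out = qmodel(x)
assert out.shape == (2, 4)
```

Plain forward passes through the quantized linear layers succeed; it is the downstream ops on quantized tensors (e.g. during generation or evaluation) that can raise `empty_strided not supported`.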