optimum-intel
🤗 Optimum Intel: Accelerate inference with Intel optimization tools
Add JPQD evaluation notebook. Since JPQD QA takes about 12 hours to train, it doesn't make sense to do it in a notebook (if the browser crashes or the computer...
Trying to run neural_compressor/language_modeling as follows, exactly as in the README. I have a 24 GB GPU, but it hits a GPU OOM. This model is only 125M parameters, is that normal? How much...
```
# from transformers import AutoTokenizer, BloomModel
from optimum.intel.openvino import OVModelForCausalLM
from transformers import AutoTokenizer, BloomModel
import torch
from tqdm import tqdm
from time import time
from time import sleep

model_str...
```
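The snippet above is cut off; for reference, here is a minimal sketch of what such a script typically looks like with OVModelForCausalLM (the checkpoint name below is a placeholder, not the one from the original report):

```
from optimum.intel.openvino import OVModelForCausalLM
from transformers import AutoTokenizer

# Placeholder checkpoint; substitute the ~125M model from the original report.
model_str = "bigscience/bloom-560m"

# export=True converts the PyTorch checkpoint to OpenVINO IR on the fly.
model = OVModelForCausalLM.from_pretrained(model_str, export=True)
tokenizer = AutoTokenizer.from_pretrained(model_str)

inputs = tokenizer("Hello, my name is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```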
# What does this PR do?
Fix the import of GenerationMode for transformers.
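One plausible shape for such a fix is a version-tolerant import; this is only a sketch, since GenerationMode has lived in different transformers modules across releases:

```
# Sketch only: guard the GenerationMode import so it works across
# transformers versions that place it in different modules.
try:
    from transformers.generation import GenerationMode
except ImportError:
    from transformers.generation.configuration_utils import GenerationMode
```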
# What does this PR do?
Update the codegen config to support codegen2, and add support for the qwen2moe and dbrx models.
## Before submitting
- [ ] This PR fixes a typo or...
Fix compatibility with transformers v4.41.0. cc @helena-intel @eaidova
Pin the torch version to 2.0.1. See https://github.com/pytorch/pytorch/issues/125109
# What does this PR do?
Running this notebook with the current PyTorch version fails. Test platform: Windows 11 - Intel Core Ultra...
# What does this PR do?
Contains configuration updates based on the experiments from 135227 and the following PRs:
- https://github.com/openvinotoolkit/openvino.genai/pull/377
- https://github.com/openvinotoolkit/openvino.genai/pull/419
## Before submitting
- [ ] This PR...
# What does this PR do?
Set the padding side to left during tokenizer conversion for text generation tasks.
Fixes # (issue)
## Before submitting
- [ ] This PR fixes...
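For context, a small sketch of what left padding means for decoder-only generation, using plain transformers (the PR itself applies this during tokenizer conversion; the checkpoint here is just an example):

```
from transformers import AutoTokenizer

# Decoder-only models need left padding so generated tokens continue
# directly after the last real token of every prompt in the batch.
tokenizer = AutoTokenizer.from_pretrained("gpt2", padding_side="left")
tokenizer.pad_token = tokenizer.eos_token  # gpt2 has no pad token by default

batch = tokenizer(["Hello", "A much longer prompt"], padding=True, return_tensors="pt")
print(batch["input_ids"])  # shorter prompts are padded on the left
```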
# What does this PR do?
The OpenVINO GPU plugin does not support int64 natively, so i64 inputs are always converted to i32. To avoid the runtime conversion, the IO tensor precision was updated...
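As an illustration of what that precision change does, here is a rough sketch using the OpenVINO pre/post-processing API; the IR path and input names are placeholders, and the PR applies the change at export time rather than at load time:

```
import openvino as ov
from openvino.preprocess import PrePostProcessor

core = ov.Core()
model = core.read_model("model.xml")  # placeholder IR path

# Declare the integer inputs as i32 so the GPU plugin does not have to
# convert i64 tensors at inference time.
ppp = PrePostProcessor(model)
for name in ("input_ids", "attention_mask"):  # placeholder input names
    ppp.input(name).tensor().set_element_type(ov.Type.i32)
model = ppp.build()

compiled_model = core.compile_model(model, "GPU")
```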