Ella Charlaix
Hi @akk-123, the difference is that `decoder_with_past_model.onnx` takes the pre-computed key/value hidden states as one of its inputs, while `decoder_model.onnx` does not. See [here](https://huggingface.co/docs/optimum/onnxruntime/modeling_ort#export-and-inference-of-sequencetosequence-models) for more information.
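To make the distinction concrete, here is a toy sketch in plain Python. It is not Optimum's code or the actual ONNX graphs: the `project` helper, the function names, and the fixed scaling factors are all hypothetical stand-ins for a learned single-head attention layer. It only illustrates that a step receiving cached keys/values as input produces the same output as a step that recomputes them for every position.

```python
import math

def project(x, scale):
    # Hypothetical stand-in for a learned key/value projection (a fixed scaling).
    return [scale * v for v in x]

def attend(query, keys, values):
    # Scaled dot-product attention for one query over all available positions.
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    return [sum(w * v[i] for w, v in zip(weights, values)) / total for i in range(d)]

def step_without_past(hidden_states):
    # Analogous to decoder_model.onnx: no cache input, so keys/values
    # are recomputed for every position on each call.
    keys = [project(h, 0.5) for h in hidden_states]
    values = [project(h, 2.0) for h in hidden_states]
    return attend(hidden_states[-1], keys, values), (keys, values)

def step_with_past(new_hidden_state, past):
    # Analogous to decoder_with_past_model.onnx: the cache is an input,
    # and only the newest position is projected.
    past_keys, past_values = past
    keys = past_keys + [project(new_hidden_state, 0.5)]
    values = past_values + [project(new_hidden_state, 2.0)]
    return attend(new_hidden_state, keys, values), (keys, values)

states = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
_, past = step_without_past(states[:1])  # first step: nothing is cached yet
for t in range(1, len(states)):
    cached_out, past = step_with_past(states[t], past)
    full_out, _ = step_without_past(states[: t + 1])
    # Both paths attend over the same keys/values, so the outputs match.
    assert all(math.isclose(a, b) for a, b in zip(cached_out, full_out))
```

The first generation step has no cache to consume, which is why a cache-free variant is still needed alongside the cached one.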
Hi @jinfagang, yes, Optimum allows you to apply both dynamic and static quantization to a GPT2 model. However, we currently support only a subset of tasks, such as text classification,...
Hi @tiena2cva, this work is currently in progress and follows the same line of thought as your proposal. You can follow the progress [here](https://github.com/huggingface/optimum/tree/add-causallm-with-pkv).
Hi @reelmath, I was not able to reproduce the error you are describing by running the following code: ```python from optimum.onnxruntime import ORTModelForSeq2SeqLM from transformers import AutoTokenizer, pipeline model_name =...
I was able to reproduce your error with the model `google/long-t5-tglobal-base`. It looks like the issue comes from the ONNX export. When exporting the model, the default sequence length is...
Hi @guillermo-gabrielli-fer, The issue comes from the export of the `decoder_with_past_model.onnx`, as the `past_key_values` are not correctly generated in the [`generate_dummy_inputs`](https://github.com/huggingface/transformers/blob/main/src/transformers/onnx/config.py#L637) method. We currently don't have a lot of bandwidth...
Hi @dkurt, thanks for the interesting PR. We are currently waiting for more visibility on our collaboration before deciding which library and toolkit integrations to prioritise. I...
Hi @hshen14, let's wait for the `neural-compressor` and `optimum-intel` refactoring before increasing visibility!
Sure, I will work on it and open a PR on `diffusers` once everything is finalized. Does that work for you, @hshen14?
Hi @HenryZhuHR, I'm not able to reproduce your error with optimum v1.19.1. Which optimum version are you using? Can you try: ```python from optimum.exporters.tasks import TasksManager all_files, _...