fxmarty
Yes, this was fixed in https://github.com/huggingface/optimum/pull/1780, which is not yet in a release. Please downgrade to onnx 1.15 or use optimum from source.
Hi @MrRace, if you don't want to reimplement the inference code from scratch, I advise you to use https://huggingface.co/docs/optimum/main/en/onnxruntime/package_reference/modeling_ort#optimum.onnxruntime.ORTModelForSpeechSeq2Seq. An example is available there. By default, only `encoder_model.onnx` and `decoder_model_merged.onnx`...
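For reference, here is a minimal sketch of what I have in mind, assuming a Whisper checkpoint (the model id and the dummy audio below are placeholders, substitute your own):

```python
import numpy as np
from transformers import AutoProcessor
from optimum.onnxruntime import ORTModelForSpeechSeq2Seq

model_id = "openai/whisper-tiny"  # placeholder checkpoint

# Export to ONNX on the fly and run generation through ONNX Runtime.
processor = AutoProcessor.from_pretrained(model_id)
model = ORTModelForSpeechSeq2Seq.from_pretrained(model_id, export=True)

# `audio` stands in for a real 16 kHz mono waveform.
audio = np.zeros(16000, dtype=np.float32)
inputs = processor(audio, sampling_rate=16000, return_tensors="pt")

generated_ids = model.generate(inputs.input_features)
print(processor.batch_decode(generated_ids, skip_special_tokens=True))
```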
@anilmartha thank you for the report; this is unexpected. I did not add a full CI for bf16 but should probably add one covering the most used models. It appears...
Hi @clarinevong, I cannot reproduce the issue on Linux; this is likely a PyTorch x Windows bug. I would recommend opening a bug report in the PyTorch repo (although the...
Thank you for giving it a try on Linux! I still cannot reproduce, using Python 3.10.14 and

```
optimum==1.18.1
torch==2.2.2+cu118
transformers==4.39.3
onnx==1.15.0
onnxruntime==1.17.1
```

Could you share your `pip freeze`?
Hi @OE-LUCIFER, can you give me a reference for HelpingAI? I cannot find it in Transformers.
@kazssym let me know if you'd like a review.
Hi @kapilsingh93, thank you. I can reproduce (only on a CUDA device, though); this is not expected, sorry for the issue. Let me fix it shortly.
@kapilsingh93 Interestingly, downgrading to torch 2.0.1 fixes the issue... It may be a torch regression. I hit the issue even with `torch.backends.cuda.sdp_kernel(enable_flash=False, enable_math=True, enable_mem_efficient=False)`, and only on a CUDA device.
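For context, this is roughly how I constrained the SDPA backend while reproducing; a sketch only, the shapes and dtype below are arbitrary:

```python
import torch
import torch.nn.functional as F

# Arbitrary attention inputs, just to exercise kernel selection on CUDA.
q = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)
k = torch.randn_like(q)
v = torch.randn_like(q)

# Force the math backend only, ruling out flash / memory-efficient kernels
# as the source of the discrepancy.
with torch.backends.cuda.sdp_kernel(
    enable_flash=False, enable_math=True, enable_mem_efficient=False
):
    out = F.scaled_dot_product_attention(q, k, v)
```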
@kapilsingh93 It would help debugging if you could confirm whether using torch 2.0.1 brings back equal performance.