fxmarty

Results 316 comments of fxmarty

No: https://github.com/huggingface/optimum/issues/1955#issuecomment-2231171067

Hi @dawnik17 @Dev4011 Transformers natively supports the scaled dot product attention operator from PyTorch, which was previously integrated through BetterTransformer: https://github.com/huggingface/transformers/blob/e0dfd7bcaf7ff0723085f23244a755cc2ed92466/src/transformers/models/phi3/modeling_phi3.py#L614 What is not available in Transformers is the support of...
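For context, PyTorch's `scaled_dot_product_attention` operator computes `softmax(Q K^T / sqrt(d)) V`; a minimal NumPy sketch of that math (a reference illustration only, not the fused PyTorch kernel) could look like:

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """Reference: softmax(Q K^T / sqrt(d)) V over the last two dims."""
    d = q.shape[-1]
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)  # (batch, q_len, k_len)
    # numerically stable softmax over the key dimension
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v  # (batch, q_len, d)

rng = np.random.default_rng(0)
q = rng.standard_normal((1, 4, 8))
k = rng.standard_normal((1, 4, 8))
v = rng.standard_normal((1, 4, 8))
out = scaled_dot_product_attention(q, k, v)
print(out.shape)  # (1, 4, 8)
```

`torch.nn.functional.scaled_dot_product_attention` implements the same computation but dispatches to fused backends (FlashAttention, memory-efficient attention) when available.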

Hi @yuqie, thank you. What happens after launching

```
text-generation-benchmark --tokenizer-name meta-llama/Meta-Llama-3-70B-Instruct \
  --sequence-length 2048 --decode-length 128 --warmups 2 --runs 10 \
  -b 1 -b 2
```

in the second...

It appears this is still failing.

@michaelbenayoun can you try to add `exporters-tf` to https://github.com/huggingface/optimum/blob/2105a8ac311168b05ca433f8a38774698c666211/.github/workflows/test_cli.yml#L33 ? (for some reason I can't push on your branch)

Thank you @xenova, would you like to use the model patcher to patch this bit of code for the TorchScript export?