fxmarty
Thank you. BLIP is in transformers, so we could support the ONNX export. Feel free to submit a PR if you define a config that works!
@Kodnus Can you run

```
git clone https://github.com/AutoGPTQ/AutoGPTQ.git && cd AutoGPTQ
pip uninstall auto_gptq -y
pip install -U pip setuptools wheel
pip install -vvv -e .
```

and print the full...
Hi, I am facing the same issue for models using `torch.repeat_interleave`. Edit: the issue is actually "fixed" upstream in the PyTorch 2.1 export (no more `SplitToSequence` node), thanks to https://github.com/pytorch/pytorch/pull/100575
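For context, a pure-Python sketch of what `torch.repeat_interleave` computes for a 1-D input with an integer repeat count (illustrative only, not the torch implementation):

```python
def repeat_interleave(values, repeats):
    """Repeat each element of `values` `repeats` times, preserving order.

    Mirrors the 1-D behavior of torch.repeat_interleave(x, repeats) when
    `repeats` is a plain int.
    """
    return [v for v in values for _ in range(repeats)]

print(repeat_interleave([1, 2, 3], 2))  # [1, 1, 2, 2, 3, 3]
```

The ONNX exporter used to lower this op through a `SplitToSequence` node, which is what caused the export issue above.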
Hi @Dbassqwer, I don't think it does.
Probably not trivial, as the response header is returned immediately when using streaming?

```python
import requests

session = requests.Session()
url = "http://0.0.0.0:80/generate_stream"
data = {"inputs": "Today I am in Paris...
```
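To illustrate why the header can't carry information computed during generation: with chunked transfer encoding the server commits the status line and headers first, then emits body chunks over time. A stdlib-only sketch (local toy server, not the TGI endpoint above, which behaves analogously):

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class StreamHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # chunked encoding requires HTTP/1.1

    def do_GET(self):
        self.send_response(200)
        self.send_header("Transfer-Encoding", "chunked")
        self.end_headers()  # headers are on the wire at this point
        for token in (b"Hello", b" ", b"world"):
            # Each chunk: hex length, CRLF, payload, CRLF
            self.wfile.write(f"{len(token):X}\r\n".encode() + token + b"\r\n")
        self.wfile.write(b"0\r\n\r\n")  # terminating chunk

    def log_message(self, *args):
        pass  # silence per-request logging


server = HTTPServer(("127.0.0.1", 0), StreamHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
conn.request("GET", "/")
resp = conn.getresponse()
print(resp.status)  # 200 -- already known before any body chunk is read
body = resp.read().decode()  # body arrives (and is de-chunked) afterwards
print(body)  # Hello world
server.shutdown()
```

The status code is fixed before the first token is generated, which is why surfacing generation-time results in the response header is awkward for a streaming endpoint.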
cc @Narsil design-wise, is this feasible?
@juancopi81 Thank you. Could you share the full traceback?
@reorx Any update? I love the plugin, but this one issue makes it much less useful. I can reproduce the issue only when hitting "rename" very fast. If I wait...
Thanks, can reproduce and will fix the bug shortly. Our nightly workflows for GPU appear to have been broken, will try to fix as well.