fxmarty

Results 332 comments of fxmarty

Hi @solomonmanuelraj, the following is working:

```python
from PIL import Image
from transformers import OwlViTProcessor, OwlViTForObjectDetection

model_id = "google/owlvit-base-patch16"
owlbit8_model = OwlViTForObjectDetection.from_pretrained(model_id, device_map="auto", load_in_8bit=False)
owlbit8_model.save_pretrained("owlvit", save_config=True, safe_serialization=False)
```

and `optimum-cli...
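For reference, a minimal sketch of what the export step could look like from Python, assuming the `owlvit` directory saved above and the `main_export` helper from optimum (the output directory and task name here are illustrative, not the exact command from the truncated comment):

```python
from optimum.exporters.onnx import main_export

# Export the locally saved OWL-ViT checkpoint to ONNX.
# "zero-shot-object-detection" is the task name assumed here for OWL-ViT.
main_export(
    model_name_or_path="owlvit",   # directory produced by save_pretrained above
    output="owlvit_onnx",          # where the ONNX files will be written
    task="zero-shot-object-detection",
)
```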

Thank you @solomonmanuelraj. `load_in_8bit=True` is not the only option available to use quantization. This argument specifically uses the quantization scheme from the bitsandbytes library, which can not be exported to...

@solomonmanuelraj I see. I am not sure about your problem. If at the end of the day your goal is to obtain a quantized ONNX model, your best bet would...
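If the end goal is a quantized ONNX model, one common route (a sketch only, not necessarily the option the truncated comment had in mind; file paths are placeholders) is dynamic quantization with onnxruntime on the already exported model:

```python
from onnxruntime.quantization import quantize_dynamic, QuantType

# Dynamically quantize the exported ONNX weights to int8.
# Input/output paths are placeholders for the files produced by the ONNX export.
quantize_dynamic(
    model_input="owlvit_onnx/model.onnx",
    model_output="owlvit_onnx/model_quantized.onnx",
    weight_type=QuantType.QInt8,
)
```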

Hi @solomonmanuelraj, if you use the optimum dev branch (cloning from GitHub and installing locally), this is fixed by https://github.com/huggingface/optimum/pull/1650. You will be able to export the model with an opset...
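For completeness, a short sketch of passing an opset at export time with the same `main_export` helper as above (the opset value and paths are only examples):

```python
from optimum.exporters.onnx import main_export

# Export with an explicitly chosen ONNX opset (the value here is illustrative).
main_export(
    model_name_or_path="google/owlvit-base-patch16",
    output="owlvit_onnx",
    opset=16,
)
```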

Thank you @giamic, adding it to the todo :)

Hi @giamic, this one is highly non-trivial. I'm working on it this week.

@xenova @giamic I am planning to export a model whose I/O is the same as https://github.com/huggingface/transformers/blob/f01e1609bf4dba146d1347c1368c8c49df8636f6/src/transformers/models/encodec/modeling_encodec.py#L575 and https://github.com/huggingface/transformers/blob/f01e1609bf4dba146d1347c1368c8c49df8636f6/src/transformers/models/encodec/modeling_encodec.py#L703. Does that sound fine to you for your use cases? Subparts (quantizer,...

@giamic Exactly. Specifically, I was thinking there would be (following the above `encode` & `decode` functions):

* `encodec_encode.onnx` that takes `input_values` (audio), returns `encoded_frames` of shape `(nb_frames, batch_size, num_quantizers, chunk_length)`...
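To illustrate the intended I/O, a minimal sketch of running such an export with onnxruntime, assuming the file and tensor names above land exactly as described (the input shape and values are placeholders):

```python
import numpy as np
import onnxruntime as ort

# Load the (planned) encoder export and run it on a dummy mono audio batch.
session = ort.InferenceSession("encodec_encode.onnx")

# Dummy input: (batch_size, channels, sequence_length) raw audio; values are placeholders.
input_values = np.random.randn(1, 1, 24000).astype(np.float32)

outputs = session.run(None, {"input_values": input_values})
encoded_frames = outputs[0]  # expected shape: (nb_frames, batch_size, num_quantizers, chunk_length)
print(encoded_frames.shape)
```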

@xenova If you add a test with a tiny model we can merge this!

Let's merge this once https://github.com/huggingface/transformers/pull/30065 is released