fxmarty

Results: 332 comments by fxmarty

Hello, do you think the issue is related to running on Apple M1? Did you try to run on an x86_64 CPU? Could it be that PyTorch leverages the Apple...

I have never tried using onnxruntime with Apple devices, which is why I am curious about it. I am not sure about the logits; I only wanted to check runtime...

Ok, that's great! So apparently quantization on Apple M1 with onnxruntime is not that great. cc @mfuntowicz @hollance if you have any idea. I run Optimum 1.2.3.dev0 (dev version from...

@lewisbails @hollance I think there are two very distinct issues here: 1/ runtime latency/throughput, 2/ accuracy and other quality metrics. In my example above I was only focusing on runtime, as I...
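Since the comment above separates runtime from accuracy, here is a minimal, hypothetical sketch of the kind of latency measurement it refers to; `measure_latency` and `run_fn` are illustrative names, with `run_fn` standing in for one forward pass of whichever model is under test:

```python
import time

import numpy as np

def measure_latency(run_fn, n_iters=100, n_warmup=10):
    """Return mean/std wall-clock latency in ms of a callable (hypothetical helper)."""
    for _ in range(n_warmup):  # warm up caches, thread pools, etc.
        run_fn()
    timings = []
    for _ in range(n_iters):
        start = time.perf_counter()
        run_fn()
        timings.append((time.perf_counter() - start) * 1e3)
    return np.mean(timings), np.std(timings)
```

Accuracy, by contrast, has to be checked separately on a labeled evaluation set; a model can be fast after quantization yet lose accuracy, which is why the two issues are distinct.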

```
python run_glue.py --model_name_or_path philschmid/tiny-bert-sst2-distilled --task_name sst2 \
    --quantization_approach static --calibration_method percentile \
    --num_calibration_samples 104 --do_eval \
    --output_dir /tmp/quantized_distilbert_sst2 --max_eval_samples 100
```
works fine. So apparently we need the number...

Hello, I could not reproduce the issue with
```
torch==1.11.0+cu113
onnx==1.11.0
onnxruntime==1.11.1
optimum==main
Python 3.9.12
```
Could you try the following self-contained script? Run `python -m transformers.onnx output_dir -m transformers.onnx...
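The self-contained script itself is cut off in the preview above; as a rough, hypothetical stand-in for that kind of repro (the checkpoint name and output path are illustrative), one could export with the `transformers.onnx` CLI and run the result with onnxruntime:

```python
import subprocess

import onnxruntime as ort
from transformers import AutoTokenizer

model_id = "distilbert-base-uncased"  # illustrative checkpoint

# Export the model to ONNX with the transformers.onnx CLI.
subprocess.run(
    ["python", "-m", "transformers.onnx", f"--model={model_id}", "onnx_out"],
    check=True,
)

# Run the exported graph with onnxruntime on a dummy input.
tokenizer = AutoTokenizer.from_pretrained(model_id)
inputs = tokenizer("Hello world!", return_tensors="np")

session = ort.InferenceSession("onnx_out/model.onnx")
outputs = session.run(None, dict(inputs))
print(outputs[0].shape)
```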

Closing this as https://github.com/huggingface/optimum/pull/271 was merged and includes this option. Thanks @sam-h-bean for pointing it out; let us know if this solves your issue!

In my experience, **by default** onnxruntime uses only physical cores, while PyTorch may use hyperthreading. For example, on my laptop with 10 physical cores (but 2 threads per core), `torch.get_num_threads()`...
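For illustration, a minimal sketch comparing the two defaults and pinning onnxruntime's intra-op thread count explicitly (the `model.onnx` path is a placeholder, and the halving assumes 2 threads per physical core):

```python
import os

import torch
import onnxruntime as ort

# os.cpu_count() counts logical cores, i.e. hyperthreads included.
print("logical cores:", os.cpu_count())
# PyTorch's intra-op default may match the logical core count.
print("torch threads:", torch.get_num_threads())

# onnxruntime defaults to physical cores; the count can be set explicitly.
options = ort.SessionOptions()
options.intra_op_num_threads = os.cpu_count() // 2  # e.g. pin to physical cores
# session = ort.InferenceSession("model.onnx", sess_options=options)  # placeholder path
```

This mismatch in thread defaults is one reason raw latency comparisons between PyTorch and onnxruntime can be misleading unless both are pinned to the same core count.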

https://github.com/huggingface/optimum/pull/271 was merged; it helps to fix the number of cores used.

Hello @MiladMolazadeh, by coincidence I ran into the same issue today! Would https://github.com/huggingface/optimum/pull/271 solve your issue? I propose the following workflow, provided the above code is merged: ```python from...
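The proposed workflow is truncated above; as a hedged sketch of what it might look like once the PR is merged (the checkpoint name and thread count are illustrative, and the `session_options` argument is an assumption about the API that PR adds):

```python
import onnxruntime as ort
from optimum.onnxruntime import ORTModelForSequenceClassification

# Cap the intra-op thread pool before the session is created.
options = ort.SessionOptions()
options.intra_op_num_threads = 4  # illustrative value

# Illustrative checkpoint; session_options is assumed to be forwarded
# to the underlying onnxruntime.InferenceSession.
model = ORTModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased-finetuned-sst-2-english",
    from_transformers=True,
    session_options=options,
)
```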