GLiNER
ONNX-converted model has slower inference
I fine-tuned the GLiNER small v2.1 model and created an ONNX version of it using the convert_to_onnx.ipynb example code. When I compared the inference time of the two models, the ONNX version took 50% more time than the original.
This is how I'm loading the model: model = GLiNER.from_pretrained(model_path, load_onnx_model=True, load_tokenizer=True)
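For reference, here is a minimal sketch of how I timed the two models. The benchmark helper is generic; `model.predict_entities` is the GLiNER prediction call, and the text, labels, and run counts are just placeholders from my setup. The warm-up runs matter because the first call can include one-time costs (ONNX session setup, tokenizer caching) that would otherwise skew the average.

```python
import time

def benchmark(fn, warmup=2, runs=10):
    """Return the mean wall-clock seconds per call of fn.

    Warm-up calls are excluded so one-time initialization costs
    (e.g. ONNX session creation) do not inflate the measurement.
    """
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) / runs

# Hypothetical usage with both model variants loaded:
# pytorch_time = benchmark(lambda: model_pt.predict_entities(text, labels))
# onnx_time = benchmark(lambda: model_onnx.predict_entities(text, labels))
# print(f"PyTorch: {pytorch_time:.4f}s  ONNX: {onnx_time:.4f}s")
```

Even with warm-up excluded, the ONNX model is consistently slower per call.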