Billy Cao
Yes, I can make a PR if you wish. I already have it implemented in my fork: https://github.com/aliencaocao/OmniParser/commit/a3fe0e11c5b67a6ed8de916cdd77ddffba4d135f
Setting it to either 0.7 or 0.5 did not improve things for this particular example. I still think PaddleOCR is a clearly better alternative.
> > Yes, I can make a PR if you wish, I already have it impl in my fork: [aliencaocao@a3fe0e1](https://github.com/aliencaocao/OmniParser/commit/a3fe0e11c5b67a6ed8de916cdd77ddffba4d135f)
>
> Feel free to make a PR

PR made
Yep, I can confirm that `pip3 install -U xformers --index-url https://download.pytorch.org/whl/cu124` also doesn't work on Linux; there is no wheel.
It could be CPU-bound because the implementation is essentially a lot of for loops over np arrays. It is surely not as optimized as it could be. The perf...
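To illustrate the general point about Python loops over np arrays, here is a minimal sketch that computes pairwise box IoU twice: once with nested loops and once with NumPy broadcasting. The implementation discussed above is not shown here, so the function names, the `(x1, y1, x2, y2)` box layout, and the choice of IoU as the workload are all assumptions made for illustration only.

```python
import numpy as np


def pairwise_iou_loop(a, b):
    """Pairwise IoU with nested Python loops (the slow, CPU-bound pattern)."""
    out = np.zeros((len(a), len(b)))
    for i, (ax1, ay1, ax2, ay2) in enumerate(a):
        for j, (bx1, by1, bx2, by2) in enumerate(b):
            iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
            ih = max(0.0, min(ay2, by2) - max(ay1, by1))
            inter = iw * ih
            union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
            out[i, j] = inter / union if union > 0 else 0.0
    return out


def pairwise_iou_vectorized(a, b):
    """Same result via broadcasting: no Python-level loop over boxes."""
    a = np.asarray(a, dtype=float)[:, None, :]  # shape (N, 1, 4)
    b = np.asarray(b, dtype=float)[None, :, :]  # shape (1, M, 4)
    iw = np.clip(np.minimum(a[..., 2], b[..., 2]) - np.maximum(a[..., 0], b[..., 0]), 0, None)
    ih = np.clip(np.minimum(a[..., 3], b[..., 3]) - np.maximum(a[..., 1], b[..., 1]), 0, None)
    inter = iw * ih
    area_a = (a[..., 2] - a[..., 0]) * (a[..., 3] - a[..., 1])
    area_b = (b[..., 2] - b[..., 0]) * (b[..., 3] - b[..., 1])
    union = area_a + area_b - inter
    # Avoid division by zero for degenerate boxes.
    return np.divide(inter, union, out=np.zeros_like(inter), where=union > 0)
```

Both functions return the same matrix; the vectorized one pushes the per-pair work into NumPy's compiled code, which is usually where the speedup comes from when a loop-heavy implementation is CPU-bound.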
You can use torch2onnx to convert the caption models yourself. The icon detector (YOLOv8) also supports ONNX export, as described in the Ultralytics docs.
OCR and the icon detector run in parallel. Results are merged and overlapping boxes are removed, prioritising OCR boxes. The remaining icon boxes are sent to captioning. There are 2 models...
PaddleOCR can fail on icons that look like a letter, since it's character-based. You can try turning it off.
Hi @xenova, I see that you have already done it in https://huggingface.co/Xenova/siglip-large-patch16-384. May I know how you exported it, since it is not supported in Optimum yet?
Thanks. Do you know if this can be used with the HF pipeline?
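The merge step described above can be sketched roughly as follows. This is not the project's actual code: the function names, the `(x1, y1, x2, y2)` box format, the use of IoU as the overlap measure, and the `0.5` threshold are all assumptions chosen for illustration.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed format


def box_area(box: Box) -> float:
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = box_area(a) + box_area(b) - inter
    return inter / union if union > 0 else 0.0


def merge_boxes(
    ocr_boxes: List[Box],
    icon_boxes: List[Box],
    iou_threshold: float = 0.5,  # hypothetical value, not taken from the project
) -> Tuple[List[Box], List[Box]]:
    """Keep every OCR box; drop icon boxes that overlap any OCR box.

    Returns (merged_boxes, icons_to_caption), where icons_to_caption are
    the surviving icon boxes that would go on to the caption model.
    """
    icons_to_caption = [
        icon
        for icon in icon_boxes
        if all(iou(icon, ocr) <= iou_threshold for ocr in ocr_boxes)
    ]
    return list(ocr_boxes) + icons_to_caption, icons_to_caption


merged, to_caption = merge_boxes(
    ocr_boxes=[(0, 0, 10, 10)],
    icon_boxes=[(1, 1, 9, 9), (20, 20, 30, 30)],
)
```

In this example the first icon box overlaps the OCR box heavily and is dropped, so only the second icon box is left for captioning. OCR boxes are never removed, which is what gives them priority.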