fxmarty
@solomonmanuelraj Could you explain what you would like to be supported? ``` optimum-cli export onnx -m google/owlvit-base-patch32 owlvit_onnx ``` & e.g. ``` optimum-cli onnxruntime quantize --onnx_model owlvit_onnx --output owlvit_onnx_quantized --avx512...
Hi @solomonmanuelraj, first, while investigating this I found a bug in the ONNX export of owlvit, caused by the use of numpy in the modeling code and...
Yes - not very important, but it can be useful to host wheels on the PyPI index.
@cjekel there is a bug in the current SDPA + FA2 backend using aotriton (https://github.com/ROCm/aotriton) that is being investigated and fixed. For https://github.com/ROCm/flash-attention, this is supported using the argument `attn_implementation="flash_attention_2"` when...
@michaelshekasta Approval from @ArthurZucker or @amyeroberts.
Thanks @kiszk, missed it when reordering the lists.
gentle ping @ArthurZucker @amyeroberts
@ArthurZucker @amyeroberts
@umangyadav I am curious whether a conversion from E4M3FN + scale to E4M3FNUZ + scale is implemented anywhere?
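For context on what such a conversion involves, here is a minimal pure-Python sketch. It is not taken from any library. It assumes the usual bit layouts (E4M3FN: exponent bias 7, NaN at `0x7F`/`0xFF`; E4M3FNUZ: exponent bias 8, NaN only at `0x80`, no negative zero) and a single per-tensor scale. The function names are hypothetical. Because the biases differ by one, the same bit pattern decodes to half the value in E4M3FNUZ, so the bytes can be kept and the scale doubled, with only the special codes remapped.

```python
import math


def decode_e4m3fn(byte):
    """Decode one E4M3FN byte (bias 7, NaN at 0x7F/0xFF, no infinities)."""
    sign = -1.0 if byte & 0x80 else 1.0
    exp = (byte >> 3) & 0xF
    man = byte & 0x7
    if exp == 0xF and man == 0x7:
        return math.nan
    if exp == 0:  # subnormal
        return sign * (man / 8) * 2.0 ** -6
    return sign * (1 + man / 8) * 2.0 ** (exp - 7)


def decode_e4m3fnuz(byte):
    """Decode one E4M3FNUZ byte (bias 8, NaN only at 0x80, no negative zero)."""
    if byte == 0x80:
        return math.nan
    sign = -1.0 if byte & 0x80 else 1.0
    exp = (byte >> 3) & 0xF
    man = byte & 0x7
    if exp == 0:  # subnormal
        return sign * (man / 8) * 2.0 ** -7
    return sign * (1 + man / 8) * 2.0 ** (exp - 8)


def convert_fn_to_fnuz(codes, scale):
    """Hypothetical helper: reinterpret E4M3FN bytes as E4M3FNUZ.

    Returns (new_codes, new_scale) such that
    decode_e4m3fnuz(new) * new_scale == decode_e4m3fn(old) * scale.
    """
    new_codes = []
    for byte in codes:
        if byte & 0x7F == 0x7F:
            # NaN in E4M3FN; the only NaN code in E4M3FNUZ is 0x80.
            new_codes.append(0x80)
        elif byte == 0x80:
            # -0.0 in E4M3FN would read as NaN in E4M3FNUZ; map it to +0.0.
            new_codes.append(0x00)
        else:
            new_codes.append(byte)
    # Same bits decode to half the value in E4M3FNUZ, so double the scale.
    return new_codes, scale * 2.0


# 0x38 has exponent field 7 and mantissa 0: 1.0 in E4M3FN, 0.5 in E4M3FNUZ.
codes, scale = convert_fn_to_fnuz([0x38], 0.25)
print(decode_e4m3fn(0x38) * 0.25, decode_e4m3fnuz(codes[0]) * scale)
```

With a tensor library the same idea would be a byte-level reinterpretation plus a scale update. Whether a given framework ships this as a built-in is not something this sketch claims.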
It's wild that https://github.com/pypa/pip/issues/8437 is locked.