Results 334 comments of fxmarty

Thank you @medphisiker for the details.

The gain would be marginal as

> Inference through CPUExecutionProvider yields garbage; this is due to a bug in FusedConv in ONNX Runtime, tracked in https://github.com/microsoft/onnxruntime/issues/14500

> Memory usage for a single-batch inference with CUDAExecutionProvider is huge...

`torch.jit.trace` is pretty much unusable with deep loops: https://github.com/pytorch/pytorch/issues/93943. I'll just go on with `torch.jit.script`.
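For context, a minimal sketch of why scripting is preferable here (assuming PyTorch is installed; `repeat_add` is a made-up example): tracing records one concrete execution and bakes in a fixed loop count, while `torch.jit.script` compiles the control flow itself.

```python
import torch

# A function with a data-dependent loop. torch.jit.trace would unroll the
# loop for the example input and freeze the iteration count; torch.jit.script
# keeps the loop, so the runtime value of n is honored.
def repeat_add(x: torch.Tensor, n: int) -> torch.Tensor:
    out = x
    for _ in range(n):
        out = out + x
    return out

scripted = torch.jit.script(repeat_add)

x = torch.ones(2)
print(scripted(x, 3))  # tensor([4., 4.])
```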

Hi, we'll be waiting for transformers to add it. Feel free to ping me again @xenova.

Thank you @bil-ash, adding it to my todos!

Will need to patch Mistral: https://github.com/huggingface/transformers/pull/31696

Failing tests are unrelated.

@contrebande-labs nothing I believe!

Hi @kanger45 @MaiZhiHao @zeke-john, https://github.com/huggingface/optimum/pull/1779 is merged, which exports Musicgen in several parts to generate audio samples conditioned on a text prompt (reference: https://huggingface.co/docs/transformers/model_doc/musicgen#text-conditional-generation). This uses the decoder KV cache...