Nicolas Patry
Cannot reproduce on our end. Can you reproduce with the docker image? Environment and dependencies can impact what's happening. Also, are you all running on main?
I still cannot reproduce. Can you try upgrading to the latest version, 1.4.5? Also, the error occurs in the causal LM, which is not supposed to happen; this model should be...
I'm not sure we want to be 100% `spm` compliant on the training side. @n1t0? One goal of this library is to be as modular as possible, so taking...
Many things here: overall, I think that in order to make any changes, we would need some kind of benchmark to judge the overall quality of the final tokenization on various datasets....
Try disabling flash attention; I don't think the A800 is supported by it. Setting `USE_FLASH_ATTENTION=false` in your env should do it.
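For example, something along these lines should work (a rough sketch; the image tag, ports, and model id below are placeholders to adapt to your setup):

```bash
# Sketch: launch the TGI container with flash attention disabled via the env var.
# Image tag, ports, and --model-id are placeholders.
docker run --gpus all --shm-size 1g -p 8080:80 \
    -e USE_FLASH_ATTENTION=false \
    ghcr.io/huggingface/text-generation-inference:latest \
    --model-id <your-model-id>
```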
Ah, can you maybe try any other model, to see if it's GPTQ + triton that doesn't work on the A800? (Don't have access right now to reproduce.)
> tiktoken is supposed to be much faster than tokenizers for BPE tokenizers.

Proof, please. Also, proof that the difference in speed is actually relevant in real-world use cases....
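For context, the kind of proof I'd want starts with something like the rough sketch below, run on realistic data (the corpus and tokenizer choices here are toy placeholders, not a real benchmark):

```python
# Rough sketch: compare batch-encoding wall time for the same corpus.
# The toy corpus and "gpt2" encodings are placeholders; a real benchmark needs
# realistic documents, warm-up runs, and multiple repetitions.
import time
import tiktoken
from tokenizers import Tokenizer

texts = ["The quick brown fox jumps over the lazy dog."] * 10_000

hf_tok = Tokenizer.from_pretrained("gpt2")  # tokenizers BPE
tk_enc = tiktoken.get_encoding("gpt2")      # tiktoken BPE

start = time.perf_counter()
hf_tok.encode_batch(texts)
print(f"tokenizers: {time.perf_counter() - start:.3f}s")

start = time.perf_counter()
tk_enc.encode_batch(texts)
print(f"tiktoken:   {time.perf_counter() - start:.3f}s")
```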
> performance

What kind? PPL? Yes, but usually it's acceptable. Latency? No, it doesn't in our prod. It actually helps quite a lot because there's a lot more...
Will get superseded by: https://github.com/huggingface/text-generation-inference/pull/438
This is very cool! Definitely a good target for audio-to-audio as a starter (no widget needed). `audio-segmentation` seems like a good fit for what you're trying to do (does...