Nicolas Patry

Results 978 comments of Nicolas Patry

GitHub did not provide an action runner for M1 at the time, so builds were manual (and infrequent). Any reason you cannot upgrade to `0.13.2` or `0.12.6`? But yes...

Hmm interesting, could you try force installing 0.12.6 and see if that fixes it ? If you could share your env (Python version + hardware (m1 I guess) + requirements.txt)...

I got confused with 0.11.6, sorry! And I don't see the builds for 0.12 for arm, I'm guessing we moved to 0.13 first. TBH there "shouldn't" be any major...

You're right, it's not that important. /s Just because you haven't been affected (to your knowledge) doesn't mean it's not real. We have been receiving reports of actual attacks though,...

@monuminu Yes you need to adjust all parameters so that the requests can fit the extra VRAM left after the model is loaded.
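
To make the sizing above concrete, here is a rough back-of-the-envelope sketch of how many tokens of KV cache fit in the VRAM left once the weights are loaded. It is not how the server computes its limits. The helper names, the model shape (32 layers, 32 KV heads, head dimension 128, fp16) and the 24 GiB / 14 GiB figures are all hypothetical, and it assumes a standard attention KV cache that stores one key and one value vector per layer and KV head.

```python
GIB = 1024 ** 3


def kv_cache_bytes_per_token(num_layers, num_kv_heads, head_dim, dtype_bytes=2):
    """Bytes of KV cache needed for one token (hypothetical helper)."""
    # Factor 2: one key and one value vector per layer and KV head.
    return 2 * num_layers * num_kv_heads * head_dim * dtype_bytes


def max_tokens_in_budget(total_vram_bytes, model_bytes, bytes_per_token,
                         safety_percent=90):
    """Tokens that fit in the VRAM left after the weights (hypothetical helper)."""
    free_bytes = total_vram_bytes - model_bytes
    if free_bytes <= 0:
        return 0
    # Keep some headroom for activations and allocator overhead (assumed 10%).
    usable_bytes = free_bytes * safety_percent // 100
    return usable_bytes // bytes_per_token


# Assumed example: 24 GiB card, 14 GiB of weights, 7B-like model shape.
per_token = kv_cache_bytes_per_token(num_layers=32, num_kv_heads=32, head_dim=128)
budget = max_tokens_in_budget(24 * GIB, 14 * GIB, per_token)
print(per_token, budget)
```

The resulting budget is shared by all requests in flight, so the per-request and per-batch token limits together have to stay under it.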

> fairly similar to llama

Seems exactly the same at first glance, just fork it and make it look like llama maybe?

The Warmup phase (the one crashing) is trying to allocate the MAXIMUM possible request, mimicking your server under load.

> text_generation_launcher: Method Warmup encountered an error.

We try to...

Yes, though in general PyTorch will allocate memory however it likes, so the numbers reported by `nvidia-smi` might not reflect what is actually necessary.

0.9.3 had issues because we were using AsyncMalloc, and it seems PyTorch doesn't do a great job at tracking those allocations, leading to all sorts of issues everywhere, we did...

> There are lots of models on HF which are only offered in either F16 or exl2 format

Could you point to some? Exl2 is definitely on our todo...