Nicolas Patry
Thanks for the kind words. Asking for a large `max_new_tokens` all the time will mean the router reserves a lot of tokens for that particular query, meaning it will be less...
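To illustrate why a large `max_new_tokens` affects scheduling, here is a minimal, hypothetical sketch of how a continuous-batching router might budget tokens. The names (`Request`, `fits`, `token_budget`) are illustrative and are not TGI internals:

```python
# Hypothetical sketch: a router must reserve room for the WORST case
# (prompt length + max_new_tokens), even if generation stops earlier.
from dataclasses import dataclass


@dataclass
class Request:
    input_len: int
    max_new_tokens: int


def fits(batch_tokens: int, req: Request, token_budget: int) -> bool:
    # Reserve the full max_new_tokens up front; a request that might
    # overflow the budget cannot be admitted into the running batch.
    return batch_tokens + req.input_len + req.max_new_tokens <= token_budget


budget = 4096
print(fits(3000, Request(input_len=100, max_new_tokens=512), budget))   # True
print(fits(3000, Request(input_len=100, max_new_tokens=1024), budget))  # False
```

So a query that always asks for the maximum crowds out other requests from the batch, even if it ends up generating only a few tokens.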
I don't know what it could be. The first load is fast, then subsequent loads are slow. This is odd indeed, since normally it should be the other way around...
> So is that something you can fix in safetensors or do we need some option in webui to allow alternative loading method? Unfortunately, this might be a WSL/Windows thing,...
> set SAFETENSORS_FAST_GPU=1 This one shouldn't have any effect for versions > `0.3.0` anymore... odd.
The issue seems to stem from WSL and memory mapping not playing well together: https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/11216 Can you confirm?
Does item 2 from here https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/11216#issuecomment-1593378136 help? If so, it's definitely a memory-map issue, but what's really odd is that I'm never able to reproduce it (I'm using...
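For context, the difference between the two access patterns can be sketched with only the standard library. This is a minimal illustration of the concept, not the safetensors code itself: memory mapping pages data in lazily (which can be slow through the WSL filesystem bridge), while a plain read pulls everything into RAM in one sequential pass:

```python
import mmap
import os
import tempfile

# Write a small dummy file standing in for a weights file.
path = os.path.join(tempfile.mkdtemp(), "weights.bin")
with open(path, "wb") as f:
    f.write(b"\x00" * 1024)

# 1) Memory-mapped access (safetensors' default): the OS faults pages in
#    on demand, which is fast natively but can degrade across WSL.
with open(path, "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    first = mm[:16]  # only these bytes are actually paged in
    mm.close()

# 2) Plain read into RAM (the usual workaround): one sequential read,
#    no page faults later during tensor access.
with open(path, "rb") as f:
    data = f.read()

print(len(first), len(data))  # 16 1024
```

If the workaround helps, that would point at the mmap path being the slow part on WSL rather than safetensors itself.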
Very odd, the version is indeed `11.8` in the Dockerfile for 0.9.4: https://github.com/huggingface/text-generation-inference/blob/v0.9.4/Dockerfile#L44
> huggingface/text-generation-inference:0.9.1 Try actually using `0.9.4`?
The error means that you're trying to load a CUDA kernel that was compiled against a different CUDA version. I'm going to try to confirm this.
Hmm, I'm confused. I indeed see: ``` >>> torch.version.cuda '11.7' ``` However, the build script definitely asks for 11.8... I'm going to stop for today, if you...
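A quick way to surface this kind of mismatch is to compare the runtime CUDA version string (what `torch.version.cuda` reports, e.g. `'11.7'`) against the version the kernels were built for. A minimal hedged sketch, with a hypothetical helper name:

```python
# Hypothetical helper: compare CUDA version strings like "11.7" vs "11.8".
# Kernels compiled for one minor version generally will not load under
# another, which is consistent with the error above.
def cuda_versions_match(runtime: str, build: str) -> bool:
    parse = lambda v: tuple(int(x) for x in v.split("."))
    return parse(runtime) == parse(build)


print(cuda_versions_match("11.7", "11.8"))  # False
print(cuda_versions_match("11.8", "11.8"))  # True
```

In practice you would pass `torch.version.cuda` as the runtime side and the version pinned in the Dockerfile as the build side.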