Nicolas Patry

Results: 978 comments by Nicolas Patry

Your DNS resolver is the issue here. This is most likely linked to your cluster not appreciating the throughput we're sending it. You could use `-e HF_HUB_ENABLE_HF_TRANSFER=0` to reduce the bandwidth/network calls...

First of all, this file might fail to load regardless, because this repo pushes `gptq_bits` and `gptq_groupsize` into the file itself to be able to know what kind of quantization...
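To illustrate what "pushed into the file itself" could look like, here is a minimal sketch that reads `gptq_bits` and `gptq_groupsize` back out of a safetensors file using only the standard library. It relies on the public safetensors layout (an 8-byte little-endian header length, a JSON header, then raw tensor bytes). Storing each value as a one-element `I32` tensor is an assumption made for this example, and the helper names `write_demo_file` and `read_quantize_params` are hypothetical, not part of any library.

```python
import json
import os
import struct
import tempfile


def write_demo_file(path, scalars):
    """Write a tiny safetensors file holding one-element I32 tensors (demo only)."""
    header, data = {}, b""
    for name, value in scalars.items():
        start = len(data)
        data += struct.pack("<i", value)  # assumption: values stored as int32
        header[name] = {"dtype": "I32", "shape": [1], "data_offsets": [start, len(data)]}
    header_bytes = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))  # 8-byte header length
        f.write(header_bytes)
        f.write(data)


def read_quantize_params(path):
    """Return {'gptq_bits': ..., 'gptq_groupsize': ...} read from the file itself."""
    params = {}
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
        data_start = 8 + header_len  # offsets are relative to the end of the header
        for name in ("gptq_bits", "gptq_groupsize"):
            if name not in header:
                raise KeyError(f"{name} is missing: the file was not quantized this way")
            begin, end = header[name]["data_offsets"]
            f.seek(data_start + begin)
            (params[name],) = struct.unpack("<i", f.read(end - begin))
    return params


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "model.safetensors")
    write_demo_file(demo_path, {"gptq_bits": 4, "gptq_groupsize": 128})
    print(read_quantize_params(demo_path))
```

A file that lacks these two entries raises a `KeyError` here, which mirrors the point above: a checkpoint produced by a different quantization script would fail to load under this convention.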

This should work: https://huggingface.co/huggingface/falcon-40b-gptq

Try a non-network disk? `No such file or directory: ` means your network disk somehow reported that the file didn't exist...

You need to set the `HUGGING_FACE_HUB_TOKEN` environment variable to use a proper token.
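As a rough sketch, a wrapper script could check that the variable is actually set before starting anything that needs authenticated access. The helper name `hub_token` is hypothetical; only the `HUGGING_FACE_HUB_TOKEN` name comes from the comment above.

```python
import os


def hub_token(environ=os.environ):
    """Return the token from HUGGING_FACE_HUB_TOKEN, or fail with a clear message."""
    token = environ.get("HUGGING_FACE_HUB_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "HUGGING_FACE_HUB_TOKEN is not set; export it before launching"
        )
    return token


if __name__ == "__main__":
    try:
        hub_token()
        print("token found")
    except RuntimeError as err:
        print(err)
```

Failing early like this avoids a less obvious authorization error later, when the download of a gated or private model starts.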

> which injects additional metadata into the model files. Do you have a solution to detect number of bits and groupsize at inference which doesn't require users to know this...

We could add flags again to allow reusing those, but I honestly don't like it long term. (Every user needs to remember to specify the flags, and go on the...

Just created a PR for it. I don't really like maintaining weird, out-of-flow things in general (because now a lot of places might start to be careful about this and...

Hey, thanks for the PR! Unfortunately, that metadata is kept for hard debugging, but it's missing crucial information: namely, it doesn't recall whether the tensor was a slice or...

> However, it is still only a fix for this specific model.

Indeed, but the other one can lead to potentially catastrophic failure (loading the wrong weights) which is even...