Abhi

20 comments by Abhi

@Narsil So it seems like there are multiple causal_lm files, so you would need to apply it to all of them, not just the one I mentioned. So any file that ends with...

@ssmi153 Can you confirm the new Llama 2 GPTQ versions are working? I am getting an error with TheBloke's AutoGPTQ branch: `RuntimeError: weight model.layers.0.self_attn.q_proj.weight does not exist`. WizardCoder is also giving an...

The fix seems to work for Llama 70B but is pretty slow. Thank you! I was wondering if you have had time to check WizardCoder. Still having the issues of loading...

So the problem is in the flash santacoder modeling: it's trying to use the bits values from the weights instead of using the environment values. Manually forced the use of env variables...
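A minimal sketch of the workaround described above: prefer quantization settings from environment variables over whatever is baked into the checkpoint. The helper name is hypothetical, and the variable names `GPTQ_BITS` / `GPTQ_GROUPSIZE` are an assumption about the server's convention.

```python
import os


def get_gptq_params(env=os.environ):
    """Return (bits, groupsize) for GPTQ loading.

    Hypothetical helper illustrating the workaround: read the values
    from environment variables (GPTQ_BITS / GPTQ_GROUPSIZE are assumed
    names) instead of from fields stored alongside the weights.
    """
    bits = env.get("GPTQ_BITS")
    groupsize = env.get("GPTQ_GROUPSIZE")
    if bits is None or groupsize is None:
        raise RuntimeError(
            "Set GPTQ_BITS and GPTQ_GROUPSIZE to override checkpoint values"
        )
    return int(bits), int(groupsize)
```

In the actual modeling code, the fix amounts to calling something like this instead of reading the bits field out of the weight files.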

Thank you for providing insight. Yeah, it's odd that it was working just fine before but broke after the new updates to the image stuff. Will look forward to it...

Hello, checking in to see if there are any updates regarding this.

@Narsil Any insights on this that could help me run this model? Not sure what I am missing.

Found the issue. Inside flash_qwen2.py, the tokenizer being used is not working. Switched the config and tokenizer to this:

```
tokenizer = AutoTokenizer.from_pretrained(
    model_id,
    revision=revision,
    padding_side="left",
    truncation_side="left",
    trust_remote_code=trust_remote_code,
)
config...
```

The test code you provided: would it work to build the server locally and run it? Wondering, as I'd love to try it out.