Alan May
As a temporary solution, you can [convert the GPTQ 4-bit model locally](https://github.com/qwopqwop200/GPTQ-for-LLaMa/tree/cuda#llama). I will test compatibility with other models released by TheBloke.
@VGEAREN I made a similar modification before, but it has a problem: it is not compatible with the openai python sdk, because it **sends a ping event**...
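To illustrate the incompatibility: the openai python sdk expects every `data:` line in the SSE stream to be a JSON chunk (or the literal `[DONE]`), so an extra ping event trips up its parser. Below is a minimal sketch of a raw SSE consumer that tolerates such events; the function name, URL, and payload are placeholders, not FastChat's actual API, and `requests` is used here in place of the SDK:

```python
import json
import requests  # assumption: plain HTTP client instead of the openai sdk

def stream_chat(url: str, payload: dict):
    """Consume an OpenAI-style SSE stream, skipping non-JSON ping events."""
    with requests.post(url, json=payload, stream=True) as resp:
        for raw in resp.iter_lines(decode_unicode=True):
            # Blank keep-alives and SSE comment lines (e.g. ": ping") lack a
            # "data:" prefix, so they are filtered out here.
            if not raw or not raw.startswith("data:"):
                continue
            data = raw[len("data:"):].strip()
            if data == "[DONE]":
                break
            try:
                chunk = json.loads(data)
            except json.JSONDecodeError:
                continue  # e.g. a ping payload that is not a JSON chunk
            yield chunk
```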
@merrymercy I can help with the test, since I had the same problem before. I will update with results later. --- Update: tried this PR with 4*A100 (80G); training works, but it OOMs when saving....
@merrymercy @zhisbug I tried several different settings with the FSDP API; all of them failed when saving the model. But based on [this comment](https://github.com/tatsu-lab/stanford_alpaca/issues/81#issuecomment-1494614864), I finally managed to save the model with **python3.10**...
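For anyone hitting the same save-time OOM, the workaround discussed in that thread boils down to gathering the full state dict on CPU, on rank 0 only, before writing it to disk. A rough sketch with the PyTorch FSDP API (the function and path names are placeholders, and exact API details vary across torch versions):

```python
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from torch.distributed.fsdp import StateDictType, FullStateDictConfig

def save_fsdp_model(model: FSDP, path: str) -> None:
    """Gather the full state dict on CPU (rank 0 only) to avoid GPU OOM at save time."""
    cfg = FullStateDictConfig(offload_to_cpu=True, rank0_only=True)
    with FSDP.state_dict_type(model, StateDictType.FULL_STATE_DICT, cfg):
        state = model.state_dict()  # materialized on CPU, populated only on rank 0
    if dist.get_rank() == 0:
        torch.save(state, path)
```

Offloading to CPU keeps the gathered weights out of GPU memory, which is what makes the difference on setups where training fits but saving does not.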
Hello, I have a similar problem. May I ask what your starting loss value is? Mine starts decreasing from 8.0.
@zhisbug Hi, I made a new PR to add GPTQ-4bit support; can you take a look and give some advice? Thanks! #1209
please🙏