Thanks, hoping to see this year's video.
May I ask whether this issue has been resolved? I'm hitting exactly the same error, and it still doesn't work after updating those three libraries. I'm fine-tuning the Mixtral 8x22B model.
got the same error
blocked on the same issue.
Same here. I'm trying to use multiple A100 (80GB) GPUs to LoRA fine-tune with a 32k context length, and I keep getting OOM.
What do you mean by training with quantization? Like QLoRA + FSDP? I tried a 32k context on 8x A100 and still get OOM for the 70B model.
@hiyouga does --use_unsloth_gc work in all situations, including qlora+fsdp, ds_zero3, and ds_zero3_cpu_offload? For reference, my setup looks roughly like the sketch below.
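This is only a rough sketch of the QLoRA side of what I'm running, written with plain transformers + peft rather than the LLaMA-Factory CLI. The model id, LoRA hyperparameters, and target modules are placeholders from my own runs, and the FSDP / ZeRO-3 sharding (and the --use_unsloth_gc option itself) are configured separately through accelerate / DeepSpeed / LLaMA-Factory, so they are not shown here:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization (QLoRA-style); compute in bf16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

# Placeholder model id -- swap in the 70B checkpoint you are actually tuning.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-70b-hf",
    quantization_config=bnb_config,
    torch_dtype=torch.bfloat16,
)

# Prepare the quantized model for k-bit training and enable gradient
# checkpointing, which is the main lever against activation OOM at 32k context.
model = prepare_model_for_kbit_training(model)
model.gradient_checkpointing_enable()

# Placeholder LoRA hyperparameters.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```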
@sfc-gh-zhwang is there a timeline for when this feature can be merged?
Hi, may I ask when this PR can be merged, so we can use LoRA + chunked_prefill as soon as possible?
What's the current status of Helm chart support? Looking forward to it.