mces89

22 comments by mces89

Thanks, hoping to see this year's video.

May I ask whether this issue has been resolved? I'm getting exactly the same error, and it still doesn't work after updating those three libraries. I'm fine-tuning the mistral 8x22B model.

Got the same error.

Same here. I'm trying to use multiple A100 (80GB) GPUs to LoRA fine-tune with a 32k context length, and I keep getting OOM.
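For reference, here's roughly the setup I mean, with the usual memory-saving knobs (gradient checkpointing + bf16) turned on. This is a minimal sketch assuming transformers + peft; the model id and LoRA hyperparameters are placeholders, not values from this thread:

```python
# Minimal long-context LoRA setup sketch (assumptions: transformers + peft;
# model id and hyperparameters are placeholders).
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-70b-hf",          # placeholder model id
    torch_dtype=torch.bfloat16,            # bf16 halves weight/activation memory vs fp32
)
model.gradient_checkpointing_enable()      # trade recompute for activation memory at 32k context
model.enable_input_require_grads()         # required for checkpointing with a frozen base model

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],   # assumption: adapt attention projections only
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()         # sanity-check that only adapters are trainable
```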

What do you mean by training with quantization? Something like QLoRA + FSDP? I tried a 32k context using 8x A100, but I still get OOM for the 70B model.
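To be concrete about the "quantization" part: here's a sketch of what QLoRA-style 4-bit loading looks like with transformers + bitsandbytes. The model id is a placeholder, and the FSDP wrapping itself is launcher-specific, so it's only noted in a comment:

```python
# QLoRA-style 4-bit base model loading sketch (assumptions: transformers +
# bitsandbytes installed; model id is a placeholder).
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # NF4 4-bit base weights (QLoRA)
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # dequantized compute in bf16
    bnb_4bit_use_double_quant=True,         # quantize the quantization constants too
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-70b-hf",            # placeholder model id
    quantization_config=bnb_config,
)
# Sharding the quantized base + LoRA adapters across GPUs with FSDP would then
# be configured via the accelerate/Trainer launcher; that part is omitted here.
```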

@hiyouga can --use_unsloth_gc work in all situations, including qlora+fsdp, ds_zero3, and ds_zero3_cpu_offload?
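For anyone unfamiliar with the last one: ds_zero3_cpu_offload refers to a DeepSpeed ZeRO stage-3 config with optimizer state and parameters offloaded to host RAM. Below is a generic sketch of that standard DeepSpeed config schema written out as a Python dict; it is not a file from this repo:

```python
# Generic DeepSpeed ZeRO-3 + CPU-offload config sketch (standard DeepSpeed
# schema; not taken from this repository).
import json

ds_zero3_cpu_offload = {
    "bf16": {"enabled": True},
    "zero_optimization": {
        "stage": 3,                               # fully shard params, grads, optimizer state
        "offload_optimizer": {"device": "cpu"},   # push optimizer state to host RAM
        "offload_param": {"device": "cpu"},       # push parameters to host RAM
        "overlap_comm": True,                     # overlap communication with compute
    },
    "gradient_accumulation_steps": "auto",        # let the trainer fill these in
    "train_micro_batch_size_per_gpu": "auto",
}

with open("ds_zero3_cpu_offload.json", "w") as f:
    json.dump(ds_zero3_cpu_offload, f, indent=2)
```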

@sfc-gh-zhwang is there a timeline for when this feature can be merged?

Hi, may I ask when this PR can be merged, so we can use LoRA + chunked prefill as soon as possible?
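Here's a sketch of how I'd expect to combine the two in vLLM once the PR lands. The base model id and adapter path are placeholders, and the exact behavior may differ in the merged version:

```python
# Sketch of LoRA serving with chunked prefill in vLLM (assumptions: the PR is
# merged; model id and adapter path are placeholders).
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

llm = LLM(
    model="mistralai/Mistral-7B-v0.1",       # placeholder base model
    enable_lora=True,                        # allow per-request LoRA adapters
    enable_chunked_prefill=True,             # split long prefills into schedulable chunks
)
outputs = llm.generate(
    ["Summarize this document: ..."],
    SamplingParams(max_tokens=128),
    lora_request=LoRARequest("my_adapter", 1, "/path/to/adapter"),  # placeholder adapter
)
print(outputs[0].outputs[0].text)
```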

What's the current status of Helm chart support? Looking forward to it.