mahaocong90

3 comments by mahaocong90

Hi @quinnrong94, I have a question: can FlashMLA be used with MTP 3-1-4? I tested on an H20 (141 GB), and the benchmark reported a memory leak at the end: Scheduler hit...

> > Hi @quinnrong94, I have a question: can FlashMLA be used with MTP 3-1-4? I tested on an H20 (141 GB), and the benchmark reported a memory leak at the end:...

Same question. Also, if I reduce --max_length from 256 to 128 and --batch_size from 24 to 12, will that reduce memory consumption during fine-tuning?