Charles Yang

9 comments by Charles Yang

@pfxuan Here is another dataset, truck. After training for 15,000 iterations the final loss is still >0.4, so it does not seem to have converged. Setup: Radeon 7900 XTX + ROCm 6.3.3 + Ubuntu 22.04 + torch 2.1.2. [truck15k.zip](https://github.com/user-attachments/files/19718681/truck15k.zip) [15ktruck.txt](https://github.com/user-attachments/files/19718683/15ktruck.txt) ![Image](https://github.com/user-attachments/assets/eb7dfefd-8f12-4fcd-b6a0-a521b7c81394)

Same question. Qwen2 is relatively outdated; has Qwen3 been verified? @Vincentwei1021 @wangxiongts @linhaojia13 @lxysl @BradyFU

@jcaesar @eokeeffe @pierotofy @pfxuan could you give some advice?

@pfxuan @eokeeffe @pierotofy I ran systematic experiments. The issue is caused by Ubuntu 24.04. With the same hardware and the same versions of ROCm and PyTorch, it works on Ubuntu 22.04 but fails on Ubuntu...

@pfxuan @eokeeffe @pierotofy Even when running an Ubuntu 22.04 container on an Ubuntu 24.04 host, it does not work.

Apologies for any disturbance. I just wanted to draw attention to this issue, which is a showstopper on AMD hardware. Take it easy, and thanks for the reminder.

Also looking for an AMD solution.