Constantin Dumitrascu

50 comments by Constantin Dumitrascu

@prakamya-mishra - MI250 about 6k tokens per device per second, A100 18k tokens per device per second. These are on 16 nodes for MI250 and 8 nodes for A100. We...

I apologize for our delay in response. In order to help surface current, unresolved issues, we are closing tickets prior to February 29. Please reopen your ticket if you are...

It looks like a fix has been merged. Please reopen if still relevant.

I apologize for our delay in response. In order to help surface current, unresolved issues, we are closing tickets prior to February 29. Please reopen your ticket if you are...

I apologize for our delay in response. In order to help surface current, unresolved issues, we are closing tickets prior to February 29. Please reopen your ticket if you are...

I apologize for our delay in response. In order to help surface current, unresolved issues, we are closing tickets prior to February 29. Please reopen your ticket if you are...

@HuXinjing - correct, same models, except for the hardware they're trained on: Twin is trained on LUMI (AMD) while the non-twin is on Mosaic (NVIDIA). Please reopen this if you...

I'm closing this since the fix for it has been merged. Please reopen if it is still relevant.

@Xuekai-Zhu, what is the value of `max_duration` in the config that you're using? If you want it to be more than 1 epoch, say 2 epochs, the config should...

@Xuekai-Zhu - agreed, this is a bug. Thank you for reporting it.