Constantin Dumitrascu comments

Results: 50 comments of Constantin Dumitrascu

Request for hardware requirements and training cost, etc

@prakamya-mishra - MI250: about 6k tokens per device per second; A100: 18k tokens per device per second. These are on 16 nodes for MI250 and 8 nodes for A100. We...

ConnectionRefusedError: [Errno 111] Connection refused

I apologize for our delay in response. In order to help surface current, unresolved issues, we are closing tickets prior to February 29. Please reopen your ticket if you are...

Does not work with Python 3.8

It looks like a fix has been merged. Please reopen if still relevant.

NotImplementedError: file size not implemented for 'https' files

I apologize for our delay in response. In order to help surface current, unresolved issues, we are closing tickets prior to February 29. Please reopen your ticket if you are...

Why does config/llama7.yaml not use OlmoLlamaBlock?

I apologize for our delay in response. In order to help surface current, unresolved issues, we are closing tickets prior to February 29. Please reopen your ticket if you are...

Why does training not stop after max_duration steps?

I apologize for our delay in response. In order to help surface current, unresolved issues, we are closing tickets prior to February 29. Please reopen your ticket if you are...

I'm interested in OLMo-twin, but I found no more information except its name.

@HuXinjing - correct, same models, except for the hardware they're trained on: Twin is trained on LUMI (AMD) while the non-twin is on Mosaic (NVIDIA). Please reopen this if you...

Shape mismatch error when resizing token embeddings in OLMo modeling code

I'm closing this since the fix for it has been merged. Please reopen if still relevant.

Break at 1 epoch "Training epoch complete", can't pretraining beyond 1 epoch ?

@Xuekai-Zhu, what is the value of `max_duration` in the config that you're using? If you want it to be more than 1 epoch, say 2 epochs, the config should...

Break at 1 epoch "Training epoch complete", can't pretraining beyond 1 epoch ?

@Xuekai-Zhu - agreed, this is a bug. Thank you for reporting it.