Dan Ofer

56 comments by Dan Ofer

I can help with this. It'd be best to run this with a testing framework, to allow for the tests or CI to check if changes to the models/code (e.g....

Is the generator/RTD supported with Hugging Face? The model pages there don't mention it (the sample code only has MLM), and HF pipelines lack an RTD task.

A bidirectional pretraining setup (BERT/MLM or ELECTRA/RTD) for a Mamba model would be amazing!

+1 - An example of usage with the control tags would be very helpful

That's really confusing. (And I do NOT understand why they'd remove the control tags :( )

> What is the purpose of fine-tuning an already-trained model?

Additional epochs over the data = better results, so long as you stop before overfitting starts.
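To make "stop before overfitting starts" concrete, here is a minimal sketch of patience-based early stopping. It is not tied to any particular library: the function name `early_stopping_epoch`, the `patience` parameter, and the loss values are all hypothetical, and it assumes you already record one validation loss per epoch.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return (best_epoch, best_loss) for a list of per-epoch validation losses.

    Scanning stops once the loss has failed to improve for `patience`
    consecutive epochs, which is a common proxy for the onset of overfitting.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0

    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # New best validation loss: remember it and reset the counter.
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # No improvement for `patience` epochs in a row.

    return best_epoch, best_loss


# Hypothetical validation losses: they improve for three epochs, then worsen.
losses = [0.90, 0.70, 0.60, 0.65, 0.72, 0.80]
print(early_stopping_epoch(losses, patience=2))  # (2, 0.6)
```

In a real fine-tuning run you would keep the checkpoint from the best epoch rather than the last one, since the epochs after it are the ones that started to overfit.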

+1 - is there a way to finetune? FastText supports this.

I can confirm the same issue on WSL (but the gbb/build failure is with different packages)

+1 on this point. As a more high-level point, I haven't found any integrated examples of using pretrained LLMs to extract text embeddings from data for use as node initializations. I...