Dan Ofer
This is a really useful resource!
I can help with this. It'd be best to run this with a testing framework, to allow for the tests or CI to check if changes to the models/code (e.g....
Is the generator/RTD supported in Hugging Face? The model pages there don't mention it (the sample code only covers MLM), and HF pipelines lack an RTD task.
A bidirectional pretraining setup (BERT/MLM or ELECTRA/RTD) with Mamba would be amazing!
+1 - An example of usage with the control tags would be very helpful
That's really confusing. (And I do NOT understand why they'd remove the control tags :( )
> What is the purpose of fine-tuning an already-trained model?

Additional epochs over the data = better results, so long as you stop before overfitting starts.
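To make the "stop before overfitting starts" part concrete, here is a minimal sketch of patience-based early stopping. It assumes you already log a validation loss per epoch. The function name `pick_stopping_epoch` and the `patience` parameter are made up for illustration and don't come from any particular library.

```python
def pick_stopping_epoch(val_losses, patience=2):
    """Return (best_epoch, best_loss) under patience-based early stopping.

    val_losses: per-epoch validation losses (hypothetical input, e.g. logged
    during fine-tuning). Stops once the loss has failed to improve for
    `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    stale_epochs = 0  # epochs since the last improvement

    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # Validation loss improved: remember this epoch, reset the counter.
            best_loss, best_epoch, stale_epochs = loss, epoch, 0
        else:
            # No improvement: possibly the start of overfitting.
            stale_epochs += 1
            if stale_epochs >= patience:
                break

    return best_epoch, best_loss


# Example: loss improves for three epochs, then starts climbing.
best_epoch, best_loss = pick_stopping_epoch([0.9, 0.7, 0.6, 0.65, 0.7, 0.5])
print(best_epoch, best_loss)
```

In practice you'd keep the checkpoint from `best_epoch` rather than the last one trained. Note that a small `patience` can stop too early if the validation loss is noisy (in the example, the later 0.5 is never reached).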
+1 - is there a way to finetune? FastText supports this.
I can confirm the same issue on WSL (but the gbb/build failure is with different packages).
+1 on this point. As a more high level point, I haven't found any integrated examples of using pretrained LLMs to extract text embeddings on data as node initializations. I...