comments of
Hi Piotr, thanks for your interest in our work. We are working on releasing the model pretrained on DNA fragments. Thanks!
I think it should be possible to train it like RoBERTa. If NSP is set to false, `next_sentence_labels` is just a dummy field. You can just fill it with all 0 or...
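As a rough illustration of the dummy-label idea, here is a minimal sketch in plain Python. Only the `next_sentence_labels` name comes from the comment above. The helper `add_dummy_nsp_labels`, the other feature names, the token ids, and the one-label-per-example shape are all assumptions made up for the example, not the repository's actual input pipeline.

```python
from typing import Dict, List


def add_dummy_nsp_labels(
    example: Dict[str, List[int]], fill_value: int = 0
) -> Dict[str, List[int]]:
    """Return a copy of `example` with a placeholder `next_sentence_labels` field.

    Hypothetical helper: with NSP disabled the label is never used in the loss,
    so a constant value is enough to satisfy the expected input structure.
    """
    out = dict(example)  # shallow copy so the caller's dict is left untouched
    out["next_sentence_labels"] = [fill_value]  # assumed shape: one label per example
    return out


# Made-up masked-LM example; the feature names and ids are illustrative only.
example = {
    "input_ids": [101, 7, 8, 9, 102],
    "masked_lm_positions": [2],
    "masked_lm_ids": [8],
}
with_dummy = add_dummy_nsp_labels(example)
```

The same constant fill could be applied inside whatever preprocessing function builds each training example, so the rest of the training code does not need to change.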
We used tfrecords or TFDS for efficiency, as reading a large corpus of text files is not as efficient when training at large scale. I think it should be easy to modify...