Michael Heinzinger

Results: 20 comments by Michael Heinzinger

Hey @Xinxinatg, thanks a lot for your interest in our work! :) Before we started to write up the word2vec-based models, we had already started working on SeqVec and dropped...

Hm, there are a bunch of fine-tuning tutorials for BERT in huggingface that should work for you: https://huggingface.co/course/chapter7/3?fw=tf What you probably need to do: split your sequence into our notion...
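As a rough illustration of what such a fine-tuning run could look like, here is a minimal sketch of masked-LM fine-tuning of ProtBert with Hugging Face transformers. The dataset, sequence, and hyperparameters are placeholders, not values from the original comment; the space-separated amino acids and the U/Z/O/B-to-X mapping follow the usual ProtTrans preprocessing.

```python
# Minimal sketch: masked-LM fine-tuning of ProtBert with huggingface transformers.
# Sequences, paths and hyperparameters below are placeholders, not recommendations.
import re
from transformers import (BertForMaskedLM, BertTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

tokenizer = BertTokenizer.from_pretrained("Rostlab/prot_bert", do_lower_case=False)
model = BertForMaskedLM.from_pretrained("Rostlab/prot_bert")

def preprocess(seq: str) -> str:
    # ProtBert expects amino acids separated by spaces; rare residues are mapped to X.
    return " ".join(re.sub(r"[UZOB]", "X", seq))

sequences = ["MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"]  # toy example
encodings = tokenizer([preprocess(s) for s in sequences],
                      truncation=True, max_length=512)

class SeqDataset:
    """Wraps the tokenized sequences so the Trainer can index them."""
    def __init__(self, enc): self.enc = enc
    def __len__(self): return len(self.enc["input_ids"])
    def __getitem__(self, i): return {k: v[i] for k, v in self.enc.items()}

collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)
args = TrainingArguments(output_dir="protbert-finetuned", num_train_epochs=1,
                         per_device_train_batch_size=2)
trainer = Trainer(model=model, args=args,
                  train_dataset=SeqDataset(encodings), data_collator=collator)
trainer.train()
```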

1. Yup, that's a general huggingface tutorial, but you can use it with the minor modifications I mentioned above without any problems.
2. I am not aware of many...

Those are attention layers. We just used the normal BERT architecture and did not modify it at all. All we did was set hyperparameters (such as the number of layers, etc.). Beyond that...
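To make the "hyperparameters only" point concrete, here is a minimal sketch of configuring a plain BERT architecture via `BertConfig`; the values are illustrative (roughly in line with the published ProtBert setup) rather than an exact reproduction of the training configuration.

```python
# Sketch: a standard, unmodified BERT architecture; only hyperparameters are chosen.
from transformers import BertConfig, BertForMaskedLM

config = BertConfig(
    vocab_size=30,            # amino-acid vocabulary plus special tokens (assumption)
    hidden_size=1024,
    num_hidden_layers=30,     # number of transformer (attention) layers
    num_attention_heads=16,
    intermediate_size=4096,
)
model = BertForMaskedLM(config)  # plain BERT, nothing architectural changed
```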

Sorry, I cannot provide any more details beyond redirecting you to existing/published notebooks/tutorials that show how to fine-tune Prot(BERT). Nevertheless, good luck with your project! - I am...

Interesting, thanks for sharing! - I will try to read it in more detail later (though I cannot guarantee it, as I am busy wrapping up some other things before going...

You are right: we did not train the [CLS] token for any specific task due to the lack of a "next-sentence" notion in protein sequences. However, it is possible that...
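For anyone who wants to probe how informative the untrained [CLS] position actually is, here is a minimal sketch (assuming the Rostlab/prot_bert checkpoint and a PyTorch backend) that extracts the [CLS] embedding and a mean-pooled per-residue embedding for comparison.

```python
# Sketch: compare the [CLS] embedding with a mean-pooled sequence embedding from ProtBert.
import re
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("Rostlab/prot_bert", do_lower_case=False)
model = BertModel.from_pretrained("Rostlab/prot_bert").eval()

seq = " ".join(re.sub(r"[UZOB]", "X", "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"))
inputs = tokenizer(seq, return_tensors="pt")

with torch.no_grad():
    hidden = model(**inputs).last_hidden_state      # shape: (1, seq_len, 1024)

cls_embedding = hidden[:, 0, :]                     # the [CLS] position
mean_embedding = hidden[:, 1:-1, :].mean(dim=1)     # mean over residues, excluding special tokens
```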

There is no 1:1 equivalent in ProtT5; however, there is also a special token appended to the very end of ProtT5 embeddings. So you could also check the information content...
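A minimal sketch of how one could grab that final special-token embedding from the ProtT5 encoder (assuming the Rostlab/prot_t5_xl_uniref50 checkpoint and PyTorch; the sequence is just a toy example):

```python
# Sketch: extract the embedding of the special token ProtT5 appends at the end of a sequence.
import re
import torch
from transformers import T5EncoderModel, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("Rostlab/prot_t5_xl_uniref50", do_lower_case=False)
model = T5EncoderModel.from_pretrained("Rostlab/prot_t5_xl_uniref50").eval()

seq = " ".join(re.sub(r"[UZOB]", "X", "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"))
inputs = tokenizer(seq, return_tensors="pt")         # tokenizer appends </s> as final token

with torch.no_grad():
    hidden = model(**inputs).last_hidden_state       # shape: (1, n_residues + 1, 1024)

eos_embedding = hidden[:, -1, :]                     # the appended special token
residue_embeddings = hidden[:, :-1, :]               # per-residue embeddings
```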

Oh wow, that is some good news! - Thanks for sharing; I think this could become useful for many other users as well :) I have to admit that it's...

We never tried it, but given the multi-task capability of T5 in NLP, I would assume that it should also work in our field. I could imagine that the risk...
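For reference, this is what the multi-task capability looks like in NLP: different tasks share one T5 model and are distinguished only by a text prefix. The sketch below uses the generic "t5-small" checkpoint purely for illustration; whether and how this transfers to protein tasks (and to the ProtT5 vocabulary) is, as said above, untested and an assumption.

```python
# Sketch: T5-style multi-task training in NLP; one model, tasks marked only by a prefix.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Two different tasks in one batch, distinguished only by their prefixes.
inputs = tokenizer(
    ["translate English to German: The house is wonderful.",
     "summarize: studies have shown that owning a dog is good for you"],
    return_tensors="pt", padding=True)
labels = tokenizer(
    ["Das Haus ist wunderbar.",
     "owning a dog is good for you"],
    return_tensors="pt", padding=True).input_ids
labels[labels == tokenizer.pad_token_id] = -100      # ignore padding in the loss

loss = model(**inputs, labels=labels).loss           # single loss over the mixed-task batch
```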