Error in the ESM protein model example
I found an error in your EsmModel example code, shown below. The tokenized text should be an amino acid sequence, not a natural-language sentence.
Besides, can you give an example with MSA? You can send it to [email protected]. I cannot access GitHub very easily.
https://huggingface.co/docs/transformers/v4.44.2/en/model_doc/esm#transformers.EsmForMaskedLM

    from transformers import AutoTokenizer, EsmForMaskedLM
    import torch

    tokenizer = AutoTokenizer.from_pretrained("facebook/esm2_t6_8M_UR50D")
    model = EsmForMaskedLM.from_pretrained("facebook/esm2_t6_8M_UR50D")

    inputs = tokenizer("The capital of France is <mask>.", return_tensors="pt")

    with torch.no_grad():
        logits = model(**inputs).logits

    # retrieve index of <mask>
    mask_token_index = (inputs.input_ids == tokenizer.mask_token_id)[0].nonzero(as_tuple=True)[0]

    predicted_token_id = logits[0, mask_token_index].argmax(axis=-1)

    labels = tokenizer("The capital of France is Paris.", return_tensors="pt")["input_ids"]
    # mask labels of non-<mask> tokens
    labels = torch.where(inputs.input_ids == tokenizer.mask_token_id, labels, -100)

    outputs = model(**inputs, labels=labels)

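
For comparison, here is a rough sketch of what I would expect the example to look like with a protein input. It is not the official example. The sequence `MKTAYIAKQR`, the helper names `mask_residue` and `predict_masked_residue`, and the restriction to the 20 standard amino acid letters are my own assumptions for illustration; the checkpoint name and the `transformers` calls are the same ones used in the documentation snippet above.

```python
# Assumption: only the 20 standard one-letter amino acid codes are accepted here.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def mask_residue(sequence, position, mask_token="<mask>"):
    """Return `sequence` with the residue at `position` replaced by the mask token.

    Hypothetical helper: it rejects anything that is not a plain amino acid
    string, such as a natural-language sentence.
    """
    sequence = sequence.strip().upper()
    invalid = sorted(set(sequence) - AMINO_ACIDS)
    if invalid:
        raise ValueError(f"not an amino acid sequence, unexpected characters: {invalid}")
    if not 0 <= position < len(sequence):
        raise IndexError("position is outside the sequence")
    return sequence[:position] + mask_token + sequence[position + 1:]


def predict_masked_residue(masked_sequence, checkpoint="facebook/esm2_t6_8M_UR50D"):
    """Predict the token at each <mask> position of a protein sequence.

    Hypothetical helper. torch and transformers are imported here so that
    mask_residue stays usable without them; the checkpoint is downloaded
    on first use.
    """
    import torch
    from transformers import AutoTokenizer, EsmForMaskedLM

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = EsmForMaskedLM.from_pretrained(checkpoint)

    inputs = tokenizer(masked_sequence, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits

    # Locate the <mask> positions and take the highest-scoring token at each one.
    mask_index = (inputs.input_ids == tokenizer.mask_token_id)[0].nonzero(as_tuple=True)[0]
    predicted_ids = logits[0, mask_index].argmax(dim=-1)
    return tokenizer.decode(predicted_ids)
```

Calling `predict_masked_residue(mask_residue("MKTAYIAKQR", 3))` would mask the fourth residue and ask the model to fill it in. Passing the sentence from the documentation to `mask_residue` raises a `ValueError` instead, which is the kind of input I think the example should avoid.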
This issue is stale because it has been open for 30 days with no activity.
This issue was closed because it has been inactive for 20 days since being marked as stale.