
What is the mask token in AraT5-base?

Open HMJW opened this issue 3 years ago • 2 comments

I can't find any token like <extra_id> or <mask> in the vocab. What is the mask token in AraT5-base, and how do I get the mask ID using the Hugging Face API?

HMJW avatar Oct 18 '22 10:10 HMJW

Same question here, please.

NoraAlt avatar Feb 15 '23 07:02 NoraAlt

@Nagoudi @elmadany Could you please advise on this? I need to use the AraT5 model in the same way as the code snippet below (which uses t5-small), but the model does not behave as expected.

from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

input_ids = tokenizer("The <extra_id_0> walks in <extra_id_1> park", return_tensors="pt").input_ids
labels = tokenizer("<extra_id_0> cute dog <extra_id_1> the <extra_id_2>", return_tensors="pt").input_ids

# the forward function automatically creates the correct decoder_input_ids
loss = model(input_ids=input_ids, labels=labels).loss
loss.item()

Am I missing anything?

Thanks 🙏🏽

AMR-KELEG avatar May 17 '23 16:05 AMR-KELEG
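
One way to check whether a tokenizer exposes T5-style sentinel tokens at all is to scan its token-to-id mapping for entries shaped like `<extra_id_N>`. The sketch below is only an illustration: `find_sentinel_tokens` and `toy_vocab` are hypothetical names, the ids are made up, and nothing here says what the AraT5-base vocabulary actually contains.

```python
import re

# Matches T5-style sentinel tokens such as <extra_id_0>.
SENTINEL_RE = re.compile(r"^<extra_id_(\d+)>$")


def find_sentinel_tokens(vocab):
    """Return {token: id} for T5-style sentinels, ordered by sentinel index.

    `vocab` is any token -> id mapping (hypothetical helper, not part of
    the transformers library).
    """
    hits = []
    for token, token_id in vocab.items():
        match = SENTINEL_RE.match(token)
        if match:
            hits.append((int(match.group(1)), token, token_id))
    return {token: token_id for _, token, token_id in sorted(hits)}


# Toy vocabulary standing in for a real tokenizer's vocab; ids are invented.
toy_vocab = {
    "<pad>": 0,
    "</s>": 1,
    "<unk>": 2,
    "▁the": 3,
    "<extra_id_1>": 98,
    "<extra_id_0>": 99,
}

sentinels = find_sentinel_tokens(toy_vocab)
print(sentinels)
```

With a loaded Hugging Face tokenizer, the same helper could be called as `find_sentinel_tokens(tokenizer.get_vocab())`. An empty result would suggest that the checkpoint's tokenizer has no `<extra_id_N>` entries, in which case the `<extra_id_0>` strings in the t5-small snippet above would probably not be mapped to dedicated sentinel ids.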