Otter
Otter copied to clipboard
how to generate labels for FUYU
In models / fuyu processing_fuyu.py , the method get labels , what is the purpose of special_token_id and how do I get it.
For example my input ids look like this.
"
The special_token_id is from Fuyu's design, it's a \x04
that use to separate Questions and Answers.
(if I remember correctly) Fuyu's template is: "{question}\n\x04{answer}\x04".
Our template is "User:{question} Assistant:\x04{answer}\x04".
We also use it to locate the answer's position since we need to mask the {answer} during training.
The code is here~
# src/otter_ai/models/fuyu/processing_fuyu.py
def get_labels(self, input_ids, special_token_id, masking_number=-100):
# Initialize labels tensor filled with masking_number
labels = torch.full_like(input_ids, masking_number)
# Iterate through each sequence in the batch
for i in range(input_ids.shape[0]):
seq = input_ids[i]
# Find the indices of the special_token_id
indices = (seq == special_token_id).nonzero(as_tuple=False).squeeze()
# Pair the indices and unmask the tokens between each pairt
paired_indices = indices.reshape(-1, 2)
for start, end in paired_indices:
labels[i, start + 1 : end + 1] = seq[start + 1 : end + 1]
return labels