Bidirectional truncation in Llama4.
Really simple: just wire through an argument that is already supported elsewhere in the codebase.
Would you like to open this up to community contributions?
Sure!
Can you provide a few more details on how this could be implemented? Please include some acceptance criteria and code pointers.
@joecummings Sure.
Task
We have a special argument `truncation_type` which is passed into `truncate`:
https://github.com/pytorch/torchtune/blob/e5ee1b2fcd25a411a4d0889849c1528189d56616/torchtune/models/llama3/_tokenizer.py#L341
Unfortunately, it is not supported by the llama4 tokenizer!
The task is quite simple: add support for `truncation_type`, similar to how it is done in other models.
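To make the intent concrete, here is a minimal, self-contained sketch of what a direction-aware truncate helper looks like and how a tokenizer could thread the argument through. The exact names and signatures below are illustrative assumptions, not torchtune's actual API; the real implementation should reuse the existing `truncate` utility and match the llama3 tokenizer's behavior.

```python
# Hypothetical sketch of bidirectional truncation (not torchtune's real API).
from typing import List, Optional


def truncate(
    tokens: List[int],
    max_seq_len: int,
    eos_id: Optional[int] = None,
    truncation_type: str = "right",
) -> List[int]:
    """Truncate tokens to max_seq_len, dropping from the left or the right."""
    if truncation_type == "right":
        out = tokens[:max_seq_len]
    elif truncation_type == "left":
        out = tokens[-max_seq_len:]
    else:
        raise ValueError(
            f"truncation_type must be 'left' or 'right', got {truncation_type!r}"
        )
    # If an EOS id is provided, ensure the truncated sequence still ends with it.
    if eos_id is not None and out and out[-1] != eos_id:
        out[-1] = eos_id
    return out
```

The llama4 tokenizer change would then amount to accepting a `truncation_type` parameter in its tokenization method and forwarding it to this helper, instead of hard-coding right truncation.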
Acceptance criteria
Tests pass, plus a few sanity checks to verify correctness.
I can take this!
@adheep04 It's yours, go ahead and open a PR!