Mehant Kammakomati
# What does this PR do?

Fixes #1848

## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if...
At this point, trl returns the dataset as is if the provided dataset shows signs of already being tokenized: https://github.com/huggingface/trl/blob/98ad01ddfd1e1b67ec018014b83cba40e0caea66/trl/trainer/sft_trainer.py#L503

Additionally, I see that the ConstantLengthDataset https://github.com/huggingface/trl/blob/98ad01ddfd1e1b67ec018014b83cba40e0caea66/trl/trainer/utils.py#L426 has been written only...
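The pass-through behavior described above can be illustrated with a minimal sketch. It is not trl's code: the helper names (`looks_tokenized`, `prepare_dataset`, `toy_tokenize`) are hypothetical, the dataset is modeled as a plain dict of columns, and treating an `input_ids` column as the sign of prior tokenization is an assumption made for illustration.

```python
from typing import Callable, Dict, List

# Stand-in for a real dataset object: column name -> list of column values.
Columns = Dict[str, List]


def looks_tokenized(dataset: Columns) -> bool:
    """Assumption: an `input_ids` column signals an already tokenized dataset."""
    return "input_ids" in dataset


def prepare_dataset(dataset: Columns, tokenize: Callable[[Columns], Columns]) -> Columns:
    """Return the dataset unchanged if it looks tokenized, else tokenize it."""
    if looks_tokenized(dataset):
        # Pass-through: no tokenization or other processing is applied.
        return dataset
    return tokenize(dataset)


def toy_tokenize(dataset: Columns) -> Columns:
    """Toy tokenizer for the example: one id per whitespace-separated word (its length)."""
    return {"input_ids": [[len(word) for word in text.split()] for text in dataset["text"]]}


raw = {"text": ["hello world", "a bc def"]}
pretokenized = {"input_ids": [[1, 2]]}

tokenized = prepare_dataset(raw, toy_tokenize)            # goes through toy_tokenize
passed_through = prepare_dataset(pretokenized, toy_tokenize)  # returned as is
```

In this sketch the pre-tokenized input comes back as the very same object, so any later processing step would have to handle it separately.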
# What does this PR do?

1. Add an `apply_tensor_parallel` API to apply a TP plan to Llama and Granite models
2. Introduce a `tp_size` user-facing argument to be further consumed by...
# What does this PR do?

1. Implements `TorchTensorParallelPlugin` to support TP with PyTorch 2.0. This work should be seen along with the PR https://github.com/huggingface/transformers/pull/34194.
2. Modifies the dataloader to support...