Shanmugam Ramasamy
# What does this PR do ? Gives the user the ability to specify separate train, test, and validation datasets as a dictionary in data_prefix for the GPT model. **Collection**: nlp/language_modelling #...
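A minimal sketch of what a dictionary-valued data_prefix could look like, assuming the split keys are named `train`, `validation`, and `test` and that a plain list is still accepted as the older shared form. The helper `split_data_prefix` and the file paths are hypothetical, made up for illustration, and are not taken from the PR itself.

```python
from typing import Dict, List, Union

# Assumed split names; the PR summary only says "train, test and validation".
SPLITS = ("train", "validation", "test")


def split_data_prefix(
    data_prefix: Union[Dict[str, List[str]], List[str]]
) -> Dict[str, List[str]]:
    """Return one list of dataset prefixes per split (hypothetical helper)."""
    if isinstance(data_prefix, dict):
        # Dictionary form: every split must be listed explicitly.
        missing = [s for s in SPLITS if s not in data_prefix]
        if missing:
            raise ValueError(f"data_prefix is missing splits: {missing}")
        return {s: list(data_prefix[s]) for s in SPLITS}
    # List form: the same prefixes are shared by every split.
    return {s: list(data_prefix) for s in SPLITS}


# Illustrative config; the paths are placeholders.
config = {
    "data_prefix": {
        "train": ["data/train_text_document"],
        "validation": ["data/val_text_document"],
        "test": ["data/test_text_document"],
    }
}
per_split = split_data_prefix(config["data_prefix"])
```

The dictionary form makes it explicit which files back each split, instead of deriving the splits from a single pool of data.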
Remove some duplicate code.
# What does this PR do ? Supports generating through Megatron Core. **Collection**: NLP # Changelog - Added deprecation notice to old generate. - Added support for generation using mcore....
# What does this PR do ? 1. Fixes skip_prompt_log_probs so that it works. 2. Makes the log probs calculation a little more efficient if materialize last token logits is...
# What does this PR do ? Fixes some bugs in static inference. The bugs were: 1. If the backend is Megatron, the policy_generation is...