PaLM-rlhf-pytorch
implement an argument to directly set ff_inner_dim
NVIDIA's nvidia/GPT-2B-001 implements a very PaLM-like model.
However, instead of using an FFN multiplier like ffn_mult, it sets ffn_hidden_size (comparable to ffn_inner_dim in this codebase) directly to 5440.
This corresponds to an ffn_mult of 2.65625. However, passing that value in this codebase does not work.
The error:
TypeError: empty() received an invalid combination of arguments - got (tuple, dtype=NoneType, device=NoneType), but expected one of:
* (tuple of ints size, *, tuple of names names, torch.memory_format memory_format, torch.dtype dtype, torch.layout layout, torch.device device, bool pin_memory, bool requires_grad)
* (tuple of ints size, *, torch.memory_format memory_format, Tensor out, torch.dtype dtype, torch.layout layout, torch.device device, bool pin_memory, bool requires_grad)
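A likely cause, sketched below, is that multiplying an integer model dimension by a fractional multiplier yields a Python float, and a float cannot be used as a layer size. This is consistent with the error message, where the size tuple is rejected. The hidden size of 2048 is an assumption inferred from 5440 / 2.65625, and the `dim * ff_mult` expression is an assumption about how this codebase derives the inner dimension.

```python
# Assumption: hidden size 2048, inferred from 5440 / 2.65625.
dim = 2048
ff_mult = 2.65625

# Assumed form of the inner-dimension computation; int * float yields a float.
ff_inner_dim = dim * ff_mult

print(ff_inner_dim)        # 5440.0
print(type(ff_inner_dim))  # <class 'float'>

# A layer size must be an int, so an explicit cast is one possible workaround.
ff_inner_dim_int = int(ff_inner_dim)
print(ff_inner_dim_int)    # 5440
```

Casting works here only because the product is a whole number; for other multipliers it would silently truncate.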
So I implemented a way to set ffn_inner_dim directly.
Please take a look!
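The idea can be sketched roughly as follows. This is not the PR's actual diff: `resolve_ff_inner_dim` is a hypothetical helper, and the default multiplier of 4 is an assumption. An explicit inner dimension takes priority; otherwise the value is derived from the multiplier and checked to be a whole number.

```python
from typing import Optional


def resolve_ff_inner_dim(
    dim: int,
    ff_mult: float = 4,
    ff_inner_dim: Optional[int] = None,
) -> int:
    """Return the feed-forward inner dimension as an int (hypothetical helper).

    If ff_inner_dim is given, it overrides the multiplier. Otherwise the
    dimension is dim * ff_mult, which must come out as a whole number.
    """
    if ff_inner_dim is not None:
        return int(ff_inner_dim)

    inner = dim * ff_mult
    if inner != int(inner):
        raise ValueError(
            f"dim * ff_mult = {inner} is not an integer; pass ff_inner_dim instead"
        )
    return int(inner)


# Default behaviour: derived from the multiplier.
print(resolve_ff_inner_dim(2048))                     # 8192
# Directly setting the inner dimension, as with ffn_hidden_size = 5440.
print(resolve_ff_inner_dim(2048, ff_inner_dim=5440))  # 5440
```

Raising on a non-integer product, rather than truncating, keeps a mistyped multiplier from silently changing the model size.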
Also, the formatter I use changes the layout a lot, so I had to modify the code manually. Which formatter does this repo use?
@CodiumAI-Agent /review
PR Analysis
- Main theme: Adding an argument to directly set ff_inner_dim
- Type of PR: Enhancement
- Relevant tests added: No
- Focused PR: Yes, the PR is focused as it has a clear and coherent title and description, and all PR code diff changes are properly derived from the title and description.
- Security concerns: No, the changes made in this PR do not introduce any obvious security concerns.
PR Feedback
- General PR suggestions: The PR is generally well-written and the changes are clear. However, it would be beneficial to include tests to ensure the new functionality works as expected. Additionally, it would be helpful to update the function's docstring to include the new parameter.
How to use
Tag me in a comment '@CodiumAI-Agent' and add one of the following commands:
- /review - Request a review of the latest update to the PR.
- /describe - Modify the PR title and description based on the contents of the PR.
- /improve - Suggest improvements to the code in the PR. These will be provided as pull request comments, ready to commit.
- /ask <QUESTION> - Pose a question about the PR.