gpt-neox
add merge script
Rebased version of https://github.com/EleutherAI/gpt-neox/pull/466
Tested only on 20B
@Mistobaan in the referenced PR, it was found that the merge reduces performance. Is this still the case in your version?
Not sure. Which benchmarks are we running, and on what hardware?
@Mistobaan https://github.com/EleutherAI/gpt-neox/pull/466#issuecomment-997517986
Oh, I see. You mean the accuracy of the model is not the same when the weights are merged. Yes, this is still the case with this PR.