Phuc Van Phan
vector_norm #11949
I would like to cc @young-geng to review my PR.
I have the same error. Is there any way to fix it?
I think this idea is cool. Is there any research about this?
This makes sense, but when I apply your suggestion, the accuracy goes down on the GSM8K dataset. I have no idea why.