Missing backward method in transformer block

Open finetunej opened this issue 1 year ago • 0 comments

Thank you for the open-source release of the code. I noticed that the transformer block class definition is missing the manually implemented backward function mentioned in the paper. It would be great if this function were added.

A short sample of training code addressing how to best make use of the optimization would also surely be valuable to many people trying to reproduce the results.
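To make the request concrete, here is a rough sketch of the kind of thing I have in mind. It is not the paper's implementation, only a toy illustration that assumes PyTorch's `torch.autograd.Function`; the class name `LinearSiLU`, the tensor shapes, and the SGD training step are all hypothetical. It hand-writes the backward pass for a single linear layer followed by SiLU and saves the linear output for reuse in backward.

```python
import torch


class LinearSiLU(torch.autograd.Function):
    """Hypothetical toy op: y = silu(x @ w.T) with a hand-written backward."""

    @staticmethod
    def forward(ctx, x, w):
        # Operations inside forward are not tracked by autograd, so only the
        # tensors saved explicitly here are kept for the backward pass.
        h = x @ w.t()                      # linear output, shape (batch, out)
        ctx.save_for_backward(x, w, h)
        return h * torch.sigmoid(h)        # SiLU activation

    @staticmethod
    def backward(ctx, grad_out):
        x, w, h = ctx.saved_tensors
        s = torch.sigmoid(h)
        # d/dh [h * sigmoid(h)] = s + h * s * (1 - s)
        grad_h = grad_out * (s + h * s * (1 - s))
        grad_x = grad_h @ w                # same shape as x
        grad_w = grad_h.t() @ x            # same shape as w
        return grad_x, grad_w


# One illustrative training step (the data and optimizer are placeholders).
torch.manual_seed(0)
x = torch.randn(4, 8)
w = torch.randn(16, 8, requires_grad=True)
opt = torch.optim.SGD([w], lr=0.1)

loss = LinearSiLU.apply(x, w).pow(2).mean()
opt.zero_grad()
loss.backward()                            # calls the manual backward above
opt.step()
```

A hand-written backward like this can be checked against numerical gradients with `torch.autograd.gradcheck` on double-precision inputs. An official version for the full transformer block would presumably look quite different, which is why a reference implementation would help.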

For reference, the part of the paper addressing the manually implemented backward function:

finetunej • Feb 27 '23 13:02