open_lm
[WIP] Attention across documents.
This adds a flag that prevents attention from crossing document boundaries, where documents are delimited by the EOT token.
The loss for the token right after the EOT token is ignored.
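For reviewers, here is a minimal NumPy sketch of the intended behavior. It is not the code in this PR: the names `document_causal_mask`, `mask_targets_after_eot`, `EOT_ID`, and `IGNORE_INDEX` are hypothetical, and it assumes that an EOT token belongs to the document it terminates and that ignored targets are marked with a sentinel label.

```python
import numpy as np

EOT_ID = 0            # hypothetical EOT token id, for illustration only
IGNORE_INDEX = -100   # hypothetical sentinel label for targets excluded from the loss


def document_causal_mask(inputs, eot_id=EOT_ID):
    """Return a (seq, seq) boolean mask; True means query i may attend to key j."""
    inputs = np.asarray(inputs)
    is_eot = (inputs == eot_id).astype(np.int64)
    # Document index per position: the number of EOT tokens strictly before it,
    # so each EOT token stays in the document it closes.
    doc_ids = np.cumsum(is_eot) - is_eot
    same_doc = doc_ids[:, None] == doc_ids[None, :]
    causal = np.tril(np.ones((len(inputs), len(inputs)), dtype=bool))
    return same_doc & causal


def mask_targets_after_eot(inputs, targets, eot_id=EOT_ID, ignore_index=IGNORE_INDEX):
    """Mark the target that follows each EOT input so the loss can skip it."""
    inputs = np.asarray(inputs)
    targets = np.array(targets, copy=True)
    targets[inputs == eot_id] = ignore_index
    return targets


tokens = [5, 6, EOT_ID, 7, 8, EOT_ID, 9]
inputs, targets = tokens[:-1], tokens[1:]   # next-token prediction shift
mask = document_causal_mask(inputs)
masked_targets = mask_targets_after_eot(inputs, targets)
```

With this toy sequence, the mask is block lower-triangular (one block per document), and the targets at the two positions whose input is the EOT token are replaced by the sentinel.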
TODO: add some tests for the shape of the mask.