trax
Inconsistency in function's doc-string
Description
The docstring of the function at https://github.com/google/trax/blob/master/trax/layers/attention.py#L330C1-L342C23 is inconsistent with what the function returns.
The function states that it "Returns attention-computed per-head activations and unchanged mask." but returns only the activations.
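To illustrate the kind of fix being proposed, here is a minimal sketch. It is not the Trax code: `toy_attention` and its list-based arguments are hypothetical stand-ins. The summary line of the docstring describes only what is actually returned.

```python
import math


def toy_attention(queries, keys, values, mask):
    """Returns attention-computed activations.

    The mask is consumed here and is NOT part of the return value, so the
    summary line does not promise "and unchanged mask".
    """
    outputs = []
    for q in queries:
        # Dot-product score of this query against every key.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        # Masked-out key positions get a very negative score before softmax.
        scores = [s if keep else -1e9 for s, keep in zip(scores, mask)]
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        # Weighted sum of the value vectors, one output vector per query.
        outputs.append([
            sum(w / total * v[d] for w, v in zip(weights, values))
            for d in range(len(values[0]))
        ])
    return outputs
```

The alternative fix would be to return `(outputs, mask)` so that the code matches the existing docstring. Which of the two is right depends on what callers of the real function expect.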
You are right. Can I raise a PR to fix it?