tsai
TST parameter fixup
The first commit should be trivially correct (docstring only). The second commit fixes a bug, which was my original reason for opening this PR. The third commit reintroduces the original behavior, because I don't think the assertion is necessary. I tried to understand how the PyTorch implementation of multi-head attention handles these dimensions and compared it with the code here.
So the net change of this PR is only docstring changes. See commit messages for further details.
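For reviewers who want the dimension question spelled out, here is a minimal sketch of the two conventions being compared. It is not the code from this PR: the function names are hypothetical, and the second function only assumes that the per-head key and value widths (`d_k`, `d_v`) are independent parameters that default to `d_model // n_heads`. The first function mirrors the check `torch.nn.MultiheadAttention` performs, where the head width is derived from `embed_dim` and must divide it exactly.

```python
def pytorch_style_head_dim(embed_dim: int, num_heads: int) -> int:
    """Head width derived from embed_dim, as torch.nn.MultiheadAttention does."""
    head_dim = embed_dim // num_heads
    # PyTorch requires the heads to tile the embedding exactly.
    assert head_dim * num_heads == embed_dim, "embed_dim must be divisible by num_heads"
    return head_dim


def free_head_dims(d_model: int, n_heads: int, d_k=None, d_v=None) -> dict:
    """Hypothetical sketch: d_k and d_v are free parameters (assumption).

    Returns the (in_features, out_features) shape of each projection.
    The output projection maps n_heads * d_v back to d_model, so
    d_model does not need to be divisible by n_heads.
    """
    d_k = d_model // n_heads if d_k is None else d_k
    d_v = d_model // n_heads if d_v is None else d_v
    return {
        "W_Q": (d_model, n_heads * d_k),
        "W_K": (d_model, n_heads * d_k),
        "W_V": (d_model, n_heads * d_v),
        "W_O": (n_heads * d_v, d_model),
    }
```

Under these assumptions, `free_head_dims(100, 16)` yields consistent shapes even though 100 is not divisible by 16, whereas `pytorch_style_head_dim(100, 16)` raises an `AssertionError`. That difference is why a divisibility assertion is needed in one convention and not in the other.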