nntrainer
restructure the multi-head attention layer
We can optimize the memory consumption of the multi-head attention layer by recomposing it from a combination of layers. Doing so could reduce memory further:
- compute the attention heads one by one (see the sketch after this list).
- re-implement the multi-head attention layer as a backbone layer.
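
As a rough illustration of the first idea, here is a minimal NumPy sketch (not nntrainer code; all names and shapes are assumptions) showing that computing attention one head at a time keeps only a single `(seq_len, seq_len)` score matrix alive, instead of `num_heads` of them at once:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def mha_head_by_head(q, k, v, num_heads):
    """q, k, v: (seq_len, d_model); d_model must be divisible by num_heads."""
    seq_len, d_model = q.shape
    d_head = d_model // num_heads
    out = np.empty((seq_len, d_model))
    for h in range(num_heads):
        s = slice(h * d_head, (h + 1) * d_head)
        # Only one (seq_len, seq_len) score matrix is live per iteration,
        # instead of num_heads of them in a fully batched implementation.
        scores = softmax(q[:, s] @ k[:, s].T / np.sqrt(d_head))
        out[:, s] = scores @ v[:, s]
    return out
```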
To-do list for option 1:
- [ ] Enhance the split layer to split the input by a given number (the number of heads). #2025
- [ ] Replace the multi-head attention layer by building it as a sub-graph of existing layers (see the sketch below this list).
- [ ] Compare the peak memory consumption and latency before and after the changes
- [ ] Compare the peak memory consumption and latency before and after enabling the swap feature
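
To make the sub-graph idea concrete, here is a hypothetical sketch of the first two items (again plain NumPy, not nntrainer's layer API): the enhanced split layer cuts the feature axis into `num_heads` slices, a single-head attention runs per slice, and a concatenation joins the results:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def single_head_attention(q, k, v):
    # Scaled dot-product attention on one head's slice.
    return softmax(q @ k.T / np.sqrt(q.shape[-1])) @ v

def mha_as_subgraph(q, k, v, num_heads):
    # "Split layer": cut the feature axis into num_heads slices (first item).
    qs = np.split(q, num_heads, axis=-1)
    ks = np.split(k, num_heads, axis=-1)
    vs = np.split(v, num_heads, axis=-1)
    # One single-head attention per slice, then concat (second item).
    heads = [single_head_attention(*hqkv) for hqkv in zip(qs, ks, vs)]
    return np.concatenate(heads, axis=-1)
```

Both formulations should produce the same output as the monolithic layer; the difference lies in peak intermediate memory and latency, which the last two checklist items would quantify.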