Amine Abdaoui
Thanks @liuqiangict. So the query, key, and value weights are shared across all the attention heads of the same layer?