Machine-Learning-Collection

Question in self-attention from 'transformer from scratch'

Open nemo0526 opened this issue 2 years ago • 0 comments

Hello! Your video is very nice, but I still have some trouble when training. I get "RuntimeError: shape '[64, 1024, 8, 128]' is invalid for input of size 65536" when splitting the embedding into self.heads different pieces. My embed_dim is set to 1024, the same as value_len, key_len, and query_len. Does that mean I have to set value_len to 1? Do you know why this happens? Thanks a lot.
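The numbers in the error message can be checked by hand. The sketch below is plain Python rather than PyTorch, and it rests on an assumption that is not stated in the post: that the tensor being reshaped is 2-D with shape (64, 1024), which is one shape that gives exactly 65536 elements. The helper `can_reshape` is a hypothetical name used only for illustration.

```python
from math import prod


def can_reshape(input_shape, target_shape):
    """Return True if both shapes hold the same number of elements."""
    return prod(input_shape) == prod(target_shape)


heads = 8
embed_dim = 1024
head_dim = embed_dim // heads  # 128, matching the last entry in the error

# Shape requested by the reshape in the error message:
# (batch, value_len, heads, head_dim)
target = (64, 1024, heads, head_dim)
print(prod(target))  # 67108864 elements would be needed

# Assumption: the input is (batch, features) with no sequence dimension.
flat_input = (64, 1024)
print(prod(flat_input))  # 65536, the size reported in the error
print(can_reshape(flat_input, target))  # False

# With an explicit sequence length of 1, the element counts agree.
seq_input = (64, 1, embed_dim)
print(can_reshape(seq_input, (64, 1, heads, head_dim)))  # True
```

If that assumption holds, the second dimension (1024) is being read as value_len even though it is really the embedding dimension, so the reshape asks for 1024 times more elements than the tensor has. Inserting a sequence dimension of length 1, for example with `unsqueeze(1)` on the input, would be one way to make the shapes agree, which matches the guess about setting value_len to 1.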

nemo0526, Mar 06 '23 15:03