picoGPT

Is it better to return x[-1] @ wte.T?

Open Sandy4321 opened this issue 1 year ago • 0 comments

Would it be better to replace return x @ wte.T # [n_seq, n_embd] -> [n_seq, n_vocab] with
return x[-1] @ wte.T # [n_embd] -> [n_vocab]?

Then the caller could use next_id = np.argmax(logits) directly, with no need to index the last position first.
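A quick NumPy check of the idea, as a sketch only: the sizes and the random arrays are made-up stand-ins for the real hidden states and the wte matrix, not values from the repository. It compares projecting every position with projecting only the last one.

```python
import numpy as np

rng = np.random.default_rng(0)
n_seq, n_embd, n_vocab = 5, 8, 11  # arbitrary toy sizes (assumption), not GPT-2's

x = rng.standard_normal((n_seq, n_embd))      # stand-in for the final hidden states
wte = rng.standard_normal((n_vocab, n_embd))  # stand-in for the token embedding matrix

# Current form: logits for every position.
full_logits = x @ wte.T       # [n_seq, n_embd] -> [n_seq, n_vocab]

# Proposed form: logits for the last position only.
last_logits = x[-1] @ wte.T   # [n_embd] -> [n_vocab]

# Greedy next-token choice under each form.
next_id_full = np.argmax(full_logits[-1])
next_id_last = np.argmax(last_logits)

print(full_logits.shape, last_logits.shape)
print(next_id_full == next_id_last)
```

For greedy decoding the two forms should pick the same token, and the last-row version skips the projection for the other n_seq - 1 positions. The trade-off is that the function would no longer return per-position logits, which are still needed if anything else relies on them (for example, scoring a whole prompt or computing a loss).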

Sandy4321, Jan 18 '24 20:01