
[QUESTION] Why replace F.embedding() with [] indexing in the VocabParallelEmbedding class?

Open starkhu opened this issue 10 months ago • 0 comments

@jon-barker Hello Jon, I have some questions about the embedding; could you help explain? Why was F.embedding(masked_input, self.weight) replaced with self.weight[masked_input] in the forward() function of class VocabParallelEmbedding? What is the difference between them? Why can F.embedding() introduce 'non-determinism'?

Link: https://github.com/NVIDIA/Megatron-LM/blob/core_r0.5.0/megatron/core/tensor_parallel/layers.py#L218
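
For reference, here is a minimal sketch of the two lookups being compared. It is not the Megatron-LM code: the table size, the token ids, and the names weight_a and weight_b are made up for illustration, and it assumes F.embedding() is called with its default arguments. It only compares forward outputs and gradients on CPU, so it does not reproduce any non-determinism.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical small embedding table standing in for self.weight.
# Two independent copies so each lookup accumulates its own gradient.
weight_a = torch.randn(10, 4, requires_grad=True)
weight_b = weight_a.detach().clone().requires_grad_(True)

# Hypothetical token ids; some ids repeat so that several positions
# contribute gradient to the same row of the table.
masked_input = torch.tensor([[1, 3, 3], [7, 1, 0]])

# Lookup 1: the functional embedding call.
out_functional = F.embedding(masked_input, weight_a)

# Lookup 2: plain tensor indexing along the first dimension.
out_indexing = weight_b[masked_input]

# Both gather the same rows, so the forward outputs should match exactly.
forward_equal = torch.equal(out_functional, out_indexing)

# Backpropagate the same scalar through each path and compare gradients.
out_functional.sum().backward()
out_indexing.sum().backward()
grads_close = torch.allclose(weight_a.grad, weight_b.grad)

print("forward outputs equal:", forward_equal)
print("gradients close:", grads_close)
```

In this small CPU check the two paths agree, so whatever difference motivates the change is not visible in the forward values here.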

starkhu, Apr 09 '24 02:04