diffusion
question about time embedding
def get_timestep_embedding(timesteps, embedding_dim: int):
    """
    From Fairseq.
    Build sinusoidal embeddings.
    This matches the implementation in tensor2tensor, but differs slightly
    from the description in Section 3.5 of "Attention Is All You Need".
    """
    assert len(timesteps.shape) == 1  # and timesteps.dtype == tf.int32

    half_dim = embedding_dim // 2
    emb = math.log(10000) / (half_dim - 1)
I don't understand why (half_dim - 1) is used here. According to the Transformer's positional encoding formula, it should be "emb = math.log(10000) / half_dim". I don't think 1 should be subtracted from half_dim here.
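To make the difference concrete, here is a minimal sketch comparing the two denominators. It assumes the scale is then used as exp(-i * emb) for i = 0 .. half_dim - 1, which the quoted snippet does not show; the helper name frequencies and the value half_dim = 4 are made up for illustration only.

```python
import math


def frequencies(half_dim: int, denominator: int) -> list:
    """Return exp(-i * log(10000) / denominator) for i in range(half_dim).

    Assumption: this is how the scale `emb` is turned into per-channel
    frequencies; the quoted snippet stops before that step.
    """
    scale = math.log(10000) / denominator
    return [math.exp(-i * scale) for i in range(half_dim)]


half_dim = 4  # small illustrative value, not taken from the repository

# Denominator as in the quoted code: half_dim - 1
quoted = frequencies(half_dim, half_dim - 1)

# Denominator as in the question's reading of the paper: half_dim
paper = frequencies(half_dim, half_dim)

print("half_dim - 1:", quoted)
print("half_dim    :", paper)
```

Under these assumptions both variants start at a frequency of 1. With (half_dim - 1) the last frequency lands on 1/10000, so the geometric progression includes both endpoints. With half_dim the last frequency is 10000 ** (-(half_dim - 1) / half_dim), which stops one step short of 1/10000. This may be the "differs slightly" that the docstring mentions.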