FaceFormer
Getting the same hidden-state values from Wav2Vec2 for my dataset
Hey @EvelynFan, I tried to train the model on my custom dataset, but Wav2Vec2 is producing the same hidden-state values for all audio frames. Here is the output for reference:
torch.Size([1, 88800])
hidden_states: tensor([[[-0.0847, 0.0599, -0.0042, ..., 0.1818, 0.0301, -0.0014],
[-0.0847, 0.0599, -0.0042, ..., 0.1818, 0.0301, -0.0014],
[-0.0847, 0.0599, -0.0042, ..., 0.1818, 0.0301, -0.0014],
...,
[-0.0847, 0.0599, -0.0042, ..., 0.1818, 0.0301, -0.0014],
[-0.0847, 0.0599, -0.0042, ..., 0.1818, 0.0301, -0.0014],
[-0.0847, 0.0599, -0.0042, ..., 0.1818, 0.0301, -0.0014]]],
device='cuda:0')
Can you suggest a way to fix this? Thanks.
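In case it helps to narrow this down, here is a minimal debugging sketch (an assumption on my part, not part of the FaceFormer code) using the standard Hugging Face transformers API with a 16 kHz mono wav. It checks whether the hidden states actually vary across frames for one of your audio files. If the per-frame standard deviation is near zero here as well, the input audio is probably silent, all-zero, or loaded at the wrong sample rate; if it varies here but not in your training run, the dataset preprocessing is the place to look. The file path is a placeholder.

```python
# Hypothetical debugging sketch, not part of FaceFormer itself.
import torch
import soundfile as sf
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

AUDIO_PATH = "sample.wav"  # assumption: a 16 kHz mono wav from the custom dataset

# Load the raw audio and confirm it is not silent / all zeros.
speech, sr = sf.read(AUDIO_PATH)
print("sample rate:", sr, "min/max:", speech.min(), speech.max())

feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base-960h").eval()

inputs = feature_extractor(speech, sampling_rate=sr, return_tensors="pt")

with torch.no_grad():
    hidden_states = model(inputs.input_values).last_hidden_state  # (1, frames, 768)

# If the standard deviation across frames is ~0, every frame is identical,
# which points at the audio/preprocessing rather than the model.
print("hidden_states shape:", hidden_states.shape)
print("std across frames:", hidden_states.std(dim=1).mean().item())
```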
I have the same question.
@xiaodongyichuan @ujjawalcse Did anyone fix this problem?