Questions about testing results
Thanks for the great work! I tried to reproduce the results and ran into some issues.
Following the instructions, I evaluated the provided checkpoint downloaded from Hugging Face.
I ran the following commands:

```bash
python -m test --cfg configs/config_h3d_stage3.yaml --task t2m
python -m test --cfg configs/config_h3d_stage3.yaml --task m2t
```
The evaluation results are inconsistent with those reported in the paper. The log and metrics are attached.
t2m results:
log_2023-10-04-19-56-23_test.log
Would you happen to have any idea about what's wrong with the configuration?
For the m2t task, the testing process gets stuck at the 4th replication due to a SIGTERM signal.
As with t2m, the test results fall short of those reported in the paper, especially Bleu@4 and CIDEr, which come out at only around 6 and 7, respectively.
I would appreciate it if you could find time to help with this issue. 😄
@weleen hi! Has this issue been resolved? We met the same issue.
@LinghaoChan I think there are some mistakes in `get_motion_embeddings`.
In m2t.py: https://github.com/OpenMotionLab/MotionGPT/blob/0499f16df4ddde44dfd72a7cbd7bd615af1b1a94/mGPT/metrics/m2t.py#L325-L329
In t2m.py: https://github.com/OpenMotionLab/MotionGPT/blob/0499f16df4ddde44dfd72a7cbd7bd615af1b1a94/mGPT/metrics/t2m.py#L251-L254
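For reference, the lines at those permalinks (commit 0499f16) are:

```python
# mGPT/metrics/m2t.py, lines 325-329
m_lens = torch.div(m_lens,
                   self.cfg.DATASET.HUMANML3D.UNIT_LEN,
                   rounding_mode="floor")
ref_mov = self.t2m_moveencoder(feats_ref[..., :-4]).detach()
m_lens = m_lens // self.unit_length
```

```python
# mGPT/metrics/t2m.py, lines 251-254
m_lens = torch.div(m_lens,
                   self.cfg.DATASET.HUMANML3D.UNIT_LEN,
                   rounding_mode="floor")
m_lens = m_lens // self.cfg.DATASET.HUMANML3D.UNIT_LEN
```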
`m_lens` is divided by the unit length twice.
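Assuming the second division is simply redundant, a minimal patch (my sketch, not an official fix) is to keep only the first one; the extra line in t2m.py would be dropped the same way:

```python
# m2t.py, get_motion_embeddings: divide m_lens by the unit length once.
m_lens = torch.div(m_lens,
                   self.cfg.DATASET.HUMANML3D.UNIT_LEN,
                   rounding_mode="floor")
ref_mov = self.t2m_moveencoder(feats_ref[..., :-4]).detach()
# m_lens = m_lens // self.unit_length  # redundant second division, removed
```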
However, even after I fix these errors, the results are still different. Have you solved this issue?
same issue
@weleen hi! Has this issue been resolved? We met the same issue.
hi, me too.
Me too. Furthermore, I can't reproduce the m2t scores reported in the MotionGPT and MotionGPT-2 papers.
When I run test.py in this repository, R-Precision and MM Dist deviate from the papers' numbers by 0.2 and 0.07 points, respectively.
I consider this a very problematic deviation.
It looks like the variable `m_lens` is not actually used:

```python
def forward(self, inputs, m_lens):
    num_samples = inputs.shape[0]
    input_embs = self.input_emb(inputs)
    hidden = self.hidden.repeat(1, num_samples, 1)
    cap_lens = m_lens.data.tolist()
    # emb = pack_padded_sequence(input=input_embs, lengths=cap_lens, batch_first=True)
    emb = input_embs
    gru_seq, gru_last = self.gru(emb, hidden)
    gru_last = torch.cat([gru_last[0], gru_last[1]], dim=-1)
    return self.output_net(gru_last)
```
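If one wanted `m_lens` to actually take effect, a minimal sketch (my assumption, not the authors' fix) restores the commented-out packing; `enforce_sorted=False` is my addition so the batch does not need to be pre-sorted by length:

```python
from torch.nn.utils.rnn import pack_padded_sequence

def forward(self, inputs, m_lens):
    num_samples = inputs.shape[0]
    input_embs = self.input_emb(inputs)
    hidden = self.hidden.repeat(1, num_samples, 1)
    cap_lens = m_lens.data.tolist()
    # Pack so the GRU stops at each sample's true length instead of
    # reading padded timesteps.
    emb = pack_padded_sequence(input=input_embs, lengths=cap_lens,
                               batch_first=True, enforce_sorted=False)
    gru_seq, gru_last = self.gru(emb, hidden)
    # With enforce_sorted=False, PyTorch restores the original batch
    # order in gru_last, so downstream code is unchanged.
    gru_last = torch.cat([gru_last[0], gru_last[1]], dim=-1)
    return self.output_net(gru_last)
```

Note that if the evaluator checkpoint was trained without packing, changing this at test time could itself shift the metrics.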
I am also facing the same issue.
So far, I have tried the following modifications:
- I commented out the redundant division mentioned by @weleen;
- I used `emb = pack_padded_sequence(input=input_embs, lengths=cap_lens, batch_first=True)` so that `m_lens` is actually used, following the comment by @feifeifeiliu.
With these changes, the T2M results of the provided checkpoint (from Hugging Face) are as follows:
Compared with the results posted by @weleen, my modifications improve Matching_score and R_precision, but the other metrics get even worse.
Most importantly, there is still a clear gap between these results and those reported in the paper, even when comparing against MotionGPT (Pre-trained).
I really hope the authors, or someone who has successfully reproduced this work, can provide some hints on this issue. 🙏