Question about evaluation on KIT-ML
Hello, thank you for such a great project. I have a question about "Table 2: APE and AVE benchmark on the KIT dataset":
The evaluation results for Lin et al. (2018), Language2Pose, Ghosh et al. (2021), and TEMOS are exactly the same as those in the TEMOS paper. However, the comparison in your paper is reported as the average of 3 runs.
How can these numbers be compared with those from other papers? Is the comparison fair? In addition, the project does not include a pretrained model or the training and evaluation code for KIT-ML. How can I reproduce your results on KIT-ML?