VBench Vbench Leaderboard

Hi,

How can I get the Total Score, Quality Score, Semantic Score and Selected Score locally?

Jun 24 '24 03:06 BeachWang

We have updated the offline score calculation script for T2V models at https://github.com/Vchitect/VBench/blob/master/README.md#get-final-score-and-submit-to-leaderboard. You can try it locally before you submit.

Jun 28 '24 06:06 yinanhe

Thank you very much for your response. I have carefully reviewed this portion of the code and noticed that static metrics, such as temporal flickering and motion smoothness, are assigned high weights. On one hand, QUALITY_WEIGHT = 4 will quadruple their effect, and on the other, the parameters in NORMALIZE_DIC may amplify them by a factor of 3 to 5. Have you conducted any experiments previously or do you have any rationale behind these choices?
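To make the second effect concrete, here is a minimal sketch of min-max normalization and the amplification it implies. The bounds below are made-up values for illustration only; they are not copied from VBench's NORMALIZE_DIC, and the helper names are hypothetical.

```python
# Hypothetical bounds for illustration; NOT the values in VBench's NORMALIZE_DIC.
HYPOTHETICAL_BOUNDS = {
    "motion_smoothness": {"Min": 0.70, "Max": 1.00},
    "temporal_flickering": {"Min": 0.75, "Max": 1.00},
}


def min_max_normalize(raw, lo, hi):
    """Map a raw score from the range [lo, hi] onto [0, 1]."""
    return (raw - lo) / (hi - lo)


def amplification(lo, hi):
    """Factor by which a difference in raw scores grows after normalization."""
    return 1.0 / (hi - lo)


# Amplification factor per dimension under the assumed bounds.
factors = {
    dim: amplification(b["Min"], b["Max"])
    for dim, b in HYPOTHETICAL_BOUNDS.items()
}

# Two raw scores for one dimension, and their gap after normalization.
b = HYPOTHETICAL_BOUNDS["temporal_flickering"]
gap_after = min_max_normalize(0.97, b["Min"], b["Max"]) - min_max_normalize(
    0.96, b["Min"], b["Max"]
)
```

Under these assumed bounds, a narrow [Min, Max] range is what produces the amplification: the narrower the range, the larger the factor 1 / (Max - Min).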

Jul 08 '24 12:07 BeachWang

(1) The weights balance the influence of each dimension, since different metrics have different score ranges. (2) There are fewer quality-related metrics than video-text alignment (semantic) metrics, and the weights partially help keep the quality metrics from being underrepresented. (3) The rationale is for the overall ranking to faithfully reflect a model's holistic performance. We ran experiments on earlier models, where the gaps in ranking and model quality are prominent.
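As a rough illustration of point (2), the sketch below combines a quality score and a semantic score as a weighted average. QUALITY_WEIGHT = 4 is the value quoted in this thread; SEMANTIC_WEIGHT = 1 and the function name are assumptions made for this example, so check the linked script for the actual definitions.

```python
QUALITY_WEIGHT = 4   # value quoted in this thread
SEMANTIC_WEIGHT = 1  # assumed for this sketch; verify against the script


def total_score(quality, semantic):
    """Weighted average of the quality and semantic scores (hypothetical helper)."""
    weighted_sum = quality * QUALITY_WEIGHT + semantic * SEMANTIC_WEIGHT
    return weighted_sum / (QUALITY_WEIGHT + SEMANTIC_WEIGHT)


# Made-up input scores, for illustration only.
example = total_score(quality=0.8, semantic=0.6)
```

With these assumed weights, the quality score contributes 4/5 of the total and the semantic score 1/5, regardless of how many individual dimensions feed into each.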

May 06 '25 08:05 ziqihuangg