Results: 91 comments of yxchng

@hello-bluedog These are only the annotations; there are no videos. What I'm asking is where to download the corresponding videos.

@PhoenixZ810 Can lmdeploy use multiple GPUs? Right now evaluation is extremely slow, especially when evaluating R1-like models that produce tens of thousands of output tokens.

Is AmazonCounterfactualClassification 83.16 or 86.15? Why does the leaderboard show 86.15? My reproduction gives 83.16, similar to the number reproduced in this issue above.

@KennethEnevoldsen Yes, this is how I run it. The result above also shows 83.16. Is the result in the screenshot wrong?

@jackyoung96 Are you saying that you are able to get 70.7 and 77.0 for llama-3.1-8b-instruct?

@ganler Are you aware of any methods I can use to bring the results closer?