Claude-Liu

Results: 6 comments by Claude-Liu

> It seems like `AsyncRolloutRequest` will update the position_ids and discard `prompts['position_ids']`.

Yes. `AsyncRolloutRequest._get_position_ids` should be updated. We can also use agent_loop instead, by setting `rollout.mode = async`.

Hi, I tried to add an ICL (in-context learning) feature to evaluate_vqa.py, as shown below. However, 1-, 3-, and 5-shot prompting all decrease the performance of intervl-8b on textvqa, vizwiz, and okvqa. I am reopening this issue...

Thank you for your quick and clear response! _"We believe this might be because ICL changes the distribution of the model's outputs, causing the predicted results to fail to match...

    class AgentLoopWorkerBase:
        """Agent loop worker takes a batch of messages and run each message in an agent loop."""

        def __init__(
            self,
            config: DictConfig,
            server_handles: list[ray.actor.ActorHandle],
            reward_router_address: str = None,
        ):
            ...

I see. It is not a good question. Thanks for your patience!

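Here is the best set of inference parameters I found after trying several (although the differences were all fairly small): `generation_dict = {"do_sample": True, "temperature": 0.2, "top_p": 0.3, "top_k": 100, "max_new_tokens": 100}`. Model: qwen2-72b-instruct; textvqa vqa_score: 0.770. The result for qwen2-7b-instruct matches the official result, and my tests of qwen2-72b-instruct on some other datasets are also largely consistent with the official results. I am a bit puzzled and cannot tell which detail I got wrong. If anyone from the Qwen team sees this, I would appreciate an explanation. Thanks 🙏!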