gantuo

Results 1 issues of gantuo

As I understand the code below, for each question only the score of the 0th of the n samples is taken, and the mean of those scores is reported as acc. If so, setting n_sampling > 1 has no effect on the reported accuracy. [evaluate.py#line78](https://github.com/QwenLM/Qwen2.5-Math/blob/a45202bd16f1ec06f433442dc1152d0074773465/evaluation/evaluate.py#L78)

```python
score_mat = []
for sample in samples:
    sample['score'] = scores[idx: idx+len(sample['pred'])]
    assert len(sample['score']) == len(sample['pred'])
    score_mat.append(sample['score'])
    idx += len(sample['pred'])

max_len = max([len(s) for s in score_mat])

for...
```
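To make the concern concrete, here is a minimal sketch (not the repository's code) of how the reported number can differ depending on which samples are used. It assumes `score_mat` holds one row per question and one boolean per sample; the helper names `first_sample_acc`, `all_samples_acc`, and `any_correct_acc` and the toy data are hypothetical, introduced only for illustration.

```python
from statistics import mean


def first_sample_acc(score_mat):
    """Accuracy using only sample 0 of each question (the behavior described above)."""
    return mean(float(row[0]) for row in score_mat)


def all_samples_acc(score_mat):
    """Accuracy averaged over every sample of every question."""
    return mean(float(s) for row in score_mat for s in row)


def any_correct_acc(score_mat):
    """Fraction of questions with at least one correct sample."""
    return mean(float(any(row)) for row in score_mat)


# Hypothetical scores: 3 questions, n_sampling = 4 (toy data, not real results).
score_mat = [
    [True, False, False, False],
    [False, True, True, False],
    [False, False, False, False],
]

print(first_sample_acc(score_mat))  # uses column 0 only
print(all_samples_acc(score_mat))   # uses all 12 scores
print(any_correct_acc(score_mat))   # uses all samples, per question
```

If acc is computed as in `first_sample_acc`, samples 1 through n-1 never influence it, which is the point of the question: with n_sampling > 1 one would probably expect something closer to the second or third variant.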