Wang Siyuan
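> Hello,
>
> My results for Llama-3.1-8B-Instruct are as follows:
>
> | Model | Overall | Easy | Hard | Short | Medium | Long |
> |---|---|---|---|---|---|---|
> | Llama-3.1-8B-Instruct | 29.0 | 30.7 | 28.0 | 33.9 | 25.6 | 27.8 |
>
> This differs from the Overall score on the leaderboard by one point (29.0 vs 30.0). My environment is as follows:
>
> ...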
> Thanks for pointing it out! We will soon have our annotator check the data and update the dataset.

Thank you for your prompt response and for looking into the...
Same problem here.
> Hi! So the dataset we are using is missing the fewshot split. It uses the test split for the fewshot samples and looks like one of the rows in...
The randomness here is not only caused by the random seed. If not explicitly set, the default random seed is 0, so it is reasonable that the results may differ...
[input_embeds not checking pad token](https://github.com/huggingface/transformers/blob/3f06f95ebe617b192251ef756518690f5bc7ff76/src/transformers/models/llama/modeling_llama.py#L1316C17-L1318C70)

```python
if self.config.pad_token_id is None:
    sequence_lengths = -1
else:
    if input_ids is not None:
        # if no pad token found, use modulo instead of reverse...
```
> [input_embeds not checking pad token](https://github.com/huggingface/transformers/blob/3f06f95ebe617b192251ef756518690f5bc7ff76/src/transformers/models/phi3/modeling_phi3.py#L1490C9-L1499C38)
>
> ```python
> if self.config.pad_token_id is None:
>     sequence_lengths = -1
> else:
>     if input_ids is not None:
>         # if no...
> ```
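
To make the concern in the two excerpts concrete, here is a minimal pure-Python sketch of the branch logic for a single right-padded row. The helper name `last_token_index` is hypothetical and is not part of transformers. The branch taken when `input_ids` is `None` is an assumption based on the link title (the linked code is truncated here), so treat it as an illustration rather than the library's exact behavior.

```python
from typing import List, Optional


def last_token_index(
    input_ids: Optional[List[int]],
    pad_token_id: Optional[int],
) -> int:
    """Return the index of the token used for pooling in one right-padded row.

    Hypothetical helper for illustration; not part of transformers.
    """
    if pad_token_id is None:
        # No pad token configured: fall back to the final position.
        return -1
    if input_ids is None:
        # Assumption: with only embeddings available there are no ids to
        # compare against the pad token, so the final position is used.
        return -1
    seq_len = len(input_ids)
    # Position of the first pad token, or 0 when the row has no padding.
    first_pad = next(
        (i for i, tok in enumerate(input_ids) if tok == pad_token_id), 0
    )
    # Modulo maps the "no padding" case (0 - 1 = -1) to seq_len - 1.
    return (first_pad - 1) % seq_len


padded_row = [5, 6, 7, 0, 0]  # pad_token_id = 0
print(last_token_index(padded_row, pad_token_id=0))  # index of token 7
print(last_token_index(None, pad_token_id=0))        # embeddings-only path
```

Under these assumptions, the id-based path selects the last real token, while the embeddings-only path returns `-1`, which would point at a padding position whenever the embeddings were padded on the right.
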