Yushi Bai comments

Results: 102 comments of Yushi Bai

Clicking Submit shows an error on the page, and the console shows AttributeError: 'PreTrainedTokenizerFast' object has no attribute 'build_chat_input'

Hi, the trans_web_demo.py example we provide is written for the LongWriter-glm4-9b model. To try this demo with LongWriter-llama3.1-8b you need to change the code; please refer to the inference code in the [hf readme](https://huggingface.co/THUDM/LongWriter-llama3.1-8b).

eos_indices = input_ids.argmin(dim=1) - 1

These two lines do the same thing and can be unified as `eos_indice = (input_id == EOS_ID).int().argmax().item()`
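As a rough illustration of what that one-liner computes, here is a plain-Python mirror for a single 1-D sequence. The function name `first_eos_index` and the sample token ids are made up for this sketch; it relies on `argmax` returning the first index of the maximum, so the result is 0 when no EOS token is present, a case callers may want to check separately.

```python
def first_eos_index(input_id, eos_id):
    """Plain-Python mirror of (input_id == eos_id).int().argmax().item().

    Assumes input_id is a non-empty 1-D sequence of token ids.
    """
    # 1 where the token equals EOS, 0 elsewhere (the ".int()" step)
    mask = [int(tok == eos_id) for tok in input_id]
    # argmax picks the first position holding the maximum value
    return mask.index(max(mask))


# Hypothetical token ids with EOS_ID = 2
print(first_eos_index([5, 7, 2, 9, 2], eos_id=2))  # first EOS position
print(first_eos_index([5, 7, 9], eos_id=2))        # no EOS: falls back to 0
```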

eos_indices = input_ids.argmin(dim=1) - 1

Thanks for the reminder, I will update the code in the next couple of days.

ValueError: You are trying to offload the whole model to the disk. Please use the `disk_offload` function instead.

Hi, can you try the solution here? https://github.com/huggingface/accelerate/issues/2726

ValueError: You are trying to offload the whole model to the disk. Please use the `disk_offload` function instead.

Hi, we have updated the model's [modeling_chatglm.py](https://huggingface.co/THUDM/LongWriter-glm4-9b/blob/main/modeling_chatglm.py) and the [trans_web_demo.py](https://github.com/THUDM/LongWriter/blob/main/trans_web_demo.py) on GitHub. Could you check whether the problem persists after updating?

About token handling for Claude and Gemini

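Since Claude does not provide its tokenizer, we also use GPT-4o's tiktoken for truncation when testing it, and use binary search to find the truncation length at which the request no longer fails (an input longer than the model's context window raises an error). We have not evaluated Gemini yet.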
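The binary search mentioned above could look roughly like the sketch below. It is not the evaluation code itself: `max_safe_length` and the `fits` predicate are hypothetical names, and `fits(n)` stands in for "truncate the input to n tiktoken tokens, send the request, and report whether it succeeded". The sketch assumes `fits` is monotone (if length n works, every shorter length works too) and that `fits(0)` is true.

```python
def max_safe_length(fits, upper):
    """Return the largest n in [0, upper] for which fits(n) is True.

    Assumes fits is monotone: True up to some threshold, False beyond it.
    """
    lo, hi = 0, upper
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the loop always makes progress
        if fits(mid):
            lo = mid              # mid is accepted, try longer inputs
        else:
            hi = mid - 1          # mid is rejected, try shorter inputs
    return lo


# Stand-in for a real API call: pretend the target model counts 50 more
# tokens than tiktoken does and has a 1000-token window (made-up numbers).
def fake_fits(n):
    return n + 50 <= 1000


print(max_safe_length(fake_fits, upper=2000))
```

Each probe corresponds to one API request, so the search needs only about log2(upper) requests instead of trying every length.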

No initialization for the process group

Hi, you can change this line to

```python
if dist.is_initialized():
    dist.destroy_process_group()
```

to avoid the assertion error.

About CoT

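Hi, with CoT the first request asks the model to output a chain of thought (let's think step by step), and the second request asks the model to output its final choice based on the question and the chain of thought from the previous step. Please refer to our prompts: https://github.com/THUDM/LongBench/blob/main/prompts/0shot_cot.txt https://github.com/THUDM/LongBench/blob/main/prompts/0shot_cot_ans.txt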

About CoT

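In the [second request](https://github.com/THUDM/LongBench/blob/main/prompts/0shot_cot_ans.txt), the "COT" in *Let’s think step by step: COT* is meant to be replaced with the CoT output obtained from the first request; the second request only asks the model to summarize that CoT into a final answer. We describe the CoT evaluation method in Appendix D of the [paper](https://arxiv.org/pdf/2412.15204); the approach follows [GPQA](https://arxiv.org/pdf/2311.12022).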
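A minimal sketch of this two-request flow follows. The two templates are simplified stand-ins, not the actual contents of the linked prompt files, and `answer_with_cot` and `call_model` are hypothetical names; `call_model` is assumed to be any function that takes a prompt string and returns the model's text.

```python
# Simplified stand-in templates (assumptions, not the real prompt files)
COT_TEMPLATE = "{question}\nLet's think step by step:"
ANS_TEMPLATE = (
    "{question}\nLet's think step by step: {cot}\n"
    "Based on the above, what is the single, most likely answer choice?"
)


def answer_with_cot(question, call_model):
    """Run the two-request CoT flow and return (cot, final_answer)."""
    # Request 1: ask only for the chain of thought
    cot = call_model(COT_TEMPLATE.format(question=question))
    # Request 2: paste the CoT into the second prompt and ask for the choice
    final = call_model(ANS_TEMPLATE.format(question=question, cot=cot))
    return cot, final


# Usage with a fake model that records the prompts it receives
prompts_seen = []

def fake_model(prompt):
    prompts_seen.append(prompt)
    return "Option B matches the passage." if len(prompts_seen) == 1 else "B"

cot, final = answer_with_cot("Which option is correct? (A) ... (B) ...", fake_model)
print(final)
```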

About CoT

Using a single prompt should also work, but in the zero-shot setting the model may not follow the instructions well.