Yushi Bai
> # Error message when running ' ./scripts/glm4_longwriter.sh': > KeyError: '' Using unk_token, but it is not set yet. Traceback (most recent call last): File "/root/AI4E/ljc/LongWriter/train/main.py", line 139, in train() File "/root/AI4E/ljc/LongWriter/train/main.py",...
> # Two types of errors are now being reported > * System environment: > > * python==3.11.9 > * transformers==4.33.0 > * pytorch==2.2.0 > * Both `modeling_chatglm.py` and `tokenization_chatglm.py` under the /glm-4-9b directory have already been replaced > > ## Error 1: RuntimeError: shape '[32768, -1,...
Judging from https://github.com/hiyouga/LLaMA-Factory/issues/5252, the `"stage3_prefetch_bucket_size": "auto"` error can be resolved by downgrading DeepSpeed; try `pip install deepspeed==0.14.4`.
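If pinning DeepSpeed is not an option, another possible workaround is to write an explicit integer into the ZeRO-3 config before launching. This is only a sketch: the config path and the computed bucket value below are my assumptions, not something from the linked issue.

```python
# Sketch: replace "auto" with an explicit integer for stage3_prefetch_bucket_size,
# which newer DeepSpeed versions refuse to parse as a string.
import json

CONFIG_PATH = "ds_config/stage3.json"  # hypothetical path to the ZeRO-3 config used by the script

with open(CONFIG_PATH) as f:
    cfg = json.load(f)

zero = cfg.setdefault("zero_optimization", {})
if zero.get("stage3_prefetch_bucket_size") == "auto":
    # "auto" is normally resolved to roughly 0.9 * hidden_size * hidden_size;
    # hidden_size=4096 here is an assumption for glm-4-9b.
    zero["stage3_prefetch_bucket_size"] = int(0.9 * 4096 * 4096)

with open(CONFIG_PATH, "w") as f:
    json.dump(cfg, f, indent=2)
```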
@LYCnight @badarrrr Please check whether the FAQ in our [README](https://github.com/THUDM/LongWriter/blob/main/train/README.md) resolves the issues you ran into. Sorry to keep you waiting.
Thanks for pointing it out! We will soon have our annotator check the data and update the dataset.
Hi, I'm sorry, but I don't think an RTX 3070 Ti has enough memory for training or running the LongWriter model. We train on 8xH800 (80GB) GPUs for full fine-tuning (LoRA and quantization...
Hi, you can use the evaluation code in [eval_quality.py](https://github.com/THUDM/LongWriter/blob/main/evaluation/eval_quality.py) to evaluate the generation quality. Replace the API call `get_response_gpt4` with a call to your local LLM.
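As a rough illustration, a drop-in replacement could look like the sketch below. The wrapper name `get_response_local`, the model path, and the default generation arguments are assumptions; match them to `get_response_gpt4`'s actual signature in eval_quality.py.

```python
# Sketch: query a local Hugging Face model instead of the GPT-4 API.
# Model path and defaults are illustrative, not part of the LongWriter codebase.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_PATH = "THUDM/LongWriter-glm4-9b"  # hypothetical local judge model

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_PATH, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True
)

def get_response_local(prompt: str, temperature: float = 0.5, max_new_tokens: int = 1024) -> str:
    """Mirrors get_response_gpt4's role: prompt in, generated text out."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(
        **inputs,
        do_sample=temperature > 0,
        temperature=temperature,
        max_new_tokens=max_new_tokens,
    )
    # Strip the prompt tokens and return only the newly generated text.
    return tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
```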
We use temperature=0.6, top_p=0.95, min_p=0, max_new_tokens=30000, and max_input_len=100000. Remember to enable YaRN while evaluating.
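A minimal sketch of applying these settings with transformers is below. The model name, the YaRN `rope_scaling` values, and the prompt are placeholders to be adapted to your setup, and `min_p` sampling requires a reasonably recent transformers version.

```python
# Sketch: generation with the sampling settings quoted above.
import torch
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

MODEL = "THUDM/LongWriter-llama3.1-8b"  # hypothetical model under evaluation

config = AutoConfig.from_pretrained(MODEL, trust_remote_code=True)
# Placeholder YaRN settings -- take the factor and original context length
# from the model card / config.json of the model you are evaluating.
config.rope_scaling = {
    "rope_type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 32768,
}

tokenizer = AutoTokenizer.from_pretrained(MODEL, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL, config=config, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True
)

prompt = "Write a 10000-word story about ..."  # placeholder prompt
# Enforce max_input_len=100000 by truncating the prompt tokens.
input_ids = tokenizer(
    prompt, return_tensors="pt", truncation=True, max_length=100000
).input_ids.to(model.device)

output = model.generate(
    input_ids,
    do_sample=True,
    temperature=0.6,
    top_p=0.95,
    min_p=0.0,
    max_new_tokens=30000,
)
print(tokenizer.decode(output[0][input_ids.shape[1]:], skip_special_tokens=True))
```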
I set timeout=3600. Your YaRN configuration is correct.