
LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs

Results 29 LongWriter issues

### System Info . ### Who can help? . ### Information - [X] The official example scripts - [X] My own modified...

Using the official scripts and dataset, I ran python pre_tokenize_glm4.py and then python sort_and_group.py --group_size 8 --train_file /home/hnjj/diskdata/yuanshi/media/szf/llm/glm_longwrite/LongWriter/train/datasets, which produced attention_masks_pack.json, inputs_pack.npy, and other files. When I ran the training script ./glm4_longwriter.sh, I hit a ValidationError related to the DeepSpeedZeroConfig configuration. The error is caused by an invalid input type for stage3_prefetch_bucket_size: an integer was expected but a float was received. Training log: [2024-08-26 09:58:48,719] [INFO] [comm.py:683:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl...

bug
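
The type mismatch described in the issue above (a float where `stage3_prefetch_bucket_size` expects an integer) can be illustrated with a minimal sketch. It assumes the DeepSpeed settings are available as a plain Python dict with a `zero_optimization` section. The helper name `coerce_int_fields` and the sample values are hypothetical; they are not taken from the LongWriter training scripts, whose actual config is not shown in the issue.

```python
import copy

# Keys that must be integers. Only the key named in the issue is listed here;
# extending this tuple is an assumption left to the reader.
INT_KEYS = ("stage3_prefetch_bucket_size",)


def coerce_int_fields(config: dict, keys=INT_KEYS) -> dict:
    """Return a deep copy of `config` with float values under the given
    `zero_optimization` keys truncated to int. Other values (for example
    the string "auto") are left unchanged."""
    fixed = copy.deepcopy(config)
    zero = fixed.get("zero_optimization", {})
    for key in keys:
        value = zero.get(key)
        if isinstance(value, float):
            zero[key] = int(value)
    return fixed


# Illustrative config: the numeric value is made up for this example.
config = {
    "zero_optimization": {
        "stage": 3,
        "stage3_prefetch_bucket_size": 15099494.4,
    }
}
fixed = coerce_int_fields(config)
print(fixed["zero_optimization"]["stage3_prefetch_bucket_size"])  # 15099494
```

Whether the float comes from the config file itself or from a value computed at runtime is not stated in the issue, so this only shows the coercion step, not where it would need to be applied.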

### Feature request Hi, thanks for a great model. I was wondering if this fits my use case. Essentially I'm looking for grounded generation. E.g., I give it...

enhancement

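I imported the 4-bit quantized GGUF model from Hugging Face into ollama and asked questions through openwebui; output is very slow. On the ollama host, the 4060 Ti 16G GPU uses only 8G of VRAM, the GPU core clock often sits at 210 and rarely reaches its maximum frequency, and the 7950X CPU is at 50% utilization.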

In script/main.py: class DataCollatorForLMDataset(object): def __call__(self, instances: Sequence[Dict]) -> Dict[str, torch.Tensor]: input_ids, labels = tuple([instance[key].unsqueeze(0) for instance in instances] for key in ("input_ids", "labels")) input_ids = torch.cat(input_ids, dim=0) labels = torch.cat(labels,...

### System Info How much VRAM does this need to run? Is an RTX 4090 24G enough? ### Who can help? _No response_ ### Information - [ ] The official example scripts - [...

### System Info transformers: 4.44.0 llama.cpp: latest Hi, when I try to make a gguf I get this error: > Traceback (most recent call last): File "/home/david/llm/llama.cpp/convert_hf_to_gguf.py", line...

### System Info CUDA: 11.1 transformers: 4.44.0 Python: 3.10.0 Operating system: Windows10 64 ### Who can help? _No response_ ### Information - [X] The official example scripts /...

Often it gets stuck here. Like this: ![image](https://github.com/user-attachments/assets/c7ed38cf-0268-416e-8f37-846a14c6ae37) However, I'm running this from a gguf via GPT4ALL in Blender, so there might be multiple things causing this. I just wonder,...