ChatLM-mini-Chinese icon indicating copy to clipboard operation
ChatLM-mini-Chinese copied to clipboard

中文对话0.2B小模型(ChatLM-Chinese-0.2B),开源所有数据集来源、数据清洗、tokenizer训练、模型预训练、SFT指令微调、RLHF优化等流程的全部代码。支持下游任务sft微调,给出三元组信息抽取微调示例。

Results 12 ChatLM-mini-Chinese issues
Sort by recently updated
recently updated
newest added

Detected kernel version 5.4.0, which is below the recommended minimum of 5.5.0; this can cause the process to h ang. It is recommended to upgrade the kernel to the minimum...

是否考虑将预训练的模型和仅stf后的模型也上传的平台呢

使用huggingface实现的sft_train.py 中有实现对应的embeeding和encoder冻结么?

rank1]:[E ProcessGroupNCCL.cpp:537] Some NCCL operations have failed or timed out. Due to the asynchronous nature of CUDA kernels, subsequent GPU operations might run on corrupted/incomplete data. [rank1]:[E ProcessGroupNCCL.cpp:543] To avoid...

现在bellegroup的train_3.5M_CN 数据的格式是这样 { "id":"66623028", "conversations":[ { "from":"human", "value":"归纳介绍三种不同类型的电影题材" }, { "from":"assistant", "value":"1. 动作片:动作片通常强调动作和战斗场面,故事情节紧凑,充满了紧张和兴奋的氛围。主角一般是具有超凡实力或特殊技能的英雄人物,往往会与反派角色进行一系列激烈的格斗,追逐和爆破等酷炫场面。例如《速度与激情》和《复仇者联盟》等。\n2. 爱情片:爱情片是以爱情为主题的电影,着重呈现角色之间的感情纠葛,展现了爱情的美好和悲壮。情节与角色之间的发展令人深思,增加了观众的感性体验。这种电影的故事情节主要以人类与人类之间的情感变化为主题。例如《泰坦尼克号》和《让子弹飞》等。\n3. 科幻片:科幻片是以未来科技和未知世界为主题的电影,通常包括时间旅行、外星人、虚拟现实、未来社会等各种奇幻的元素。这种电影描绘了一种比现实更加宏伟、更加神秘和惊奇的世界。例如《星际穿越》和《发条橙》等。" } ] } 跟train_2M_CN的格式不同,目前的数据处理代码无法处理train_3.5M_CN,这个数据目前是多轮对话的形式,这个数据是直接舍弃,还是可以修改代码再用呢

小模型做其他任务不太行,或许可以挖掘一下agent函数调用的潜力

运行3.4python ptr_train.py时报错 (base) D:\pycharmenv\ChatLM-mini-Chinese>python pre_train.py D:\SOFTWAREE\anacondaa\Lib\site-packages\transformers\utils\generic.py:441: FutureWarning: `torch.utils._pytree._register_pytree_node` is deprecated. Please use `torch.utils._pytree.register_pytree_node` instead. _torch_pytree._register_pytree_node( D:\SOFTWAREE\anacondaa\Lib\site-packages\transformers\utils\generic.py:309: FutureWarning: `torch.utils._pytree._register_pytree_node` is deprecated. Please use `torch.utils._pytree.register_pytree_node` instead. _torch_pytree._register_pytree_node( Traceback (most recent call last):...

我已经把配置文件改小了: `class T5ModelConfig: d_ff: int = 1024 # 全连接层维度 d_model: int = 512 # 词向量维度 num_heads: int = 8 # 注意力头数 d_model // num_heads == d_kv d_kv: int = 64...