textgen: Request to integrate the latest Aquila-7B and baichuan-7B models

[✅] I checked to make sure that this is not a duplicate issue
[ ] I'm submitting the request to the correct repository (for model requests, see here)

Describe the solution you'd like

As the title says, I hope the author can integrate BAAI's Aquila-7B and Baichuan's baichuan-7B. Thanks 🙏

Jun 16 '23 08:06 AILWQ

Training is in progress.

Jun 16 '23 09:06 shibing624

The training code is generic, so I only need to modify it slightly.

Jun 16 '23 09:06 shibing624

Training is in progress.

Thanks!

Jun 16 '23 10:06 AILWQ

Training for baichuan-7B is now supported: https://github.com/shibing624/textgen/blob/main/examples/gpt/training_baichuan_mydata_demo.py

Jun 16 '23 15:06 shibing624

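Training for baichuan-7B is now supported: https://github.com/shibing624/textgen/blob/main/examples/gpt/training_baichuan_mydata_demo.py

Thank you! However, I ran into a bug while running it:

It looks like the input and target dimensions do not match when the cross-entropy loss is computed. Why does this error occur?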

Jun 17 '23 05:06 AILWQ

Have you updated the code? This error usually occurs because input_ids and labels have different dimensions after the collator.
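To illustrate the kind of mismatch described above, here is a minimal sketch of a collate function that pads input_ids and labels to the same length. It is not textgen's actual collator: the names `collate`, `IGNORE_INDEX`, and `pad_token_id` are hypothetical and chosen only for illustration. It assumes labels are padded with -100, which is the default `ignore_index` of PyTorch's `CrossEntropyLoss`.

```python
# Assumption: -100 is used as the label padding value, matching the default
# ignore_index of torch.nn.CrossEntropyLoss.
IGNORE_INDEX = -100


def collate(batch, pad_token_id=0):
    """Pad input_ids and labels of every example to one shared length.

    Hypothetical helper for illustration; not the collator used by textgen.
    """
    for example in batch:
        if len(example["input_ids"]) != len(example["labels"]):
            raise ValueError(
                "input_ids and labels differ in length before padding: "
                f"{len(example['input_ids'])} vs {len(example['labels'])}"
            )

    max_len = max(len(example["input_ids"]) for example in batch)
    input_ids, labels = [], []
    for example in batch:
        pad = max_len - len(example["input_ids"])
        # Pad tokens with pad_token_id and labels with IGNORE_INDEX so that
        # both end up with identical shapes.
        input_ids.append(example["input_ids"] + [pad_token_id] * pad)
        labels.append(example["labels"] + [IGNORE_INDEX] * pad)
    return {"input_ids": input_ids, "labels": labels}


batch = [
    {"input_ids": [5, 6, 7, 8], "labels": [-100, -100, 7, 8]},
    {"input_ids": [5, 9], "labels": [-100, 9]},
]
out = collate(batch)
```

If the two fields were padded or truncated to different lengths, the loss would receive logits and targets of different shapes, which is one plausible way to trigger the reported error.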

Jun 17 '23 06:06 shibing624

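Have you updated the code? This error usually occurs because input_ids and labels have different dimensions after the collator.

I downloaded and installed the latest code, but the problem persists. In addition, I hit another issue when running ChatGLM-6B: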

/data/home/scv9197/.conda/envs/competition/lib/python3.9/site-packages/transformers/tokenization_utils_base.py:731: UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor. (Triggered internally at ../torch/csrc/utils/tensor_new.cpp:245.)
  tensor = as_tensor(value)

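After that, data loading was extremely slow (70,000 samples took two and a half hours to load). This never happened before. Did the author change anything in the data loading code?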
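The warning above itself suggests converting the list to a single numpy.ndarray before creating the tensor. A minimal sketch of that conversion follows; the `features` list is made-up data, and whether this conversion is the cause of the slow loading reported here is not established in this thread.

```python
import numpy as np

# Hypothetical tokenized features: a Python list of equal-length numpy arrays.
features = [np.arange(8, dtype=np.int64) + i for i in range(4)]

# Combine the list into one contiguous array of shape (4, 8) first.
# np.stack requires all arrays to share the same shape.
stacked = np.stack(features)

# With PyTorch installed, torch.as_tensor(stacked) would then convert the
# whole array in one step instead of element by element.
```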

Jun 18 '23 04:06 AILWQ

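I'm not sure what your data format is. Does it include multi-turn dialogue data?

Also, two and a half hours is not normal. Loading usually takes less than 2 minutes.

For Baichuan-7B, I have completed SFT on both the alpaca and belle-multi-round datasets. If your data might be the problem, test with the example data first, and switch to your own data once that works.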

Jun 18 '23 13:06 shibing624

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. (The bot automatically closes this issue after prolonged inactivity; feel free to ask again if needed.)

Dec 27 '23 07:12 stale[bot]
