Results 130 comments of ldwang

Closing for now; please reopen the issue if the problem persists. Thanks.

Closing for now; please reopen the issue if the problem persists. Thanks.

Closing this for now; reopen if needed. Thanks.

You can manually download the models by visiting https://model.baai.ac.cn/models and searching for Aquila. If you run the examples, the models will be downloaded automatically. Hugging Face support is coming soon.

You can edit checkpoints_in/aquilachat-7b/config.json and change `"flash_atten": true` to `"flash_atten": false`.

Alternatively, delete checkpoints_in/aquilachat-7b/config.json and try again.
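The config edit above can be scripted. A minimal sketch, assuming the checkpoint config lives at the path shown in the comments and that `flash_atten` is a top-level JSON key:

```python
import json
from pathlib import Path

def disable_flash_atten(config_path):
    """Set "flash_atten" to false in an Aquila checkpoint config.json."""
    path = Path(config_path)
    cfg = json.loads(path.read_text())
    cfg["flash_atten"] = False  # fall back to the original attention for inference
    path.write_text(json.dumps(cfg, indent=2, ensure_ascii=False))
    return cfg
```

Usage: `disable_flash_atten("checkpoints_in/aquilachat-7b/config.json")`, then rerun inference.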

xformers support may be added later. flash_atten is used mainly for training efficiency; you can disable flash_atten in the model config and run inference with the original attention instead.

There is relatively little multi-turn data in the training corpus. You can try constructing the prompt like this (see https://github.com/FlagAI-Open/FlagAI/blob/master/examples/Aquila/Aquila-chat/generate_chat.py#L39):

```python
conv = default_conversation.copy()
conv.append_message(conv.roles[0], human_text)
conv.append_message(conv.roles[1], bot_text)
conv.append_message(conv.roles[0], text)
conv.append_message(conv.roles[1], None)
```
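For intuition, here is a toy sketch of how such a conversation object could render alternating turns into a single prompt string. The `Conversation` class, role names, and separator below are illustrative assumptions, not the actual FlagAI implementation (see generate_chat.py at the link above for the real one):

```python
import copy

class Conversation:
    """Toy stand-in for default_conversation (illustrative only)."""

    def __init__(self):
        self.roles = ("Human", "Assistant")
        self.messages = []

    def copy(self):
        return copy.deepcopy(self)

    def append_message(self, role, text):
        self.messages.append((role, text))

    def get_prompt(self):
        # A final turn with text=None leaves the prompt open
        # for the model to generate the assistant's reply.
        parts = []
        for role, text in self.messages:
            parts.append(f"{role}: {text}" if text is not None else f"{role}:")
        return "\n".join(parts)
```

The key idea is the trailing `(role, None)` turn: the rendered prompt ends with the assistant's role marker, so generation continues from there.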

A multi-turn example is expected in release 1.7.2.

Aquila was pretrained on the Pile, [RedPajama-Data-1T](https://huggingface.co/datasets/togethercomputer/RedPajama-Data-1T), [Wikipedia](https://huggingface.co/datasets/wikipedia), [C4](https://huggingface.co/datasets/c4), the WuDao Chinese dataset, e-books, patents, encyclopedias, forums, GitHub data, and more.