juemifuji issues

Results: 5 issues by juemifuji

Confusion about the Transformer input format in the code

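Hello! In the code, the Transformer is called as TR([tr_input[i], tr_input[i]]), but the function itself is defined as def call(self, inputs, mask=None, training=None, **kwargs), where the mask parameter is required to have either the same shape as tr_input[i] or the shape (batch_size, 1). I'm not sure whether I have missed something. Thanks!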

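### Is there an existing issue for this? - [X] I have searched the existing issues ### Current Behavior Only data parallelism is supported at the moment. Even with 8 A800 GPUs, training efficiency is still very low. Will model parallelism combined with data parallelism be released later? ### Expected Behavior _No response_ ### Steps To Reproduce...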

[BUG/Help] <Will ChatGLM-6B release fine-tuning code under the GLM framework?>

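### Is there an existing issue for this? - [X] I have searched the existing issues ### Current Behavior Hello! The fine-tuning code for the ChatGLM-6B model is currently a separately written codebase. Will fine-tuning code based on the GLM code framework be released later? (ChatGLM-10B was fine-tuned using the GLM framework.) ### Expected Behavior _No response_ ### Steps To Reproduce...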

Model parallelism question

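The chatglm-6b model can now be trained with model parallelism!!! Click the link to see [Chatglm6b_ModelParallel](https://github.com/yuanzhoulvpi2017/zero_nlp/tree/main/Chatglm6b_ModelParallel). In the current version, although the loss decreases during training, the model does not learn anything, and I am still investigating this problem. Has this problem been resolved?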

juemifuji

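Confusion about the Transformer input format in the code

BELLE-0.5M-CLEAN data

[BUG/Help] <Model parallelism question>

[BUG/Help] <Will ChatGLM-6B release fine-tuning code under the GLM framework?>

Model parallelism question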