QIE TANG comments

Results: 4 comments of QIE TANG


[Chatllama] assert _MODEL_PARALLEL_GROUP is not None, "model parallel group is not initialized"

Any update? I encountered the same issue. I simply set the environment variables with two commands: export MP=1 and export WORLD_SIZE=1. I then started training the actor with "fairscale True" in config.yaml.
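For anyone hitting the same assertion: it comes from fairscale's model-parallel module and fires when the model parallel group is requested before fairscale's initialize_model_parallel has been called, which in turn needs an initialized torch.distributed process group. Exporting MP and WORLD_SIZE alone may not be enough. A minimal pre-flight sketch follows, assuming the training script relies on torch.distributed's default env:// initialization (not confirmed for chatllama). The helper name missing_dist_env_vars is hypothetical and is not part of any library.

```python
import os

# Variables read by torch.distributed's default env:// initialization.
REQUIRED_DIST_VARS = ("RANK", "WORLD_SIZE", "MASTER_ADDR", "MASTER_PORT")


def missing_dist_env_vars(env=None):
    """Return the names in REQUIRED_DIST_VARS that are unset or empty in env.

    Hypothetical helper for illustration only; it checks the environment and
    does not initialize anything itself.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_DIST_VARS if not env.get(name)]


if __name__ == "__main__":
    # Assumed environment mirroring the two exports described in the comment.
    reported = {"MP": "1", "WORLD_SIZE": "1"}
    print("missing for env:// init:", missing_dist_env_vars(reported))
```

If variables are reported missing, launching through torchrun (which sets them) or exporting them by hand is one thing to try before looking further into the fairscale setting.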

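Which is the better fit for keeping the original conversational ability while adding dialogue for our own problem handling: LoRA or P-Tuning? I have one more question. #413 says that after P-Tuning fine-tuning the model only supports the current task. When the fine-tuning task is itself dialogue, does the earlier conversational ability still degrade? If I want to keep the original conversational ability and add dialogue for our own problem handling, is LoRA the better choice?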

What batch size is everyone using, and what value do you set for accumulate?

Can QLoRA be used together with DeepSpeed?

Also, the QLoRA training command in the README uses train.py. Is that a mistake? Should it be train_qlora.py?

Can QLoRA be used together with DeepSpeed?

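I asked the transformers team about this. They said DeepSpeed does not currently support 4-bit/8-bit training, so for now only DDP can be used; ZeRO optimization most likely will not work at all.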