yuanzhoulvpi comments

Results: 95 comments of yuanzhoulvpi

[Feature] Created a branch that supports multi-GPU deployment and automatically distributes GPU memory evenly.

I've added single-machine multi-GPU training code. The link is here: https://github.com/yuanzhoulvpi2017/zero_nlp/tree/main/Chatglm6b_ModelParallel
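As a rough illustration of what "distributing evenly" can mean, the sketch below assigns transformer layers to GPUs in contiguous, near-equal blocks. It is not taken from the linked repository: the function name `build_even_device_map` is hypothetical, the layer count of 28 is only an example value, and equal layer counts approximate equal memory only when all layers are the same size (embeddings and the output head are ignored here).

```python
def build_even_device_map(num_layers: int, num_gpus: int) -> dict[int, int]:
    """Map each layer index to a GPU index, spreading layers as evenly as possible."""
    if num_layers <= 0 or num_gpus <= 0:
        raise ValueError("num_layers and num_gpus must be positive")
    base, extra = divmod(num_layers, num_gpus)
    device_map = {}
    layer = 0
    for gpu in range(num_gpus):
        # The first `extra` GPUs each take one additional layer.
        count = base + (1 if gpu < extra else 0)
        for _ in range(count):
            device_map[layer] = gpu
            layer += 1
    return device_map


# Example values only: 28 layers split across 4 GPUs.
device_map = build_even_device_map(num_layers=28, num_gpus=4)
```

Keeping each GPU's layers contiguous means activations cross a device boundary only once per GPU, which is usually what you want for simple model parallelism.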

ggraph.index?

Actually, this error is very easy to fix. Just change `ggraph.index` to `.ggraph.index`. ![image](https://user-images.githubusercontent.com/30610553/76693124-3f870a00-669b-11ea-8036-31c94d3e4966.png)

Loss does not decrease when fine-tuning on a vertical-domain dataset

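It may be because of fp16.
1. With fp16 enabled, I trained for 5 epochs and the loss still did not decrease. The loss also became NaN at times, and the model parameters were not updated.
2. After I set fp16 to False, the loss decreased normally.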
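One plausible mechanism behind these symptoms is the limited range of half precision. The sketch below uses only the standard library to round-trip values through IEEE 754 fp16. It illustrates the number format, not this particular training run; the helper `to_fp16` and the sample values are made up for the example.

```python
import struct


def to_fp16(x: float) -> float:
    """Round-trip a Python float through IEEE 754 half precision."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses values beyond the fp16 range; hardware would give +/-inf.
        return float("inf") if x > 0 else float("-inf")


tiny_grad = to_fp16(1e-8)         # below the smallest fp16 subnormal: flushes to zero
huge_value = to_fp16(70000.0)     # above the fp16 maximum (65504): overflows to inf
nan_value = huge_value - huge_value  # inf - inf yields NaN
scaled_grad = to_fp16(1e-8 * 1024)   # the same gradient survives after scaling up
```

A gradient that rounds to zero leaves the parameters unchanged, and a single overflow can turn the loss into NaN. The last line shows the idea behind loss scaling: multiplying by a constant before the fp16 cast keeps small values representable.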

Loss does not decrease when fine-tuning on a vertical-domain dataset

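I spent about a week with this training framework, and no matter how I trained, the loss would not behave. In the end I wrote my own training framework. I suspect there may be a small problem with the loss in bge's fine-tuning. Here is my repository: https://github.com/yuanzhoulvpi2017/SentenceEmbedding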

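At the moment, this is how I'm doing it:

```python
from peft import LoraConfig, TaskType, get_peft_model

# `model` is assumed to be loaded earlier (not shown in this comment).
peft_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    inference_mode=False,
    r=8,
    lora_alpha=32,
    lora_dropout=0.1,
    # Names of the linear layers to wrap with LoRA adapters
    target_modules=['query_key_value', 'dense', 'dense_h_to_4h', 'dense_4h_to_h'],
)
model = get_peft_model(model, peft_config)
# print_trainable_parameters() prints the summary itself and returns None
model.print_trainable_parameters()
```

Essentially, this uses LoRA to replace the linear layers. The problem I'm running into now is that the model's precision (dtype) does not line up, which is strange:

```bash
RuntimeError: expected scalar type Half but found Float...
```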
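
For reference, the effect of wrapping a linear layer with LoRA can be written as below. This is the standard LoRA formulation rather than anything specific to this repository. The symbols $W_0$, $A$, $B$, $x$, $h$, $d$, and $k$ are notation introduced here, and the scaling assumes PEFT's default $\alpha / r$ rule.

$$
h = W_0 x + \frac{\alpha}{r} B A x, \qquad B \in \mathbb{R}^{d \times r},\; A \in \mathbb{R}^{r \times k}
$$

With `r=8` and `lora_alpha=32`, the scaling factor is $32 / 8 = 4$. Only $A$ and $B$ are trained, while $W_0$ stays frozen.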

Finetune with LoRA

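I'm still modifying the thuglm source code.

✅ The `RuntimeError: expected scalar type Half but found Float` problem is basically solved.
💻 The device I'm using is a `3090`.
🔗 The training method is open source: [https://github.com/yuanzhoulvpi2017/zero_nlp/tree/main/train_thuglm](https://github.com/yuanzhoulvpi2017/zero_nlp/tree/main/train_thuglm)

However, I've run into some problems:
1. The loss does not decrease and is NaN.
2. The ids produced by generate are not in the token mapping table.
3. It works on a 3090, but not necessarily on other GPUs such as the T4. This is a real problem.

These problems are not solved yet. I'll post updates on all of the above in the repository.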
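
When generated ids fall outside the token mapping table, a small check like the following can help locate them before decoding. This is a debugging sketch only: the helper name `find_out_of_vocab_ids` and the sample ids and vocabulary size are hypothetical and not taken from the repository.

```python
def find_out_of_vocab_ids(generated_ids, vocab_size):
    """Return (position, token_id) pairs whose id falls outside [0, vocab_size)."""
    return [
        (pos, tok)
        for pos, tok in enumerate(generated_ids)
        if not 0 <= tok < vocab_size
    ]


# Made-up values, for illustration only.
bad = find_out_of_vocab_ids([5, 99, 100, 7, -1], vocab_size=100)
```

Knowing the positions of the offending ids makes it easier to tell whether the problem starts at the first generated token or only appears later in the sequence.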

I really like this work of yours. The 'shibing624/text2vec-base-chinese' library doesn't seem to get any speedup. Is there a way to fix this?

Speeding up on CPU

AttributeError: 'ChatGLMForConditionalGeneration' object has no attribute 'enable_input_require_grads'