Chen8566

3 comments by Chen8566

Same question: after training with the SFT or LoRA scripts, how do you run chat or inference with the resulting model? It no longer seems possible to use `model.chat`. Does the input need some preprocessing first?

> > Same question: after training with the SFT or LoRA scripts, how do you run chat or inference with the resulting model? It no longer seems possible to use `model.chat`. Does the input need some preprocessing first?
>
> Try the code below:
>
> ```python
> import torch
> from peft import PeftModel
> from transformers import AutoModelForCausalLM, AutoTokenizer
>
> path = '/mnt/ly/project/MiniCPM/models/MiniCPM'  # replace with your base model path
> device = torch.device("cuda:2" if torch.cuda.is_available() else "cpu")
> # ...
> ```
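The preprocessing the question asks about is the chat prompt template that `model.chat` normally applies internally. A minimal sketch of building that prompt by hand before calling `generate()`, assuming MiniCPM's `<用户>`/`<AI>` role-marker format; the commented generation call and any paths are illustrative assumptions, not taken from the thread:

```python
def build_minicpm_prompt(query: str) -> str:
    """Wrap a user query in MiniCPM-style chat role markers.

    model.chat does this wrapping internally; after fine-tuning with the
    SFT/LoRA scripts you can apply it yourself before calling generate().
    """
    return f"<用户>{query}<AI>"

# Example usage with a loaded model/tokenizer (loading as in the snippet
# above; the model name and call here are illustrative assumptions):
# inputs = tokenizer(build_minicpm_prompt("你好"), return_tensors="pt").to(device)
# outputs = model.generate(**inputs, max_new_tokens=128)
# print(tokenizer.decode(outputs[0], skip_special_tokens=True))

print(build_minicpm_prompt("你好"))
```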

> This is the model I trained; its output is also very bad, and I don't know why.
>
> ![image](https://private-user-images.githubusercontent.com/66230782/309649539-45222976-557a-499a-8530-e2cacfbb9da5.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MDk1NzMyMzQsIm5iZiI6MTcwOTU3MjkzNCwicGF0aCI6Ii82NjIzMDc4Mi8zMDk2NDk1MzktNDUyMjI5NzYtNTU3YS00OTlhLTg1MzAtZTJjYWNmYmI5ZGE1LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDAzMDQlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwMzA0VDE3MjIxNFomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTFkY2YxZWViNjViOGMwOTgzYjAzOGE0ODRhY2E3ZDY5OThhMGQzMzRjNGU0OGE1NmY0NThkMDg3MmM4ZDc2OWQmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.6ryTys4O4OVYYKsEPlxJIWFmv35EXbF1G0w0zRfi2JA)
>
> I am running on 4 GPUs with `--per_device_train_batch_size 2`, and the LoRA training results are very poor.