
MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. Trains a medical large language model, implementing continued pretraining (PT), supervised fine-tuning (SFT), RLHF, DPO, and ORPO.

45 MedicalGPT issues

### Describe the Question
Hello, I ran continued pretraining with LoRA on several txt documents, but merging the adapter fails with: `return config_cls(**kwargs) TypeError: LoraConfig.__init__() got an unexpected keyword argument 'layer_replication'`. My peft version is 0.9.0. How can I fix this?

question
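A likely cause is a peft version mismatch: the `layer_replication` field only exists in newer peft releases, so an adapter config saved by a newer peft cannot be parsed by `LoraConfig` in peft 0.9.0. Assuming that is what happened here, either upgrade peft (`pip install -U peft`) or strip the unknown key from the saved `adapter_config.json` before merging; a minimal sketch, with a hypothetical adapter path:

```python
import json
from pathlib import Path

# Hypothetical path to the LoRA adapter produced by continued pretraining.
adapter_dir = Path("outputs-pt-lora")
config_path = adapter_dir / "adapter_config.json"

config = json.loads(config_path.read_text(encoding="utf-8"))
# Drop fields that peft 0.9.0's LoraConfig does not recognize.
config.pop("layer_replication", None)
config_path.write_text(json.dumps(config, indent=2, ensure_ascii=False), encoding="utf-8")
# Merging the adapter with peft 0.9.0 should then no longer raise the TypeError.
```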

### Describe the Question
After expanding the tokenizer vocabulary, can I go straight to SFT? If so, do any parameters need to change?

question
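In general, once the tokenizer has extra tokens, the model's embedding matrix (and tied LM head) must be resized to the new vocabulary size before any fine-tuning; whether additional script parameters are needed depends on the training script itself. A minimal sketch of the resize step with the Hugging Face API, using placeholder paths:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder paths: the base model plus a tokenizer that already contains the added tokens.
model = AutoModelForCausalLM.from_pretrained("path/to/base-model")
tokenizer = AutoTokenizer.from_pretrained("path/to/extended-tokenizer")

# Resize the input embeddings (and tied LM head) to match the enlarged vocabulary.
model.resize_token_embeddings(len(tokenizer))

model.save_pretrained("path/to/resized-model")
tokenizer.save_pretrained("path/to/resized-model")
# SFT can then start from "path/to/resized-model".
```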

During continued pretraining I used roughly 100k records of company profiles and business-scope descriptions (all plain domain text), but the loss keeps slowly increasing during training. Have you run into this? Launch command: CUDA_VISIBLE_DEVICES=0 torchrun --nproc_per_node 1 pretraining.py --model_type chatglm --model_name_or_path ../chatglm3-6b-32k --train_file_dir ./data/pretrain --per_device_train_batch_size 8 --per_device_eval_batch_size 8 --do_train --do_eval --use_peft True --seed 42 --max_train_samples -1 --max_eval_samples -1 --num_train_epochs 3 --learning_rate...

question
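One common first check when continued-pretraining loss drifts upward is the learning-rate schedule: a peak rate that is too high for a LoRA run on a small domain corpus, or training without warmup, can push the loss up over time. Purely as an illustration (not the repo's recommended settings), the relevant Hugging Face `TrainingArguments` knobs are:

```python
from transformers import TrainingArguments

# Illustrative values only; tune them for your own data and hardware.
args = TrainingArguments(
    output_dir="outputs-pt-lora",
    per_device_train_batch_size=8,
    num_train_epochs=3,
    learning_rate=2e-5,           # try a smaller peak LR if the loss keeps climbing
    warmup_ratio=0.05,            # warm up before reaching the peak LR
    lr_scheduler_type="cosine",   # decay instead of holding the peak LR
    logging_steps=10,
)
```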

### Describe the Question
Have you done a comparison of results with and without continued pretraining? How is the effect of continued pretraining evaluated?

question
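A common way to quantify the effect of continued pretraining is perplexity on held-out domain text: evaluate the base model and the continued-pretrained (merged) model on the same eval texts and compare. A minimal sketch with placeholder paths and texts:

```python
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def perplexity(model_path, texts):
    """Average perplexity of the model at `model_path` over held-out texts."""
    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForCausalLM.from_pretrained(model_path)
    model.eval()
    losses = []
    with torch.no_grad():
        for text in texts:
            enc = tokenizer(text, return_tensors="pt", truncation=True, max_length=1024)
            out = model(**enc, labels=enc["input_ids"])
            losses.append(out.loss.item())
    return math.exp(sum(losses) / len(losses))

# Held-out domain paragraphs (placeholders) -- use the same set for both models.
held_out = ["some held-out medical paragraph ...", "another paragraph ..."]
print("base model perplexity:", perplexity("path/to/base-model", held_out))
print("after PT perplexity:  ", perplexity("path/to/pt-merged-model", held_out))
```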

Traceback (most recent call last):
  File "F:\xiazai\MedicalGPT-main\reward_modeling.py", line 653, in <module>
    main()
  File "F:\xiazai\MedicalGPT-main\reward_modeling.py", line 447, in main
    model = get_peft_model(model, peft_config)
  File "C:\Users\admin\.conda\envs\newrlhf\lib\site-packages\peft\mapping.py", line 136, in get_peft_model
    return MODEL_TYPE_TO_PEFT_MODEL_MAPPING[peft_config.task_type](model, peft_config,...

bug

Traceback (most recent call last):
  File "F:\xiazai\MedicalGPT-main\pretraining.py", line 781, in <module>
    main()
  File "F:\xiazai\MedicalGPT-main\pretraining.py", line 722, in main
    trainer = SavePeftModelTrainer(
  File "C:\Users\admin\.conda\envs\newrlhf\lib\site-packages\transformers\trainer.py", line 489, in __init__
    self._move_model_to_device(model, args.device)
  File "C:\Users\admin\.conda\envs\newrlhf\lib\site-packages\transformers\trainer.py",...

bug

![image](https://github.com/shibing624/MedicalGPT/assets/22641510/a6f8ce2f-d7cb-4b31-b016-26ba33a946b3) ![image](https://github.com/shibing624/MedicalGPT/assets/22641510/1dff598a-00e6-41b7-b334-09707b751ca5) Training stays stuck at 0, but GPU memory utilization is full; not sure why.

question

### Describe the Question
1. For the PT stage with a fairly large base model (Yi-67B), which multi-node multi-GPU training approach works best? Does the code support it?
2. Is DeepSpeed ZeRO-1 supported? How would I change the config? It looks like only ZeRO-2 and ZeRO-3 are provided.
3. For long-text training, is setting --group_by_text True enough? How long counts as long, and does block_size still take effect in that case?
4. What does block_size actually do?
Looking forward to your reply, many thanks!

question
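On point 2: whether a ready-made ZeRO-1 file ships with the repo I can't confirm, but with the Hugging Face Trainer the ZeRO stage is just a field in the DeepSpeed config, so a ZeRO-1 variant can be passed the same way as the existing ZeRO-2/ZeRO-3 configs. On point 4: in the usual run_clm-style preprocessing, block_size is the fixed chunk length (in tokens) that the concatenated, tokenized corpus is cut into, i.e. the effective training sequence length. A hedged sketch of a minimal ZeRO-1 config (the file name is just an example):

```python
import json

# Minimal DeepSpeed ZeRO-1 config; "auto" lets the HF Trainer fill in values
# from its own TrainingArguments.
zero1_config = {
    "zero_optimization": {"stage": 1},
    "bf16": {"enabled": "auto"},
    "gradient_accumulation_steps": "auto",
    "train_micro_batch_size_per_gpu": "auto",
    "train_batch_size": "auto",
}
with open("deepspeed_zero1_config.json", "w", encoding="utf-8") as f:
    json.dump(zero1_config, f, indent=2)

# Assuming the training script exposes the standard HF TrainingArguments,
# it can then be enabled with: --deepspeed deepspeed_zero1_config.json
```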

If I want to treat each QA pair as its own sample, instead of concatenating everything and splitting it into blocks, what should I change here?

question
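The usual change for this is in the dataset preprocessing: skip the concatenate-then-chunk (group_texts-style) step and tokenize each QA record on its own with truncation and padding to a fixed length. A hedged sketch, assuming a `text` column that holds one QA pair per row and an arbitrary maximum length:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("path/to/base-model")  # placeholder path
MAX_LENGTH = 512  # assumed per-sample length

def tokenize_per_sample(examples):
    """Tokenize each QA record independently instead of concatenating and chunking."""
    enc = tokenizer(
        examples["text"],          # assumed column: one QA pair per row
        truncation=True,
        max_length=MAX_LENGTH,
        padding="max_length",
    )
    # Causal-LM labels mirror the input ids; mask padding so it adds no loss.
    enc["labels"] = [
        [tok if tok != tokenizer.pad_token_id else -100 for tok in ids]
        for ids in enc["input_ids"]
    ]
    return enc

# Typical usage with a datasets.Dataset:
# dataset = dataset.map(tokenize_per_sample, batched=True, remove_columns=dataset.column_names)
```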