grantchenhuarong

34 comments by grantchenhuarong

Changed the corpus to imitate the medical single-instruction fine-tuning and ran only 10 steps (10 × 128 × 4 / 347 > 10 epochs).
![image](https://github.com/Facico/Chinese-Vicuna/assets/44857880/17b54e92-1009-4ee4-a4a5-30c7641f2139)
Test results:
![image](https://github.com/Facico/Chinese-Vicuna/assets/44857880/5c7cf9c7-51bc-46c3-9dd4-814e8c8607a5)
![image](https://github.com/Facico/Chinese-Vicuna/assets/44857880/a125ee9f-1681-4183-aaf3-1081ebdfb3e0)
![image](https://github.com/Facico/Chinese-Vicuna/assets/44857880/be3cf2df-2d27-479e-85d8-0772e9486679)
![image](https://github.com/Facico/Chinese-Vicuna/assets/44857880/0253e38b-f45e-402f-bb9b-83fde2613c53)
![image](https://github.com/Facico/Chinese-Vicuna/assets/44857880/d148e0a1-5372-4019-b52b-c1c6d7cde9a0)
![image](https://github.com/Facico/Chinese-Vicuna/assets/44857880/25574c7a-ef8d-4702-a4e9-80a72bf54389)
Basically, none of the training corpus content shows up in the outputs... I'm a bit confused. Can LoRA only act as a primer that draws out data already inside the underlying LLaMA base model? Is whatever the originally trained base model produces by text continuation really that hard to steer with a limited amount of newly added corpus? If so, for a private vertical-domain knowledge model, is the only option to train a base model from scratch? That would be no small undertaking...
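On the step arithmetic above, here is a minimal sketch of the conversion between logged steps and passes over the dataset. The `samples_per_step` value is an assumption: whether a logged step consumes one micro batch or one full effective batch depends on how the training script counts steps.

```python
# Minimal sketch (not from the repo): relate logged steps to passes over the dataset.
def epochs_covered(steps: int, samples_per_step: int, dataset_size: int) -> float:
    """Each logged step is assumed to consume `samples_per_step` training samples."""
    return steps * samples_per_step / dataset_size

# Using the numbers from the comment above: 10 steps, 347 poems.
print(epochs_covered(10, 128 * 4, 347))  # ~14.8 passes if a step really consumes 128*4 samples
print(epochs_covered(10, 128, 347))      # ~3.7 passes if a step consumes one effective batch of 128
```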

But that medical example of yours really makes me envious. Do I have to put together 200k samples and run it again? Actually, that's doable: I can switch to a different single instruction that directly asks which poem a verse comes from, which should let me generate a lot more data. Is there anything else I should watch out for? I'm about to wear out my 2080 Ti.

Built a 200k-sample dataset as shown below.
![image](https://github.com/Facico/Chinese-Vicuna/assets/44857880/b14f5ed1-1592-402f-b513-b5df9635ef57)
About to start another training run... wish me luck.
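For reference, a rough sketch of how such "which poem is this verse from" records could be generated, assuming the alpaca-style `{"instruction", "input", "output"}` JSON schema that the repo's sample data follows. The field wording, file name, and the `poems` list are illustrative, not the script that produced the 200k samples above.

```python
import json

# Illustrative source data; the real 200k-sample set would come from a full poem corpus.
poems = [
    {"title": "静夜思", "author": "李白",
     "lines": ["床前明月光", "疑是地上霜", "举头望明月", "低头思故乡"]},
]

records = []
for poem in poems:
    for line in poem["lines"]:
        records.append({
            "instruction": "请问下面这句诗出自哪首诗?作者是谁?",
            "input": line,
            "output": f"这句诗出自{poem['author']}的《{poem['title']}》。",
        })

# One record per verse, alpaca-style, ready for the fine-tuning script.
with open("poem_source_data.json", "w", encoding="utf-8") as f:
    json.dump(records, f, ensure_ascii=False, indent=2)
```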

Thanks for the reply. A few questions:
1. When you say 20-odd epochs, you mean full passes over the entire dataset rather than steps, right?
2. My batch_size is the default 128 and my micro batch size the default 4. How small do I need to make them? (See the sketch after this list.)
3. Also, did you run with the parameters from continue.sh or from others_continue.sh?
4. For the data you used, did you construct it as single-instruction fine-tuning? Could you share how the corpus was built?
5. When you say it had some effect, do you mean the model can reproduce the corpus content as responses to the instruction? How closely does it match?
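On question 2, the sketch below shows the usual alpaca-lora style relationship between these knobs; this is an assumption about how the finetune script wires them together, not something verified against it.

```python
# Sketch of the common convention (assumed, not confirmed for this repo):
BATCH_SIZE = 128      # effective samples consumed per optimizer step
MICRO_BATCH_SIZE = 4  # samples per forward/backward pass that must fit in GPU memory
GRADIENT_ACCUMULATION_STEPS = BATCH_SIZE // MICRO_BATCH_SIZE  # = 32

# On a 2080 Ti, shrinking MICRO_BATCH_SIZE (to 2 or 1) is what lowers peak memory;
# keeping BATCH_SIZE at 128 just raises the accumulation steps, so the optimizer
# still sees the same effective batch.
```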

Did you also train starting from the checkpoint-11600 LoRA model, the one from those two epochs? I'll try switching to your corpus structure as well and train on the 347 poems.
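A minimal sketch of what "continue from the checkpoint-11600 LoRA" could look like with peft, assuming the usual LLaMA-7B base; the adapter directory path here is illustrative.

```python
from transformers import LlamaForCausalLM
from peft import PeftModel

# Load the frozen base model, then attach the existing adapter in trainable mode so
# further fine-tuning updates the same LoRA weights instead of starting a fresh adapter.
base = LlamaForCausalLM.from_pretrained("decapoda-research/llama-7b-hf")
model = PeftModel.from_pretrained(base, "lora-Vicuna/checkpoint-11600", is_trainable=True)
# `model` can then be handed to the usual Trainer setup with the new poetry data.
```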

Still grinding away...
![image](https://github.com/Facico/Chinese-Vicuna/assets/44857880/caf42262-6dd4-41fe-9df6-bc0ee53c248d)

The run crashed at step 11904; resuming it from the checkpoint.
![image](https://github.com/Facico/Chinese-Vicuna/assets/44857880/9a3b45c9-4947-483a-81ae-020d6cbf0496)
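For the crash-and-resume part, a sketch of how resuming usually works with the Hugging Face Trainer, assuming the finetune script builds a standard `Trainer` and writes checkpoints into its output directory; the function and variable names are placeholders.

```python
from typing import Optional
from transformers import Trainer

def resume_after_crash(trainer: Trainer, checkpoint: Optional[str] = None) -> None:
    # resume_from_checkpoint=True tells Trainer to pick the newest checkpoint in
    # output_dir and restore model, optimizer and scheduler state; passing a path
    # such as "output/checkpoint-11904" resumes from that specific step instead.
    trainer.train(resume_from_checkpoint=checkpoint or True)
```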

After 60 epochs of training the results really did come through. Thanks for the careful guidance.
![image](https://github.com/Facico/Chinese-Vicuna/assets/44857880/0cc5ec4e-cadd-4cf1-99ba-0c517c435239)
![image](https://github.com/Facico/Chinese-Vicuna/assets/44857880/37609385-7c64-4496-a184-33898ec04a7a)
![image](https://github.com/Facico/Chinese-Vicuna/assets/44857880/5ea52dfc-4daf-42bb-831f-69e638699f83)
![image](https://github.com/Facico/Chinese-Vicuna/assets/44857880/c9e4f4f2-409a-4da3-9065-6db1d055a074)
![image](https://github.com/Facico/Chinese-Vicuna/assets/44857880/01f91356-4451-48a4-a058-50e80595fb9a)
The effect is definitely there, so this approach does work.
One more thing: how are extremely long poems usually handled, such as Qu Yuan's《离骚》(Li Sao)? Even raising the cutoff from 256 to 2048 probably wouldn't be enough.
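On the long-poem question, one common workaround (a sketch under assumptions, not the author's answer) is to split a work like《离骚》into overlapping token chunks so that each training sample stays under the cutoff length; ideally the split points would follow stanza boundaries rather than raw token counts.

```python
from transformers import LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained("decapoda-research/llama-7b-hf")

def chunk_poem(text: str, max_tokens: int = 256, overlap: int = 32) -> list[str]:
    """Split a long poem into overlapping pieces that each fit the cutoff length."""
    ids = tokenizer(text, add_special_tokens=False)["input_ids"]
    step = max_tokens - overlap
    return [tokenizer.decode(ids[i:i + max_tokens]) for i in range(0, len(ids), step)]
```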

Trying to remove the instruction field:
![image](https://github.com/Facico/Chinese-Vicuna/assets/44857880/2076e3b2-6f4e-40e3-af46-2eb7fd51263d)
![image](https://github.com/Facico/Chinese-Vicuna/assets/44857880/2d8c7cea-d74d-48db-88ff-a21d3836485b)
![image](https://github.com/Facico/Chinese-Vicuna/assets/44857880/2d968774-bc3a-47f1-8010-5baed06df607)
![image](https://github.com/Facico/Chinese-Vicuna/assets/44857880/f5062294-14d4-41af-9e95-20cfb74a8fca)
![image](https://github.com/Facico/Chinese-Vicuna/assets/44857880/f99789ff-2669-4ee2-86c7-1b16487daeca)
![image](https://github.com/Facico/Chinese-Vicuna/assets/44857880/1d88df3d-9cea-476d-b14e-2025473ad76e)
![image](https://github.com/Facico/Chinese-Vicuna/assets/44857880/d7da68c1-bf28-4eea-ac72-da648b156395)
![image](https://github.com/Facico/Chinese-Vicuna/assets/44857880/6e9c128d-953d-4381-8bd3-37b07f5c0a06)
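For this "remove the instruction" experiment, the two prompt layouts being compared look roughly like the sketch below, assuming the alpaca-style template the repo's sample data follows; the exact template wording is an assumption.

```python
def prompt_with_instruction(instruction: str, inp: str) -> str:
    return (
        "Below is an instruction that describes a task, paired with an input.\n\n"
        f"### Instruction:\n{instruction}\n\n### Input:\n{inp}\n\n### Response:\n"
    )

def prompt_without_instruction(inp: str) -> str:
    # With the instruction field dropped, the model only sees the raw input, so it has
    # to infer the task ("name the poem this verse comes from") from fine-tuning alone.
    return f"### Input:\n{inp}\n\n### Response:\n"
```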

@Facico Thanks again. I'm just getting started on this fine-tuning journey and hope to keep up with you.