Jiaxin Shan

Results 742 comments of Jiaxin Shan

@Kolhax I don't think there's a ready-to-use SDK. I feel it won't be hard to build an SDK on top of the REST API on your own.

@AngainorDev Thanks for the feedback. I will give it a try and report back later.

Just a comment it eventually times out. @zhisbug A quick question, how did vicuna workaround this issue in the past and successfully save the weights? ``` 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 9/9 [01:54

I can pick this issue up and add common parameters.

It seems FSDP support for parameter-efficient training is still under development. The author of the LoRA support suggests using DeepSpeed for now. See https://github.com/lm-sys/FastChat/pull/138#issuecomment-1495289110 for more details.

@mahlernim Thanks for the reply. The tricky thing is that they share some samples, but those examples don't seem to be enough to calculate the accuracy (91.25% vs 87.5%). I think...

The self-instruct code is definitely not open sourced yet. You can send me an email and we can discuss the details if you are interested. @cquliujian @FanWan

If anyone is familiar with the ChatGLM model architecture, feel free to help on #625. I am new to the transformer architecture and not sure whether my changes are correct.

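As part of the prompt, does this effectively amount to two steps? 1. Decide whether the reference can be used. 2. If it is not used, answer directly from the model's own knowledge? BTW, this paper contains too little information.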