Junyang Lin

173 comments of Junyang Lin

> Hi @JustinLin610, Can you release the bash file for OFA huge training with all the params if possible?

Sorry for my late response. You mean the finetuning parameters for OFA...

> A quick survey: do you enable flash-attention during training or inference? If it is enabled, the code should actually be faster than the old version, because in v1.1 the flash-attention computation removes padding. Also, without flash attention the v1.1 code can indeed be somewhat slower than before, because we now cast attn_weights to fp32 before computing softmax, which reduces precision loss: [softmax](https://huggingface.co/Qwen/Qwen-7B-Chat/blob/119ea939362a6311dc2450511e59e43cb5a5073c/modeling_qwen.py#L349-L351)

So the problem is mainly in flash attn, then?
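For reference, a minimal sketch of the fp32-softmax pattern described above (assuming a PyTorch attention implementation along the lines of the linked modeling_qwen.py; the function name is illustrative, not the actual code):

```python
import torch
import torch.nn.functional as F

def softmax_in_fp32(attn_weights: torch.Tensor) -> torch.Tensor:
    """Cast attention scores to fp32 before softmax to reduce precision loss,
    then cast the probabilities back to the original dtype (e.g. fp16/bf16)."""
    orig_dtype = attn_weights.dtype
    attn_probs = F.softmax(attn_weights.float(), dim=-1)
    return attn_probs.to(orig_dtype)
```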

> > > We optimized the code for speed, with an improvement of more than 30% over the previous version (w & w/o flash attention). The HF repo has already been updated, and ModelScope will be synced a bit later. Please update to the latest code and give it a try (torch 2.0 or above is recommended for testing).
> >
> > I have synced the latest HF files and code. On Linux with a V100 GPU, CUDA=11.7, pytorch=2.0.1, python=3.10, Transformers=4.33.1, calling model.chat_stream gives 7.8 Chinese characters/s on a single GPU, but only 2.2 characters/s on multiple GPUs. What could be the cause?
>
> Could you share your code? We will look into the issue.
>
> ...
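As a rough way to reproduce the characters-per-second numbers quoted above, here is a hedged sketch (assuming Qwen's remote-code `chat_stream(tokenizer, query, history=...)` generator interface; the model name and prompt are just examples):

```python
import time
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen-7B-Chat", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen-7B-Chat", device_map="auto", trust_remote_code=True
).eval()

query = "请介绍一下通义千问。"
start = time.time()
response = ""
# chat_stream yields the decoded response generated so far at each step
for response in model.chat_stream(tokenizer, query, history=None):
    pass
elapsed = time.time() - start
print(f"{len(response) / elapsed:.1f} chars/s")
```

Note that `device_map="auto"` shards the model across all visible GPUs, which is the single- vs multi-GPU setting being compared in the report above.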

What is it based on, and how was it done?

OK, thanks for the feedback. My guess is that finetuning still has room to improve the model's adaptation to the langchain format.

We have released the multi-task finetuned checkpoints. See [checkpoints_cn.md](checkpoints_cn.md) and the [scripts](https://github.com/OFA-Sys/OFA/tree/main/run_scripts/ocr).

Any more details? You can first check whether you can reproduce the performance by following our instructions for finetuning on COCO captions. If everything is alright, you can check your code...

We still have not released that data, but if I can resolve some copyright issues I will make it public.

Yes, you need to set `--bpe=bert`, because we use the BERT tokenizer for Chinese, and there is no encoder.json.
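To illustrate why no encoder.json is needed in that case, here is a small sketch using the Hugging Face BERT tokenizer for Chinese (the checkpoint name is just an example; OFA's own `--bpe=bert` wiring may differ in detail):

```python
from transformers import BertTokenizer

# A BERT-style tokenizer ships a vocab.txt (WordPiece) instead of the
# GPT-2 BPE files (encoder.json + vocab.bpe), so no encoder.json is required.
tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
print(tokenizer.tokenize("一只猫坐在垫子上"))
```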