HuangChiEn

Results 42 comments of HuangChiEn

> Can you achieve it? If not, I don't think it would be a good general-purpose model for dealing with multi-modal data. To be honest, we should agree with this comment (even this...

> These are interesting research questions. It depends on the application and data. Many tasks nowadays don't need explicit cross-attention or encoder-decoders, and instead can leverage simpler strategies like concatenating...
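The "simpler strategies like concatenating" point above can be sketched in a few lines: late fusion by concatenating per-modality features and projecting them, instead of cross-attention. All names, shapes, and the toy projection below are hypothetical, not from any specific model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical pooled embeddings for a batch of 2 samples.
img_feat = rng.standard_normal((2, 8))   # vision features
txt_feat = rng.standard_normal((2, 16))  # text features

# Simple multi-modal fusion: concatenate along the feature axis,
# then apply one toy linear projection (no cross-attention needed).
fused = np.concatenate([img_feat, txt_feat], axis=-1)  # shape (2, 24)
W = rng.standard_normal((24, 4)) * 0.1                 # toy projection matrix
logits = fused @ W                                     # shape (2, 4)

print(fused.shape, logits.shape)  # (2, 24) (2, 4)
```

Whether this beats explicit cross-attention is exactly the task- and data-dependent question raised in the quoted comment.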

> Didn't you see the training data? The GT that LLaVA is trained to output is "Sure, it's the ." It was trained to say it like that,...

> I don't think you need to dwell on this question. If you want the results shown in the demo, consider looking into LISA++; LISA++ handles the demo tasks well, and its dialogue is more natural.

I found the paper you mentioned, but I can't find its GitHub. You said LISA++ handles the demo tasks well; do you know where to get the LISA++ GitHub (source code and reproduced weights)?

Then let's wait and see whether the LISA authors give any comment ~

Additional note (11/06): we have also tested a different checkpoint, "xinlai/LISA-13B-llama2-v1-explanatory". I hadn't noticed that the term "explanatory" may denote the new feature I requested in this thread. Note...

> @HuangChiEn Hi~ I have been working on reproducing LISA recently. Have you noticed [Issue #162](https://github.com/dvlab-research/LISA/issues/162)? I see that you are all paying attention to the reproduction of reason seg, but...

LambdaNetworks: not exactly the same, but it reduces the computational cost: https://arxiv.org/pdf/2102.08602
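The reason lambda layers are cheaper than attention is that the context is summarized into a small k×v "lambda" matrix rather than an n×m attention map. A minimal NumPy sketch of just the content-lambda path (shapes are hypothetical; the real layer also has a position lambda and multiple heads):

```python
import numpy as np

def softmax(x, axis):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
n, m, k, v = 6, 10, 4, 5           # query len, context len, key dim, value dim
Q = rng.standard_normal((n, k))
K = rng.standard_normal((m, k))
V = rng.standard_normal((m, v))

# Content lambda: normalize keys over the context, then summarize the
# context into a small (k, v) matrix -- no (n, m) attention map is formed.
lam_c = softmax(K, axis=0).T @ V   # shape (k, v)
Y = Q @ lam_c                      # shape (n, v)

print(Y.shape)  # (6, 5)
```

The cost scales with n·k·v + m·k·v instead of n·m, which is where the resource saving comes from.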

> Yes, the xdg-user-dirs is not the problem here. > > Since I am using 9.14 (or 9.15) I thought no additional steps are needed: > > > Starting from...

Would OFT help in your case? https://huggingface.co/docs/peft/conceptual_guides/oft It might be better than training with LoRA twice; maybe on the second pass you can apply OFT, but I'm not sure whether peft fully supports it without any...
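For context on why OFT differs from stacking LoRA updates: instead of adding a low-rank delta, OFT multiplies the frozen pretrained weight by a learned orthogonal matrix, which preserves weight norms. A minimal NumPy sketch of the idea using the Cayley parametrization (sizes are hypothetical; this is an illustration of the math, not the actual peft API):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 6

# The learnable parameter is a skew-symmetric matrix S.
A = rng.standard_normal((d, d))
S = A - A.T

# Cayley parametrization: R = (I + S)^{-1} (I - S) is orthogonal
# by construction for any skew-symmetric S.
I = np.eye(d)
R = np.linalg.solve(I + S, I - S)

# OFT-style update: rotate the frozen pretrained weight
# instead of adding a LoRA delta to it.
W_pre = rng.standard_normal((d, d))
W_new = R @ W_pre

print(np.allclose(R.T @ R, I))  # True: R is orthogonal
```

Because R is orthogonal, the transformation preserves the pairwise angles/norms of the pretrained weight columns, which is the property the OFT guide emphasizes.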