zhangye0402 issues

Results: 5 issues of zhangye0402

RuntimeError : Error(s) in loading state_dict for LlavaLlamaForCausalLM

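Above is my modified v1.5/finetune_lora.sh file. After fine-tuning with this v1.5/finetune_lora.sh, I get the error shown in the image when running inference; the inference code is shown in the image below. Can anyone tell me what I did wrong? I have seen similar questions in the issues section. Alternatively, could the authors provide a guide for further fine-tuning on a custom dataset? Please help. I also have one more question: if I fine-tune on image-only data for a specific task, will that improve my results on a video dataset for the same task? After all, image data is easier to collect.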
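The error text itself is not reproduced in this preview, so the cause is unknown. As a generic first diagnostic, and not a confirmed fix for this issue, one can compare the parameter names the model expects with the names stored in the checkpoint, which are the two groups PyTorch lists as "Missing key(s)" and "Unexpected key(s)" in this kind of error. The helper and the key names below are hypothetical and for illustration only:

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists for a model/checkpoint pair.

    missing:    keys the model expects but the checkpoint lacks
    unexpected: keys the checkpoint has but the model does not know
    """
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


# Hypothetical key names, chosen only to illustrate a naming mismatch.
model_keys = [
    "model.layers.0.self_attn.q_proj.weight",
    "lm_head.weight",
]
checkpoint_keys = [
    "base_model.model.model.layers.0.self_attn.q_proj.lora_A.weight",
    "lm_head.weight",
]

missing, unexpected = diff_state_dict_keys(model_keys, checkpoint_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```

In practice the two inputs would come from `model.state_dict().keys()` and the keys of the loaded checkpoint dictionary.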

v1_2 multi-image fine-tuning question

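A very impressive model. I have two main questions: 1. The first is the one in the title: does version 1.2 support fine-tuning with multi-image input? Specifically, is it feasible to fine-tune with multiple images and question-answer pairs given in the first round of the conversation? Roughly like this: { "id": 0, "image": "images/5.png","images/6.png","images/7.png", "conversations": [ { "from": "human", "value": "\nQuestion for the first round" }, { "from": "gpt", "value": "Answer for the first round" }, { "from": "human", "value": "Question for the second round" }, { "from": "gpt",...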
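As written, the record in this question is not valid JSON, because the "image" key is followed by three comma-separated strings. A minimal sketch of the presumably intended record, assuming the three paths are meant as a list, is shown below. Whether v1_2 fine-tuning accepts a list here is exactly the open question and is not confirmed by anything in this thread; the helper name is hypothetical.

```python
import json


def make_multi_image_record(record_id, image_paths, turns):
    """Build one training record with alternating human/gpt turns.

    Assumption: multiple images are expressed as a JSON list under "image".
    """
    conversations = []
    for i, text in enumerate(turns):
        speaker = "human" if i % 2 == 0 else "gpt"
        conversations.append({"from": speaker, "value": text})
    return {
        "id": record_id,
        "image": list(image_paths),
        "conversations": conversations,
    }


record = make_multi_image_record(
    0,
    ["images/5.png", "images/6.png", "images/7.png"],
    [
        "\nQuestion for the first round",
        "Answer for the first round",
        "Question for the second round",
        "Answer for the second round",
    ],
)

# Serializing through json.dumps guarantees the output is well-formed JSON.
encoded = json.dumps(record, ensure_ascii=False, indent=2)
print(encoded)
```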

Maximum resolution for multi-image inference

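In the official demo, image1 and image2 from the example folder are used, and multi-image conversation works without problems. However, if I simply replace image1 and image2 with image4 and image5, the model recognizes only one of the images. Checking the resolutions, I found that image4 is 1920x1200, whereas image2 at 1000x1000 causes no problem. What is the per-image resolution limit for multi-image conversation? And what is the maximum number of input images the model supports? @czczup Thanks! ^_^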

[BUG] <title>The model cannot correctly determine the order of the input images

### Is there an existing issue / discussion for this? - [X] I have searched the existing issues / discussions ### Is this question answered in the FAQ? | Is there an...

About the Processing of Trajectory Data

Could you kindly provide a detailed explanation of how the trajectory data is processed? Specifically, could you share more details about the process of discretizing the trajectory data? Thank you...