Feng Li comments

Results: 119 comments of Feng Li

Question regarding multi image inference - import vs demo

Hi, are you using the llava-next-interleave model or the original single-image model?

Question regarding multi image inference - import vs demo

Hi, `(img_num, 3, 384, 384)` works for our model in the multi-image setting. `(img_num, k, 3, 384, 384)` also works for our model when processing anyres single-image inputs.
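To make the two layouts concrete, here is a minimal sketch in plain Python that tells them apart by rank. The helper `describe_image_shape` is hypothetical and not part of the LLaVA-NeXT codebase, and reading `k` as the number of anyres crops per image is an assumption; the comment above does not define it.

```python
# Hypothetical helper, not from the LLaVA-NeXT repository: classify an image
# tensor shape according to the two layouts mentioned in the comment above.
EXPECTED_CHW = (3, 384, 384)  # channels, height, width quoted in the comment


def describe_image_shape(shape):
    """Return a short description of which input layout `shape` matches."""
    shape = tuple(shape)
    if shape[-3:] != EXPECTED_CHW:
        raise ValueError(f"expected trailing dims {EXPECTED_CHW}, got {shape[-3:]}")
    if len(shape) == 4:
        # (img_num, 3, 384, 384): one entry per image
        return f"multi-image: {shape[0]} images"
    if len(shape) == 5:
        # (img_num, k, 3, 384, 384): assumed to mean k anyres crops per image
        return f"anyres: {shape[0]} image(s) x {shape[1]} patches"
    raise ValueError(f"unsupported rank {len(shape)} for shape {shape}")


multi = describe_image_shape((4, 3, 384, 384))
anyres = describe_image_shape((1, 5, 3, 384, 384))
```

A check like this can be run on a tensor's `.shape` before it is passed to the model, to catch a mismatched layout early.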

process multi images

Hi, the LLaVA-Next-Interleave version is out; it natively supports multi-image interleaved inputs. Please refer to [this evaluation code](https://github.com/LLaVA-VL/LLaVA-NeXT/blob/inference/llava/eval/model_vqa.py) for the input format. It can directly handle the input format you provide....

Is there a PyTorch version?

I believe this version is the PyTorch version.

Memory allocation problem

Sorry for the late reply. How much memory do you need in your case? We use about 30 GB for ResNet-50 with batch size 4.

Eval results

Hi, thanks for this question. You are using the first version of the eval json. We have updated the evaluation json. Please download a new one from our website. LMK...

Eval results

How about the 7B results? Do they match the results in the table?

Eval results

Hi, thanks for your feedback. We found that the previous 7B model was not our best model; we open-sourced the wrong one. You can download our new 7B...

How to export maskdino into onnx format?

We have not tried ONNX export before, so we cannot offer any suggestions at the moment.

Is there any provision for training models with gradient accumulation?

Sorry, we did not implement that. You are welcome to open a PR if you have any ideas.
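For readers unfamiliar with the technique asked about here, the following is a minimal sketch of gradient accumulation in plain Python, using a one-parameter least-squares model. It is not code from the repository discussed in this thread, and every name in it is illustrative. It shows that summing scaled micro-batch gradients before a single update gives the same step as one large batch, provided the micro-batches are of equal size.

```python
# Illustrative sketch only: gradient accumulation on the model y = w * x.
# None of these names come from an existing codebase.


def grad_mse(w, batch):
    """Gradient of mean((w * x - y) ** 2) over `batch` with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def train_step_accumulated(w, micro_batches, lr):
    """Apply one update using a gradient accumulated over micro-batches."""
    accumulated = 0.0
    for batch in micro_batches:
        # Divide by the number of micro-batches so the accumulated value
        # equals the mean gradient over all samples (equal-size batches).
        accumulated += grad_mse(w, batch) / len(micro_batches)
    # Single parameter update after all micro-batches have been processed.
    return w - lr * accumulated


data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
micro_batches = [data[:2], data[2:]]

w_full = 0.0 - 0.01 * grad_mse(0.0, data)                      # one large batch
w_accum = train_step_accumulated(0.0, micro_batches, lr=0.01)  # two micro-batches
```

In PyTorch the same pattern usually relies on `loss.backward()` adding into each parameter's `.grad` across calls, with `optimizer.step()` and `optimizer.zero_grad()` invoked once every few micro-batches and the loss divided by the number of accumulation steps.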