AmazDeng

16 issues by AmazDeng

### Question
Hi, authors, thank you for your great contribution. I've noticed that during the pretraining phase, the `preprocess_plain` method is used. This method discards the question part and directly concatenates...
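For context, the concatenation pattern described above can be sketched as follows. This is a minimal reconstruction of the idea only; `DEFAULT_IMAGE_TOKEN` and the `value` field names are borrowed from the LLaVA codebase, and the separator is an assumption:

```python
# Minimal sketch of the preprocess_plain idea: the human turn is reduced to
# just the image token, so pretraining only supervises the caption/answer.
DEFAULT_IMAGE_TOKEN = "<image>"
SEP = "\n"  # assumption; the real code appends the conversation template's sep

def preprocess_plain(sources):
    conversations = []
    for source in sources:
        human_turn, gpt_turn = source[0], source[1]
        # Discard the question text entirely, keeping only the image token.
        human_turn["value"] = DEFAULT_IMAGE_TOKEN
        conversations.append(human_turn["value"] + gpt_turn["value"] + SEP)
    return conversations

# Usage: each source is a two-turn (human, gpt) conversation.
print(preprocess_plain([[{"value": "<image>\nWhat is this?"}, {"value": "A cat."}]]))
# -> ['<image>A cat.\n']
```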

### Your current environment
```
Collecting environment information...
PyTorch version: 2.1.2+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Ubuntu 20.04.6...
```

bug

### Checklist
- [X] 1. I have searched related issues but cannot get the expected help.
- [ ] 2. The bug has not been fixed in the latest version....

Hello developers, thank you for your outstanding work. Could you please provide the training hyperparameters used to train the LLaVA-NeXT-Video and LLaVA-NeXT-Video-DPO models?

## Description
I compiled the image part of the open_clip model (a PyTorch model, https://github.com/mlfoundations/open_clip) in a Python environment using TensorRT 8.6.1 and obtained an engine. Then, I developed a service...
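As a point of reference for this build step, a minimal engine-build sketch with the TensorRT 8.6 Python API could look like the following; the ONNX file name, the input tensor name `image`, the shapes, and the FP16 flag are all assumptions, not the reporter's actual settings:

```python
import tensorrt as trt

# Hedged sketch: build a TensorRT 8.6 engine from an ONNX export of the
# open_clip image tower. File names and shapes below are placeholders.
logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
)
parser = trt.OnnxParser(network, logger)

with open("visual.onnx", "rb") as f:  # assumed ONNX export of the image part
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("ONNX parse failed")

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)  # assumed precision choice

# Dynamic batch profile; "image" is an assumed input tensor name.
profile = builder.create_optimization_profile()
profile.set_shape("image", (1, 3, 224, 224), (8, 3, 224, 224), (32, 3, 224, 224))
config.add_optimization_profile(profile)

engine_bytes = builder.build_serialized_network(network, config)
with open("visual.engine", "wb") as f:
    f.write(engine_bytes)
```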

LLaVA-NeXT-Image and LLaVA-NeXT-Video are fairly good multimodal models, and they are already supported in transformers. I would like to know whether tensorrt-llm plans to support these two models. https://github.com/LLaVA-VL/LLaVA-NeXT https://huggingface.co/docs/transformers/model_doc/llava_next...

new model
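For reference, the existing transformers support mentioned above can be exercised as follows; the checkpoint ID, image URL, and prompt format come from the linked llava_next documentation and are illustrative:

```python
import requests
import torch
from PIL import Image
from transformers import LlavaNextProcessor, LlavaNextForConditionalGeneration

# One of the LLaVA-NeXT checkpoints already supported in transformers.
model_id = "llava-hf/llava-v1.6-mistral-7b-hf"
processor = LlavaNextProcessor.from_pretrained(model_id)
model = LlavaNextForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

url = "https://llava-vl.github.io/static/images/view.jpg"
image = Image.open(requests.get(url, stream=True).raw)
prompt = "[INST] <image>\nWhat is shown in this image? [/INST]"

inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=100)
print(processor.decode(output[0], skip_special_tokens=True))
```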

I tested the batch inference results of the llava and llava-next-video models using tensorrt-llm, based on the `examples/multimodal/run.py` file. The parameters for their `generate` method are the same, as follows....

question
waiting for feedback
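The actual parameter values are truncated above; purely to illustrate the kind of `generate` call being compared, a sketch against the `ModelRunner` API from `tensorrt_llm.runtime` might look like this. The engine path, token IDs, and every sampling value below are placeholders, not the reporter's settings:

```python
import torch
from tensorrt_llm.runtime import ModelRunner

# Placeholder engine directory and input ids; not the reporter's settings.
runner = ModelRunner.from_dir(engine_dir="./llm_engine")
batch_input_ids = [
    torch.tensor([1, 15043], dtype=torch.int32),
    torch.tensor([1, 3575, 29901], dtype=torch.int32),
]

# Common sampling knobs; with top_k=1 this is effectively greedy decoding.
outputs = runner.generate(
    batch_input_ids,
    max_new_tokens=128,
    end_id=2,
    pad_id=2,
    temperature=1.0,
    top_k=1,
    top_p=0.0,
    num_beams=1,
)
# outputs holds the generated token ids for each batch entry.
```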

I looked at the model card introduction but didn't see what the main differences are between these two models. Could the author explain?

For the llava-onevision model, the official video inference code does not modify the `image_aspect_ratio` parameter, resulting in the use of the default `anyres_max_9`. This causes the `image_features` to occupy a...
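A sketch of the override implied here, assuming the `overwrite_config` mechanism from the LLaVA-NeXT examples; the key name and the cheaper setting are assumptions and may vary across releases:

```python
from llava.model.builder import load_pretrained_model

# Force a cheaper aspect-ratio mode instead of the default "anyres_max_9".
# "pad" is an assumed lighter setting; check the release you are running.
overwrite_config = {"image_aspect_ratio": "pad"}
tokenizer, model, image_processor, max_length = load_pretrained_model(
    "lmms-lab/llava-onevision-qwen2-7b-ov",  # illustrative checkpoint
    None,
    "llava_qwen",
    device_map="auto",
    overwrite_config=overwrite_config,
)
```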

For the first version of the llava-next-video project, the model chosen was LLaVA-NeXT-Video-7B-DPO. If the number of frames is set to 32, the final `inputs_embeds` dimension sent to the llama2...
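To make that dimension concrete, here is back-of-the-envelope arithmetic under the usual LLaVA-NeXT-Video assumptions (CLIP-ViT-L/14 at 336 px, so a 24×24 patch grid per frame, average-pooled 2×2 before the projector); the real sequence length also includes the text tokens:

```python
# Assumed vision settings; verify against the checkpoint's actual config.
frames = 32
patches_per_side = 336 // 14            # 24 patches per side
pooled_side = patches_per_side // 2     # 12 after 2x2 spatial pooling
tokens_per_frame = pooled_side ** 2     # 144
video_tokens = frames * tokens_per_frame
print(video_tokens)  # 4608 visual embeddings concatenated into inputs_embeds
```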