AmazDeng

49 comments by AmazDeng

> In TensorRT-LLM, if you don't set up beam_width (default value is 1), then it uses sampling. Under sampling, you can use top_k and top_p to control the sampling. If you set...
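For context, here is a minimal sketch of that suggestion, assuming a recent TensorRT-LLM release that exposes the high-level `LLM`/`SamplingParams` API (the model path is a placeholder and parameter names may differ between versions; this is an illustration, not code from the thread):

```python
# Minimal sketch, assuming the high-level tensorrt_llm LLM API (names may vary by version).
from tensorrt_llm import LLM, SamplingParams

# With the default beam_width of 1, generation uses sampling,
# so top_k / top_p / temperature control the output distribution.
sampling_params = SamplingParams(
    max_tokens=256,
    temperature=0.8,
    top_k=40,
    top_p=0.8,
)

llm = LLM(model="/path/to/model")  # placeholder path
outputs = llm.generate(["Describe the image in one sentence."], sampling_params)
print(outputs[0].outputs[0].text)
```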

@czczup @whai362 @ErfeiCui @hjh0119 @lvhan028 @Adushar @Weiyun1025 @cg1177 @opengvlab-admin @qishisuren123 @dlutwy Could you please take a look at this issue?

I have updated the test images and the code. You can also test this case on your local machine. I only have an A100 80G graphics card, so I can...

> @AmazDeng
>
> Can you try if this question works?
>
> question="Image-1: \nImage-2: \nAre these two pieces of coats exactly the same except for the color?...

@irexyc I noticed that the prompt you provided contains a symbol that is not included in the official version (f'Image-1: {IMAGE_TOKEN}\nImage-2: {IMAGE_TOKEN}\ndescribe these two images', https://internvl.readthedocs.io/en/latest/internvl2.0/deployment.html#multi-images-inference). Also, I noticed that there is an...
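For reference, the multi-image prompt format from the linked InternVL2 deployment docs looks roughly like the sketch below; the model name and image URLs are placeholders, and the key point is that `IMAGE_TOKEN` comes from `lmdeploy.vl.constants` and appears once per image:

```python
# Sketch based on the InternVL2 multi-image inference docs linked above;
# model name and image URLs are placeholders.
from lmdeploy import pipeline, TurbomindEngineConfig
from lmdeploy.vl import load_image
from lmdeploy.vl.constants import IMAGE_TOKEN

pipe = pipeline('OpenGVLab/InternVL2-8B',
                backend_config=TurbomindEngineConfig(session_len=8192))

image_urls = ['https://example.com/coat_1.jpg',   # placeholder
              'https://example.com/coat_2.jpg']   # placeholder
images = [load_image(url) for url in image_urls]

prompt = f'Image-1: {IMAGE_TOKEN}\nImage-2: {IMAGE_TOKEN}\ndescribe these two images'
response = pipe((prompt, images))
print(response.text)
```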

Understood, thank you for your reply. If it's convenient, could you please help me resolve another issue I've raised? https://github.com/OpenGVLab/InternVL/issues/549 @irexyc

@czczup @whai362 @ErfeiCui @hjh0119 @lvhan028 @Adushar @Weiyun1025 @cg1177 @opengvlab-admin @qishisuren123 @dlutwy Could you please take a look at this issue?

@irexyc The video multi-round conversation example and the provided code are quite similar. I also ran tests following the video multi-round conversation example with `InternVL2-1B`. The inference time was...
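A rough sketch of how such a multi-round test can be timed, assuming the `pipe.chat(...)` / `session=` interface shown in the lmdeploy docs; the model name, frame paths, and generation settings are placeholders:

```python
# Sketch of timing a multi-round conversation, assuming lmdeploy's pipe.chat API;
# model name and image paths are placeholders.
import time
from lmdeploy import pipeline, TurbomindEngineConfig, GenerationConfig
from lmdeploy.vl import load_image

pipe = pipeline('OpenGVLab/InternVL2-2B',
                backend_config=TurbomindEngineConfig(session_len=8192))
gen_config = GenerationConfig(top_k=40, top_p=0.8, temperature=0.8)

frames = [load_image(p) for p in ['frame_0.jpg', 'frame_1.jpg']]  # placeholder frames

t0 = time.time()
sess = pipe.chat(('describe the video frames', frames), gen_config=gen_config)
print(f'round 1: {time.time() - t0:.2f}s')

t0 = time.time()
sess = pipe.chat('what happens at the end?', session=sess, gen_config=gen_config)
print(f'round 2: {time.time() - t0:.2f}s')
print(sess.response.text)
```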

> In my test, InternVL2-1B (which cannot use the turbomind backend and will fall back to the pytorch backend) takes an average of 5.4s. InternVL2-2B takes an average of 1.8s. @irexyc May I...
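For what it's worth, the backend can also be selected explicitly rather than relying on the fallback; a minimal sketch assuming lmdeploy's `PytorchEngineConfig` / `TurbomindEngineConfig` (model names are placeholders):

```python
# Sketch: choosing the backend explicitly instead of relying on the fallback.
from lmdeploy import pipeline, PytorchEngineConfig, TurbomindEngineConfig

# InternVL2-1B reportedly falls back to the pytorch backend; make that explicit:
pipe_1b = pipeline('OpenGVLab/InternVL2-1B',
                   backend_config=PytorchEngineConfig(session_len=8192))

# InternVL2-2B can use the turbomind backend:
pipe_2b = pipeline('OpenGVLab/InternVL2-2B',
                   backend_config=TurbomindEngineConfig(session_len=8192))
```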

> @AmazDeng
>
> The test_code.py is different from the code you posted before. In your previous code you set `max_dynamic_patch`. I tested InternVL2-1B and InternVL2-2B with your previous code....