DefTruth comments

Results: 223 comments of DefTruth

Does TensorRT-LLM support passing input_embeds directly?

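> > I'm also curious about how input_embeds can be passed directly. I'm not sure what your specific need for passing input_embeds directly is, or whether it is the same as mine. That said, InternVL2 can be run with trt-llm by assembling the prompt as pre + img + post. The token ids are determined before they are fed to trt-llm; when the trt-llm decoder engine is actually invoked, they are passed in together with the image's visual_feature, and inside the engine input_ids are embedded and then concatenated with visual_feature. This is achievable. ...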
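The pre + img + post layout described in the quoted comment can be sketched in plain Python. This is only an illustration of the idea (embed the text token ids, then splice the image features between them), not TensorRT-LLM code: `embed_tokens`, `build_decoder_input`, and the toy embedding table are hypothetical names introduced here, and the sketch assumes visual_feature has already been projected to the same hidden size as the text embeddings.

```python
from typing import List

Vector = List[float]


def embed_tokens(token_ids: List[int], embedding_table: List[Vector]) -> List[Vector]:
    """Look up one embedding row per token id (hypothetical helper)."""
    return [embedding_table[token_id] for token_id in token_ids]


def build_decoder_input(
    pre_ids: List[int],
    visual_feature: List[Vector],
    post_ids: List[int],
    embedding_table: List[Vector],
) -> List[Vector]:
    """Concatenate embed(pre) + visual_feature + embed(post) along the sequence axis."""
    hidden_size = len(embedding_table[0])
    # Assumption: the image features already match the text hidden size.
    if any(len(row) != hidden_size for row in visual_feature):
        raise ValueError("visual_feature rows must match the embedding hidden size")
    pre_embeds = embed_tokens(pre_ids, embedding_table)
    post_embeds = embed_tokens(post_ids, embedding_table)
    return pre_embeds + visual_feature + post_embeds


# Toy data: a vocabulary of 4 tokens with hidden size 2, and 3 image "patches".
toy_table = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
toy_visual = [[9.0, 9.0], [8.0, 8.0], [7.0, 7.0]]
decoder_input = build_decoder_input([0, 1], toy_visual, [2], toy_table)
```

Here the token ids for the pre and post segments are fixed before the call, as the comment describes, and the resulting sequence length is the number of pre tokens plus image rows plus post tokens.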

Does TensorRT-LLM support passing input_embeds directly?

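> > > I'm also curious about how input_embeds can be passed directly. I'm not sure what your specific need for passing input_embeds directly is, or whether it is the same as mine. That said, InternVL2 can be run with trt-llm by assembling the prompt as pre + img + post. The token ids are determined before they are fed to trt-llm; when the trt-llm decoder engine is actually invoked, they are passed in together with the image's visual_feature, and inside the engine input_ids are embedded and then concatenated with visual_feature. This is achievable. ...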

Does TensorRT-LLM support passing input_embeds directly?

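> > @Oldpan When I run internvl2-2B, inference always generates the full max_token count. Why is that?

> > My guess is that end_id is not set correctly.

I guess you guessed right.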
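The end_id guess in this exchange can be illustrated with a toy greedy decoding loop. This is a minimal sketch under stated assumptions, not the TensorRT-LLM runtime: `generate` and `toy_next_token` are hypothetical, and the toy model simply emits three tokens followed by an end-of-sequence id of 2. If the configured end_id does not match the id the model actually emits, the stop check never fires and generation runs to the token limit.

```python
from typing import Callable, List, Optional


def generate(
    next_token: Callable[[List[int]], int],
    prompt_ids: List[int],
    max_new_tokens: int,
    end_id: Optional[int] = None,
) -> List[int]:
    """Greedy loop that stops on end_id or after max_new_tokens steps."""
    output: List[int] = []
    for _ in range(max_new_tokens):
        token = next_token(prompt_ids + output)
        if end_id is not None and token == end_id:
            break  # the model signalled end of sequence
        output.append(token)
    return output


def toy_next_token(context: List[int]) -> int:
    """Hypothetical model: emits 5, 6, 7, then its end-of-sequence id 2 forever."""
    script = [5, 6, 7]
    generated = len(context) - 1  # the toy prompt below has exactly one token
    return script[generated] if generated < len(script) else 2


correct = generate(toy_next_token, [1], max_new_tokens=8, end_id=2)
wrong = generate(toy_next_token, [1], max_new_tokens=8, end_id=0)
```

With the matching end_id the loop stops after the three scripted tokens; with the mismatched one it keeps appending the end-of-sequence id until it has produced all 8 tokens, which matches the symptom described in the question.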