PENG WENJIE


It doesn't seem to work, for two reasons: 1) the inference time is the same as for a single inference, and 2) the console warnings appear one by one, from which it can be inferred that the model...

Thank you for your detailed explanation, @Rocketknight1. I have started using the vLLM method, which enables efficient inference, but I'll also try the model.generate() method for batch generation....
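For reference, batched generation with model.generate() usually requires left padding, so that the last token of every prompt sits directly next to the newly generated tokens. A minimal sketch, assuming a causal LM such as `gpt2` (the model name, `max_new_tokens` value, and the `generate_batch`/`chunk` helpers are illustrative, not from the original thread):

```python
def chunk(seq, size):
    # Split a list of prompts into micro-batches to bound GPU memory use.
    return [seq[i:i + size] for i in range(0, len(seq), size)]

def generate_batch(prompts, model_name="gpt2", max_new_tokens=32):
    # Imported lazily so the sketch can be read without the library installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Left padding aligns all prompts at the generation boundary,
    # which causal LMs need for correct batched decoding.
    tokenizer = AutoTokenizer.from_pretrained(model_name, padding_side="left")
    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.eos_token
    model = AutoModelForCausalLM.from_pretrained(model_name)

    texts = []
    for batch in chunk(prompts, 8):
        inputs = tokenizer(batch, return_tensors="pt", padding=True)
        outputs = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            pad_token_id=tokenizer.pad_token_id,
        )
        texts.extend(tokenizer.batch_decode(outputs, skip_special_tokens=True))
    return texts

if __name__ == "__main__":
    print(generate_batch(["Hello, my name is", "The capital of France is"]))
```

With right padding (the tokenizer default for many models), the pad tokens would sit between the prompt and the generated continuation, which is one common reason batched generate() appears not to work.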

Sure, this is a website for your reference: https://docs.vllm.ai/en/stable/getting_started/quickstart.html. I find that vLLM seems to be inferior to the transformers method in batch inference; maybe there is something wrong with my...
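Following that quickstart, offline batch inference with vLLM looks roughly like the sketch below (the model name and sampling values are illustrative placeholders, not taken from the original comments):

```python
def run_vllm(prompts, model="facebook/opt-125m"):
    # Imported lazily so the sketch can be read without vLLM installed.
    from vllm import LLM, SamplingParams

    # Sampling configuration mirrors the quickstart defaults.
    params = SamplingParams(temperature=0.8, top_p=0.95)
    llm = LLM(model=model)

    # vLLM handles batching and scheduling internally: pass all
    # prompts at once and it generates them concurrently.
    outputs = llm.generate(prompts, params)
    return [out.outputs[0].text for out in outputs]

if __name__ == "__main__":
    for text in run_vllm(["Hello, my name is", "The capital of France is"]):
        print(text)
```

Because vLLM schedules the whole prompt list itself, throughput comparisons against transformers are only fair if model.generate() is also given properly padded batches rather than one prompt at a time.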

I read the issue and tried your code, which worked perfectly. Thank you for your contribution!