Phi-3CookBook
Need Phi-3-vision batch size > 1
This issue is for a: (mark with an x)
- [ ] bug report -> please search issues before submitting
- [x] feature request
- [x] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)
Details
I am currently running a finetuned Phi-3-vision model through vLLM, but my throughput is capped at that of single-request inference, even when issuing concurrent calls. I have confirmed that continuous batching is grouping the calls correctly, but the Phi-3-vision model's batch inference time grows linearly with the batch size, so batching yields no speedup.
Investigating further, I noticed that your training scripts mention that batch size > 1 is not supported. I also noticed this PR/Issue, which has a potential fix, though it seems unverified.
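For reference, this is roughly how I measured the scaling. The sketch below is a simplified, hypothetical harness: `run_inference` is a stand-in for the actual vLLM call to the finetuned model, stubbed here with a linear-cost sleep so the script is self-contained. If per-item latency stays flat as batch size grows (as it does for this linear-cost stub), the model is gaining nothing from batching, which is the symptom I observe.

```python
import time

def run_inference(batch):
    # Stand-in for the real vLLM / Phi-3-vision call (an assumption, not
    # the actual API). Simulates a model whose total forward-pass cost is
    # linear in batch size -- i.e. no real batching benefit.
    time.sleep(0.01 * len(batch))
    return ["output"] * len(batch)

def latency_per_item(batch_size):
    # Time one batched call and normalize by batch size. With true batched
    # inference, per-item latency should drop as batch size grows; if it
    # stays roughly constant, batching is effectively sequential.
    batch = ["prompt"] * batch_size
    start = time.perf_counter()
    run_inference(batch)
    return (time.perf_counter() - start) / batch_size

for bs in (1, 2, 4, 8):
    print(f"batch={bs:>2}  per-item latency={latency_per_item(bs) * 1000:.1f} ms")
```

Against the real model I see the same pattern as this stub: per-item latency does not improve with batch size.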
My questions are:
- Is this issue with batched requests fixed with Phi-3.5-vision? (I see batch size = 64 in this training script)
- If not, is there a plan/timeline to allow for batched training/inference?
Thank you so much in advance for your help!