Phi-3CookBook
Need Phi-3-vision batch size > 1
This issue is for a: (mark with an x)
- [ ] bug report -> please search issues before submitting
- [x] feature request
- [x] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)
Details
I am currently running a finetuned Phi-3-vision model through vLLM, but my throughput is capped at that of single-request inference, even when issuing concurrent calls. I have confirmed that continuous batching is grouping the calls correctly, but the Phi-3-vision model's batch inference time grows linearly with the batch size, so batching yields no speedup.
Investigating further, I noticed that your training scripts mention that batch size > 1 is not supported. I also noticed this PR/Issue, which has a potential fix, though it seems unverified.
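For reference, this is roughly how I measured the scaling. The sketch below is a simplified, hypothetical harness: `run_inference` is a stand-in for the actual vLLM call to the finetuned model, stubbed here with a linear-cost sleep so the script is self-contained. If per-item latency stays flat as batch size grows (as it does for this linear-cost stub), the model is gaining nothing from batching, which is the symptom I observe.

```python
import time

def run_inference(batch):
    # Stand-in for the real vLLM / Phi-3-vision call (an assumption, not
    # the actual API). Simulates a model whose total forward-pass cost is
    # linear in batch size -- i.e. no real batching benefit.
    time.sleep(0.01 * len(batch))
    return ["output"] * len(batch)

def latency_per_item(batch_size):
    # Time one batched call and normalize by batch size. With true batched
    # inference, per-item latency should drop as batch size grows; if it
    # stays roughly constant, batching is effectively sequential.
    batch = ["prompt"] * batch_size
    start = time.perf_counter()
    run_inference(batch)
    return (time.perf_counter() - start) / batch_size

for bs in (1, 2, 4, 8):
    print(f"batch={bs:>2}  per-item latency={latency_per_item(bs) * 1000:.1f} ms")
```

Against the real model I see the same pattern as this stub: per-item latency does not improve with batch size.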
My questions are:
- Is this issue with batched requests fixed with Phi-3.5-vision? (I see batch size = 64 in this training script)
- If not, is there a plan/timeline to allow for batched training/inference?
Thank you so much in advance for your help!