Jiaxin Shan
@better629 I have not tried 30B yet; we will explore 30B or 65B later and let you know the results.
@Chesterguan Can you provide some logs from the controller side and the model worker side?
@zhangzhengde0225 I worked around the issue by using 8*A100 (80G). I got similar results to yours: the training process went smoothly, but the error happened during model weight persistence. Check this...
@simon-mo For prefill disaggregation: the Splitwise and DistServe papers both build their solutions on top of vLLM for evaluation. Any contribution from these teams? Is the vLLM community open for...
@kenplusplus Dynamic batch size won't be controlled by external autoscaling logic; those are two different levels of control.
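A minimal sketch of the two levels mentioned above (all class and method names here are hypothetical, not vLLM's or KubeRay's real APIs): the serving engine picks its batch size per step from its own queue, while an autoscaler only adjusts replica count from coarse aggregate load.

```python
import math
from collections import deque

class ServingEngine:
    """Inner control loop: chooses a dynamic batch size on every step."""
    def __init__(self, max_batch_size=8):
        self.queue = deque()
        self.max_batch_size = max_batch_size

    def submit(self, request):
        self.queue.append(request)

    def step(self):
        # Dynamic batch size: decided internally each step,
        # bounded only by the engine's own capacity.
        n = min(len(self.queue), self.max_batch_size)
        return [self.queue.popleft() for _ in range(n)]

class Autoscaler:
    """Outer control loop: adjusts replica count, not batch size."""
    def __init__(self, target_queue_per_replica=16):
        self.target = target_queue_per_replica

    def desired_replicas(self, total_queued):
        # Scales on aggregate queue depth; never touches the
        # per-step batching decision inside each engine.
        return max(1, math.ceil(total_queued / self.target))

engine = ServingEngine(max_batch_size=4)
for i in range(6):
    engine.submit(f"req-{i}")
print(len(engine.step()))  # engine batches 4 of the 6 queued requests

scaler = Autoscaler(target_queue_per_replica=16)
print(scaler.desired_replicas(total_queued=40))  # 3 replicas
```

The point of the sketch: the autoscaler never reaches into `step()`, so external autoscaling logic cannot (and should not) dictate the dynamic batch size.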
https://github.com/ray-project/kuberay/issues/861 If we plan to make any API changes (removing unhelpful fields), the cleanup is a breaking change, so let's support multiple versions in the beta version.
@kevin85421 I feel it's worth having some community plugins. The core part could stay lightweight, and there are no conflicts.
@DmitriGekhtman Some of the problems are fixed. I will try to fix the rest of them soon.
@MadhavJivrajani Not a problem now; it can be closed. Thanks for the follow-up.