Cyrus Leung
After some digging, it seems that this is expected behaviour: there is no extra whitespace after `"ASSISTANT:"`, so the language model fills it in automatically, resulting in the...
After some offline discussion, I have handed over this feature to @ywang96, who has opened #5237. This PR is thus no longer a candidate for merging, and only...
I think this issue can be closed now that #4298 has been merged.
Thank you for kickstarting this conversation!

### Re: Issues

I fully agree with the issues you have pointed out. I would like to add that the current prompt format...
> Generally, I agree with @DarkLight1337's opinion about moving processing logic out of `Engine` to prevent modifying core code frequently. However, I think it's difficult to keep the processing logic...
> > @Isotr0py Perhaps we could follow a registry pattern and have each model separately register how to preprocess the inputs? If the model does not do so, then the...
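The registry pattern suggested above could look something like the following minimal sketch. Note that the names here (`register_input_processor`, `process_inputs`, the stand-in model class) are hypothetical illustrations, not vLLM's actual API: each model class registers its own preprocessing function, and unregistered models fall back to a pass-through default.

```python
from typing import Callable, Dict, Type

# Hypothetical registry mapping a model class to its input processor.
_INPUT_PROCESSORS: Dict[Type, Callable[[dict], dict]] = {}

def register_input_processor(model_cls: Type):
    """Decorator: associate a preprocessing function with a model class."""
    def wrapper(fn: Callable[[dict], dict]) -> Callable[[dict], dict]:
        _INPUT_PROCESSORS[model_cls] = fn
        return fn
    return wrapper

def process_inputs(model_cls: Type, inputs: dict) -> dict:
    """Dispatch to the model's registered processor; pass through if none."""
    fn = _INPUT_PROCESSORS.get(model_cls)
    return fn(inputs) if fn is not None else inputs

class ToyVisionLM:  # stand-in for a vision-language model class
    pass

@register_input_processor(ToyVisionLM)
def _process_toy_vlm(inputs: dict) -> dict:
    # Illustrative model-specific step: expand the image placeholder
    # token in the prompt to match the number of image feature slots.
    out = dict(inputs)
    out["prompt"] = out["prompt"].replace("<image>", "<image>" * 4)
    return out
```

This keeps model-specific logic next to the model definition: the engine only ever calls `process_inputs`, e.g. `process_inputs(ToyVisionLM, {"prompt": "<image> What is this?"})` expands the placeholder, while a model without a registered processor receives its inputs unchanged.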
> #### 2. Frontend input format
> My comments on this are similar to those for Proposal 1. However, #4197 only refactors MultiModalData to define data processing logic. To avoid excessive duplication...
Just a heads up that #4228 will introduce another vision language model to vLLM, so our discussion should take that into account as well.
> I discussed this with @zhuohan123 offline, in particular regarding this comment:
>
> > To avoid having to modify the core Engine logic each time, we can...
> @DarkLight1337 Thanks for sharing your thoughts! @zhuohan123 and I actually discussed the use of `AutoProcessor`.
>
> I think the point is that today `vLLM` already relies on...