Cyrus Leung
After some digging, it seems that this is expected behaviour: there is no extra whitespace after `"ASSISTANT:"`, so the language model fills it in automatically, resulting in the...
After some offline discussion, I have handed over this feature to @ywang96, who has opened #5237. This PR is thus no longer a candidate for merging, and only...
I think this issue can be closed now that #4298 has been merged.
Thank you for kickstarting this conversation!

### Re: Issues

I fully agree with the issues you have pointed out. I would like to add that the current prompt format...
> Generally, I agree with @DarkLight1337's opinion about moving processing logic out of `Engine` to prevent modifying core code frequently. However, I think it's difficult to keep the processing logic...
> > @Isotr0py Perhaps we could follow a registry pattern and have each model separately register how to preprocess the inputs? If the model does not do so, then the...
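The registry pattern suggested above could look something like the following minimal sketch. Note that the names here (`register_input_processor`, `process_inputs`, the stand-in model class) are hypothetical illustrations, not vLLM's actual API: each model class registers its own preprocessing function, and unregistered models fall back to a pass-through default.

```python
from typing import Callable, Dict, Type

# Hypothetical registry mapping a model class to its input processor.
_INPUT_PROCESSORS: Dict[Type, Callable[[dict], dict]] = {}

def register_input_processor(model_cls: Type):
    """Decorator: associate a preprocessing function with a model class."""
    def wrapper(fn: Callable[[dict], dict]) -> Callable[[dict], dict]:
        _INPUT_PROCESSORS[model_cls] = fn
        return fn
    return wrapper

def process_inputs(model_cls: Type, inputs: dict) -> dict:
    """Dispatch to the model's registered processor; pass through if none."""
    fn = _INPUT_PROCESSORS.get(model_cls)
    return fn(inputs) if fn is not None else inputs

class ToyVisionLM:  # stand-in for a vision-language model class
    pass

@register_input_processor(ToyVisionLM)
def _process_toy_vlm(inputs: dict) -> dict:
    # Illustrative model-specific step: expand the image placeholder
    # token in the prompt to match the number of image feature slots.
    out = dict(inputs)
    out["prompt"] = out["prompt"].replace("<image>", "<image>" * 4)
    return out
```

This keeps model-specific logic next to the model definition: the engine only ever calls `process_inputs`, e.g. `process_inputs(ToyVisionLM, {"prompt": "<image> What is this?"})` expands the placeholder, while a model without a registered processor receives its inputs unchanged.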
> #### 2. Frontend input format
> My comments on this are similar to those for Proposal 1. However, #4197 only refactors MultiModalData to define data processing logic. To avoid excessive duplication...
Just a heads up that #4228 will introduce another vision language model to vLLM, so our discussion should take that into account as well.
> I discussed this with @zhuohan123 offline, in particular regarding this comment:
>
> > To avoid having to modify the core Engine logic each time, we can...
> @DarkLight1337 Thanks for sharing your thoughts! @zhuohan123 and I actually discussed the use of `AutoProcessor`.
>
> I think the point is that today `vLLM` already relies on...