Results 6 comments of zyksir

cc @shuaills, who is still working on this. This PR is not ready yet, so I will leave it open.

@FrankLeeeee Hi, could you help review this PR about VL?

This means that the code in the inference engine (e.g. sglang) needs to change as well, right?

Now we have sgl online for training large models. We can use sglang as the backend to support different models.

@jiangtaozh Which sglang version are you using? I previously ran into this issue, and it turned out to be something wrong with my sglang at that time.

@yd-oom This feature is really exciting! Could you please resolve the conflicts? Also, did you test it using llama3.1B? Is the accept length good?