Guoli Yin
@byshiue will the FT op be on the roadmap for the next release? The TF op turns out to be faster than the th op in the decoder (decoding) benchmark and is easier...
candidate change:

```python
class ALiBi(Module):
    @staticmethod
    def create_alibi_matrix(
        q_sequence_length: int,
        k_sequence_length: int,
        num_heads: int,
        offset: int,
        dtype=mx.float32,
    ):
        x1 = mx.arange(offset, q_sequence_length)
        x2 = mx.arange(0, k_sequence_length)
        distance_matrix = -mx.abs(
            mx.expand_dims(x1[:, ...
```
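For context, here is a minimal, self-contained sketch of how such an ALiBi bias matrix is typically built. It is not the actual patch (the tail of the candidate change above is truncated); it assumes MLX array ops (`mx.arange`, `mx.abs`, `mx.power`) and the standard 2^(-8h/num_heads) slope schedule from the ALiBi paper, and the function name `alibi_bias_sketch` is hypothetical.

```python
import mlx.core as mx


def alibi_bias_sketch(q_len: int, k_len: int, num_heads: int, dtype=mx.float32):
    # Relative distances: entry [i, j] = -|i - j|, shape (q_len, k_len).
    q_pos = mx.arange(q_len)
    k_pos = mx.arange(k_len)
    distance = -mx.abs(mx.expand_dims(q_pos, 1) - mx.expand_dims(k_pos, 0))

    # Standard ALiBi slope schedule for power-of-two head counts:
    # slope_h = 2^(-8 * h / num_heads) for h = 1..num_heads.
    exponents = mx.arange(1, num_heads + 1) * (-8.0 / num_heads)
    slopes = mx.power(mx.array(2.0), exponents)  # shape (num_heads,)

    # Broadcast to (num_heads, q_len, k_len): each head scales the same distances.
    bias = slopes.reshape(num_heads, 1, 1) * distance
    return bias.astype(dtype)
```

The resulting bias is simply added to the attention logits before the softmax, which is why models like BLOOM and MPT need no learned positional embeddings when ALiBi is in place.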
+1. I think both BLOOM and MPT just need an ALiBi implementation, and then vLLM could support both.
@WoosukKwon can I ask about the timeline, since you mention it is coming very soon? Is it on a 2-week or a 4-week roadmap? Thanks.
> Hi Guoli, what's the use case? Should we first discuss in an internal PR?

Sounds good, let's discuss in an internal PR first.
cc @dongyin92
@chaunceyjiang thanks for adding this! May I ask whether this change is also compatible with MultiModalHasher? https://github.com/vllm-project/vllm/blob/084bbac8cc4c29b7dcd2098418168c61d3d42e9b/vllm/multimodal/hasher.py#L24 When we enable prefix caching, the image_embeds should be hashable as well, right?
@DarkLight1337 thanks for sharing. From the code, it looks like it will create a hash key from mm_data? And it will include the type of image_embeds as well, if I...
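To make the question concrete, here is a rough sketch of the kind of content-based hashing being discussed for prefix caching with embedding inputs. This is a hypothetical illustration, not vLLM's actual MultiModalHasher implementation (see the linked hasher.py for that); it assumes the embeddings arrive as NumPy arrays and uses only hashlib from the standard library.

```python
import hashlib
import numpy as np


def hash_mm_item_sketch(modality: str, item) -> str:
    """Hypothetical sketch: derive a stable cache key for a multimodal item
    (e.g. precomputed image_embeds) so prefix caching can reuse it."""
    h = hashlib.sha256()
    h.update(modality.encode("utf-8"))            # include the item type/modality
    if isinstance(item, np.ndarray):
        h.update(str(item.dtype).encode())        # dtype and shape disambiguate
        h.update(str(item.shape).encode())        # tensors with identical raw bytes
        h.update(np.ascontiguousarray(item).tobytes())
    else:
        h.update(repr(item).encode("utf-8"))      # fallback for plain values
    return h.hexdigest()


# Example: two identical embedding tensors map to the same cache key.
embeds = np.random.rand(1, 576, 1024).astype(np.float32)
assert hash_mm_item_sketch("image_embeds", embeds) == \
       hash_mm_item_sketch("image_embeds", embeds.copy())
```

The key point is that the hash must cover both the modality/type and the tensor contents; if image_embeds were excluded from the key, prefix caching could wrongly reuse cached blocks across different images.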