Jin Shang

Results 32 comments of Jin Shang

I just submitted a draft PR: https://github.com/vllm-project/vllm/pull/3117. There are still some problems to solve. I would really appreciate any comments or advice.

> @Elsayed91 did you write your own BART implementation? What was the nature of the issue?
>
> Status update on encoder/decoder models & T5:
>
> It has become...

Btw, BART would be simpler than T5 because it uses the original Transformer structure. Maybe we can do BART first.

We've been using this branch in production and it works like a charm. Thanks so much for your contribution. Can't wait for it to be merged!

There's an [Abseil article](https://abseil.io/about/design/swisstables) and a [CppCon talk](https://www.youtube.com/watch?v=ncHmEUmJZf4) that provide a good intro. The key idea is that, when looking up a key via linear probing, it can check multiple slots...

> Maybe we should first have a memory-usage / performance benchmark on this?

Okay, I'll try to draft a proof of concept with the `is_in` kernel.

> Ah, so it's a hash table with SIMD-optimized lookup?

Well, it's a speedup even without SIMD instructions. To put it in oversimplified terms, each slot occupies 1 byte...
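To make the "multiple slots per probe step, even without SIMD" point concrete, here is a rough sketch of how a group of one-byte control tags can be compared against a lookup tag with plain 64-bit integer arithmetic. Everything here (`kGroupSize`, `kEmpty`, `LoadGroup`, `MatchTag`, `FindInGroup`, the 8-slot group, `int32_t` keys) is a hypothetical name or choice made for illustration. It is not Abseil's code and not what the proof of concept does, just the general idea under those assumptions.

```cpp
#include <cstdint>

// Hypothetical layout: 8 slots per group, one control byte per slot.
constexpr int kGroupSize = 8;
// Assumption: occupied slots store a 7-bit tag (0x00..0x7F) derived from the
// hash, so a byte with the high bit set can mark an empty slot.
constexpr uint8_t kEmpty = 0x80;

// Packs 8 control bytes into one word, with ctrl[i] in byte i (endian-independent).
inline uint64_t LoadGroup(const uint8_t* ctrl) {
  uint64_t word = 0;
  for (int i = 0; i < kGroupSize; ++i) {
    word |= static_cast<uint64_t>(ctrl[i]) << (8 * i);
  }
  return word;
}

// Returns a word whose byte i is 0x80 if ctrl[i] == tag and 0x00 otherwise.
// All 8 control bytes are compared at once, without SIMD instructions.
inline uint64_t MatchTag(const uint8_t* ctrl, uint8_t tag) {
  constexpr uint64_t kLow7 = 0x7F7F7F7F7F7F7F7FULL;
  constexpr uint64_t kHigh = 0x8080808080808080ULL;
  // XOR with the tag broadcast to every byte: matching bytes become 0x00.
  uint64_t x = LoadGroup(ctrl) ^ (0x0101010101010101ULL * tag);
  // Exact zero-byte detection: the high bit ends up set only where the byte is 0.
  return ~(((x & kLow7) + kLow7) | x) & kHigh;
}

// Looks for `key` within one group. Full key comparisons happen only for
// slots whose control byte matched the tag. Returns the slot index or -1.
inline int FindInGroup(const uint8_t* ctrl, const int32_t* keys,
                       uint8_t tag, int32_t key) {
  uint64_t mask = MatchTag(ctrl, tag);
  for (int i = 0; i < kGroupSize; ++i) {
    if (((mask >> (8 * i + 7)) & 1) && keys[i] == key) {
      return i;
    }
  }
  return -1;
}
```

A full table would derive the starting group and the tag from different bits of the hash and move on to the next group when nothing matches. With SSE2 the same comparison could cover 16 control bytes per step.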

I wrote a simple proof of concept for `IndexIn` [1] with a slightly tweaked version of `absl::flat_hash_map`. (The tweak is [2]; a couple of operations can be skipped because our `MemoTable`...

Sure. I chose Int32 because it was the only one tested with a large value set. I just wrote a similar set of benchmarks for strings and here's the result....

> Ok. I agree we could replace MemoTable with something better. However, we don't want to have a dependency on Abseil, so it will have to be reimplemented.

Yes. I'll...