Woosuk Kwon
We need to provide clean abstractions and interfaces so that users can easily plug in their custom models.
We should provide a clean abstraction and interface so that users can plug in their custom tokenizers just as easily.
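A minimal sketch of what such a plug-in interface could look like; the `CustomTokenizer` protocol and the toy implementation below are hypothetical, not an existing API:

```python
from typing import Dict, List, Protocol


class CustomTokenizer(Protocol):
    """Hypothetical interface a user-supplied tokenizer could implement."""

    def encode(self, text: str) -> List[int]: ...
    def decode(self, token_ids: List[int]) -> str: ...


class WhitespaceTokenizer:
    """Toy implementation: assigns one token id per whitespace-separated word."""

    def __init__(self) -> None:
        self.vocab: Dict[str, int] = {}
        self.inv: Dict[int, str] = {}

    def encode(self, text: str) -> List[int]:
        ids = []
        for word in text.split():
            if word not in self.vocab:
                idx = len(self.vocab)
                self.vocab[word] = idx
                self.inv[idx] = word
            ids.append(self.vocab[word])
        return ids

    def decode(self, token_ids: List[int]) -> str:
        return " ".join(self.inv[i] for i in token_ids)
```

With a structural protocol like this, any object exposing `encode`/`decode` would satisfy the interface without subclassing anything.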
We are currently using the `-O2` flag when compiling our CUDA kernels. We need to investigate whether and how changing it to `-O3` affects system performance and compilation time.
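One way to experiment is to rewrite the nvcc flag list before passing it to the build (e.g. via `extra_compile_args={"nvcc": ...}` in a `CUDAExtension`). The helper below is a hypothetical sketch, not part of the build scripts:

```python
from typing import List


def set_opt_level(nvcc_flags: List[str], level: str) -> List[str]:
    """Return a copy of the flag list with any existing -O<n> flag replaced.

    Hypothetical helper for A/B-testing optimization levels; drops flags of
    the exact form -O<digits> and appends the requested level at the end.
    """
    out = [f for f in nvcc_flags if not (f.startswith("-O") and f[2:].isdigit())]
    out.append(level)
    return out
```

For example, `set_opt_level(["-O2", "--use_fast_math"], "-O3")` yields `["--use_fast_math", "-O3"]`, which could then be timed against the `-O2` build.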
Only works for Falcon-7B for now. The Falcon-40B model generates garbage outputs and needs debugging.
Should be merged after #273
Closes #61 This PR adds the BLOOM model and modifies the paged attention kernel to support ALiBi bias.
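For reference, a small pure-Python sketch of the ALiBi bias the kernel needs to add to the attention scores (slopes follow the standard `2^(-8i/n)` schedule for a power-of-two head count; the function names here are illustrative, not the kernel's API):

```python
from typing import List


def alibi_slopes(num_heads: int) -> List[float]:
    """Per-head ALiBi slopes: 2^(-8i/n) for i = 1..n (power-of-2 n only)."""
    assert num_heads & (num_heads - 1) == 0, "sketch handles power-of-2 head counts"
    return [2.0 ** (-8.0 * i / num_heads) for i in range(1, num_heads + 1)]


def alibi_bias(slope: float, seq_len: int) -> List[List[float]]:
    """Additive bias for one head: slope * (key_pos - query_pos).

    Zero on the diagonal, increasingly negative for keys further in the past;
    in the kernel this is added to q·k scores before the softmax.
    """
    return [[slope * (k - q) for k in range(seq_len)] for q in range(seq_len)]
```

In the paged attention kernel the same bias has to be computed from the logical token positions, since keys for one sequence are scattered across physical blocks.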
Closes #218 and #332 Should be merged after #61
While playing with it, I stumbled upon strange behavior that might indicate an issue when beam search is used. I started the server with: `python3 -m...
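To make the report reproducible, here is the kind of request body I mean; the field names assume the server accepts `SamplingParams`-style options (`n`, `use_beam_search`, etc.), so treat this as a sketch rather than the exact request I sent:

```python
import json

# Hypothetical request payload for triggering beam search on the server.
payload = {
    "prompt": "The capital of France is",
    "n": 4,                    # number of beams / returned sequences
    "use_beam_search": True,   # enable beam search instead of sampling
    "temperature": 0.0,        # beam search scores greedily
    "max_tokens": 32,
}
body = json.dumps(payload)
```

Posting this body to the server's generate endpoint should be enough to reproduce the behavior.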