sglang Implement prefix

Implement prefix_cache

Open pj-ml opened this issue 1 year ago • 0 comments

Thanks so much for the work on this repo so far.

I think prefix caching could be very useful, and I see that vLLM is also starting to support it for some architectures.

It looks like the BaseBackend.cache_prefix method still needs to be implemented; at the moment it is a no-op:

    def cache_prefix(self, prefix_str: str):
        pass
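To make the request more concrete, here is a rough sketch of the behaviour I have in mind. It is only an illustration under my own assumptions, not sglang's actual design: `ToyPrefixCachingBackend`, `match_prefix`, and the integer handles are hypothetical names, and a real backend would store precomputed KV-cache state rather than a counter.

```python
from typing import Dict, Optional, Tuple


class ToyPrefixCachingBackend:
    """Hypothetical stand-in for a BaseBackend subclass; not sglang's real code."""

    def __init__(self) -> None:
        # Maps each cached prefix string to an opaque handle for its stored state.
        self._prefix_cache: Dict[str, int] = {}
        self._next_handle = 0

    def cache_prefix(self, prefix_str: str) -> None:
        """Register prefix_str so later prompts starting with it can reuse its state."""
        if prefix_str and prefix_str not in self._prefix_cache:
            # Assumption: a real backend would run a prefill here and keep the KV cache.
            self._prefix_cache[prefix_str] = self._next_handle
            self._next_handle += 1

    def match_prefix(self, prompt: str) -> Tuple[Optional[int], str]:
        """Return (handle, remainder) for the longest cached prefix of prompt."""
        best = max(
            (p for p in self._prefix_cache if prompt.startswith(p)),
            key=len,
            default=None,
        )
        if best is None:
            # Nothing cached matches, so the whole prompt still has to be processed.
            return None, prompt
        return self._prefix_cache[best], prompt[len(best):]


backend = ToyPrefixCachingBackend()
backend.cache_prefix("You are a helpful assistant.\n")
handle, rest = backend.match_prefix("You are a helpful assistant.\nWhat is 2 + 2?")
```

With something like this, only `rest` would need a fresh prefill, which is where I would expect the savings for prompts that share a long system message or few-shot block.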

Jan 26 '24 14:01 pj-ml