sglang
Implement prefix_cache
Thanks so much for the work on this repo so far.
I think prefix caching could be very useful and I see that vLLM is also starting to support it for some architectures.
It looks like the BaseBackend.cache_prefix method still needs to be implemented:
def cache_prefix(self, prefix_str: str):
    pass
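
To make the request concrete, here is a rough sketch of the behavior I have in mind. It is only an illustration under my own assumptions, not sglang's actual design: `ToyBackend`, `_encode`, and `longest_cached_prefix` are hypothetical names, and a tuple of whitespace-split tokens stands in for the KV-cache state a real backend would keep.

```python
from typing import Dict, Optional, Tuple


class ToyBackend:
    """Hypothetical stand-in for a backend; not sglang's BaseBackend."""

    def __init__(self) -> None:
        # Maps a prefix string to its precomputed state. A real backend
        # would hold KV-cache tensors; a token tuple stands in for them here.
        self._prefix_states: Dict[str, Tuple[str, ...]] = {}

    def _encode(self, text: str) -> Tuple[str, ...]:
        # Placeholder for the expensive prefill computation (assumption).
        return tuple(text.split())

    def cache_prefix(self, prefix_str: str) -> None:
        # Compute and store the state once; repeated calls are no-ops.
        if prefix_str not in self._prefix_states:
            self._prefix_states[prefix_str] = self._encode(prefix_str)

    def longest_cached_prefix(self, prompt: str) -> Optional[str]:
        # Return the longest cached prefix that the prompt starts with,
        # or None if no cached prefix applies.
        matches = [p for p in self._prefix_states if prompt.startswith(p)]
        return max(matches, key=len) if matches else None
```

With something like this, a caller could register a shared system prompt once via `cache_prefix` and have later requests that start with it skip recomputing that part.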