josephrocca
For others who are hitting this issue, but who desperately want to use LMDeploy, you can of course remove the `stop` parameter, and then manually check for the `stop` strings...
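A minimal sketch of that workaround, assuming the generated text arrives as an iterable of string chunks. The `stream_until_stop` helper and the `chunks` argument are hypothetical names for illustration, not part of LMDeploy's API. The last few characters are held back so that a stop string split across two chunks is still caught:

```python
from typing import Iterable, Iterator, Sequence


def stream_until_stop(chunks: Iterable[str], stop: Sequence[str]) -> Iterator[str]:
    """Yield text from `chunks`, ending just before the earliest stop string.

    `chunks` is a hypothetical stand-in for whatever streamed text the
    server returns once the `stop` parameter has been removed.
    """
    stop = [s for s in stop if s]  # ignore empty stop strings
    # Hold back (longest stop length - 1) characters so that a stop string
    # split across chunk boundaries is never emitted by accident.
    holdback = max((len(s) for s in stop), default=1) - 1
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        hits = [i for i in (buffer.find(s) for s in stop) if i != -1]
        if hits:
            cut = min(hits)  # earliest stop string wins
            if cut > 0:
                yield buffer[:cut]
            return  # the caller can now cancel the request
        safe = len(buffer) - holdback
        if safe > 0:
            yield buffer[:safe]
            buffer = buffer[safe:]
    if buffer:
        yield buffer  # stream ended without hitting a stop string


if __name__ == "__main__":
    demo = ["Hello", "\nUs", "er: hi"]
    print("".join(stream_until_stop(demo, ["\nUser:"])))
```

Once the generator returns, the client still has to abort or close the underlying request itself, otherwise the server keeps generating tokens that are simply discarded.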
@nathanwhit Thanks for your comment! ~~I've investigated this a bit further, and it looks like the "leaked" memory actually maxes out at about 500mb, no matter how many workers are...
Hey @nathanwhit, just wondering if V8 has been able to fix the "shared RO heap" issue you mentioned yet? Workers are still about 7x larger than Chrome, and 5x larger...
@ije `cbor-x` doesn't require `Buffer` when used in the browser - e.g. for the cbor-x official index.min.js: https://cdn.jsdelivr.net/npm/[email protected]/dist/index.min.js it's 33k instead of ~60k since it doesn't need Buffer in the...
### Related:

> [**CacheGen [SIGCOMM'24]**](https://arxiv.org/abs/2310.07240): efficiently encodes KV caches into bitstreams and stores them on disks. This allows unlimited amount of KV caches to be stored on cheap disks and...
> With that in mind, it'll be much easier to assess a correct caching solution.

Gotcha, makes sense. For reference, I use sticky sessions, and it's not much of a...
Possibly related (I'm unsure, but linking just in case):

* https://github.com/sgl-project/sglang/issues/6312
* https://github.com/sgl-project/sglang/pull/6842
* https://github.com/sgl-project/sglang/pull/6911
There's the potential for something really powerful here - more than just a userland API for the current system function. It'd be very useful to be able to open multiple...
Potentially relevant viral (1.7k likes) [tweet](https://twitter.com/DX_Nacca/status/1770775580754981125) in the VRChat community from yesterday:

> DXOverlay, Show your friends when you have the Overlay open. Integration with XSOverlay directly into VRChat via...