shpgy-shpgy

2 comments by shpgy-shpgy

> It looks like the first engine has somehow taken most of the GPU memory (95% KV cache usage). In this case the second engine cannot allocate KV caches and hence...

> [@shpgy-shpgy](https://github.com/shpgy-shpgy) Thanks for providing the detailed `nvidia-smi` information.
>
> Yeah, as I can see from the screenshots, the GPU is almost running out of memory. When one instance...
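
The reasoning in both comments (one engine holds most of the GPU memory, so a second engine has no room left for its KV cache) can be sketched as simple arithmetic. This is only an illustration under stated assumptions: the function names, the 80 GiB card, the 14 GiB of model weights, and the 1 GiB overhead are all hypothetical and do not come from the thread or from any particular library.

```python
def kv_cache_budget_gib(total_gib, used_by_others_gib, weights_gib, overhead_gib=1.0):
    """Return the memory (GiB) left for a new engine's KV cache.

    All inputs are hypothetical illustration values. A negative result
    means the engine cannot even fit its weights plus overhead.
    """
    free_gib = total_gib - used_by_others_gib
    return free_gib - weights_gib - overhead_gib


def can_start_engine(total_gib, used_by_others_gib, weights_gib, overhead_gib=1.0):
    """True if some memory would remain for the KV cache."""
    return kv_cache_budget_gib(total_gib, used_by_others_gib, weights_gib, overhead_gib) > 0


# Hypothetical case 1: the first engine already holds 76 GiB of an 80 GiB GPU.
crowded = can_start_engine(total_gib=80.0, used_by_others_gib=76.0, weights_gib=14.0)

# Hypothetical case 2: the first engine is limited to 36 GiB.
shared = can_start_engine(total_gib=80.0, used_by_others_gib=36.0, weights_gib=14.0)

print(crowded, shared)
```

In the first case the second engine has a negative budget and cannot allocate any KV cache; in the second case roughly 29 GiB would remain. In practice the actual numbers would come from `nvidia-smi` and from whatever memory limit the serving engine exposes.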