punica

Inquiry on CUDA memory across processes

Open mozizhao opened this issue 1 year ago • 0 comments

Hi,

Congratulations on the great work! I am very interested in it. Specifically, I would like to know how you allow multiple serving processes to share the same CUDA memory (for the frozen parameters in the LoRA models).

Could you please point me to the relevant code? I would like to study your implementation. Thanks!

BR//Zizhao

mozizhao, Jan 16 '24 14:01