punica Inquiry on cuda memory across processes

Open mozizhao opened this issue 1 year ago • 0 comments

Hi,

Congratulations on the great work you have done! I am very interested in it. Specifically, I would like to know how you allow multiple serving processes to share the same CUDA memory space (for the frozen parameters of the LoRA models).

Could you please point me to the relevant code? I would like to study your implementation. Thanks!

BR//Zizhao

Jan 16 '24 14:01 mozizhao
