[Feature Request] Improve cold start latency with ServerlessLLM sllm_store

Open drunkcoding opened this issue 5 months ago • 0 comments

Prerequisites

  • [x] I have searched existing issues and reviewed documentation.

Problem Description

Reading the model from disk is slow: loading achieves only 2 GB/s on an SSD rated for 12 GB/s.
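To make the gap reproducible, here is a minimal sketch of how raw sequential read throughput for a checkpoint file could be measured. The helper name `measure_read_throughput` and the 8 MiB block size are assumptions for illustration, not part of MoE-Infinity or sllm_store. The OS page cache can inflate the result on repeated runs, so the number is only meaningful on a cold cache.

```python
import time


def measure_read_throughput(path, block_size=8 * 1024 * 1024):
    """Sequentially read `path` and return the observed throughput in GB/s.

    Hypothetical helper: it times plain read() calls only and does not
    bypass the page cache.
    """
    total = 0
    buf = bytearray(block_size)  # reused buffer, avoids per-read allocation
    start = time.perf_counter()
    # buffering=0 yields a raw file object, so readinto() issues direct reads.
    with open(path, "rb", buffering=0) as f:
        while True:
            n = f.readinto(buf)
            if not n:  # 0 bytes read means end of file
                break
            total += n
    elapsed = time.perf_counter() - start
    return total / max(elapsed, 1e-9) / 1e9
```

Comparing this figure with the time taken by the full model load helps separate disk-bound time from overhead elsewhere in the loading pipeline.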

Proposed Solution

  1. Add a shared-memory (shm) interface to sllm_store
  2. Change the model loading pipeline to load weights through state_dict
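The two steps above can be illustrated with a standard-library sketch. This is not the sllm_store API: `publish_to_shm` and `attach_state_dict` are hypothetical names, and raw bytes stand in for tensors. A loader process packs the weights into one shared-memory segment, and a consumer attaches to it and builds a state_dict-like mapping of zero-copy views.

```python
from multiprocessing import shared_memory


def publish_to_shm(tensors):
    """Pack {name: bytes} into one shared-memory segment.

    Returns the segment and an index of name -> (offset, size).
    Hypothetical helper, not an sllm_store function.
    """
    total = sum(len(data) for data in tensors.values())
    shm = shared_memory.SharedMemory(create=True, size=max(total, 1))
    index, offset = {}, 0
    for name, data in tensors.items():
        shm.buf[offset:offset + len(data)] = data  # copy the payload once
        index[name] = (offset, len(data))
        offset += len(data)
    return shm, index


def attach_state_dict(shm_name, index):
    """Attach to an existing segment and expose each entry as a memoryview.

    The views alias shared memory; no bytes are copied here.
    """
    shm = shared_memory.SharedMemory(name=shm_name)
    state = {
        name: shm.buf[offset:offset + size]
        for name, (offset, size) in index.items()
    }
    return shm, state
```

In a real pipeline the views would presumably be wrapped as tensors and handed to PyTorch's `load_state_dict`; that wiring is left out here. Every view must be released before the segment is closed, and the creator should call `unlink()` once all consumers are done.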

Alternatives Considered

No response

Additional Context

No response

Importance

Nice to have

Usage Statistics (Optional)

No response

drunkcoding • Jul 14 '25 15:07