Brosoul

Results 15 issues of Brosoul

### 🚀 Feature Description and Motivation Currently, each TP (Tensor Parallelism) process in StreamLoader reads all model files, resulting in duplicate file transfers and reads, which reduces the overall loading...

kind/enhancement
area/acceleration
priority/critical-urgent
area/model-loader
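
As a rough illustration of the duplicate-read problem described in this issue, the sketch below splits a list of model files across TP ranks so that each file is read by exactly one process. It is not StreamLoader's API: `assign_files_to_ranks`, the shard file names, and the round-robin policy are all assumptions made for the example, and how each rank would then obtain the tensors it did not read is not shown.

```python
from typing import Dict, List


def assign_files_to_ranks(files: List[str], tp_size: int) -> Dict[int, List[str]]:
    """Round-robin split of model files across TP ranks (hypothetical helper)."""
    if tp_size <= 0:
        raise ValueError("tp_size must be positive")
    assignment: Dict[int, List[str]] = {rank: [] for rank in range(tp_size)}
    # Sort first so every process computes the same plan independently.
    for index, name in enumerate(sorted(files)):
        assignment[index % tp_size].append(name)
    return assignment


# Hypothetical shard names, used only for illustration.
shards = [f"model-{i:05d}-of-00004.safetensors" for i in range(1, 5)]
plan = assign_files_to_ranks(shards, tp_size=2)
```

With this plan, rank 0 would read shards 1 and 3 and rank 1 would read shards 2 and 4, instead of both ranks reading all four.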

### 🚀 Feature Description and Motivation In the tensor-loading parameter list of the StreamLoader library, `device` should be refined to `device_map`. The StreamLoader library should also support device_map...

kind/enhancement
priority/important-soon
area/model-loader

### 🚀 Feature Description and Motivation The network IO of a single process may have an upper limit, and adopting a multi-process, multi-threaded approach can better utilize...

kind/enhancement
area/acceleration
area/model-loader

### 🚀 Feature Description and Motivation AI Runtime should support downloading models by fetching multiple files in parallel ### Use Case _No response_ ### Proposed Solution _No response_

priority/important-longterm
area/runtime
area/model-loader
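
The issue leaves the proposed solution empty, so the following is only a minimal sketch of what parallel multi-file downloading could look like, using Python's standard `concurrent.futures.ThreadPoolExecutor`. The names `download_all` and `fake_fetch` are hypothetical, and `fake_fetch` stands in for a real network call so the example runs offline.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def download_all(
    names: List[str],
    fetch: Callable[[str], bytes],
    max_workers: int = 4,
) -> Dict[str, bytes]:
    """Fetch every file concurrently; each worker handles one file at a time."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with names.
        return dict(zip(names, pool.map(fetch, names)))


def fake_fetch(name: str) -> bytes:
    # Stand-in for a real network download (hypothetical).
    return name.encode("utf-8")


results = download_all(["config.json", "tokenizer.json"], fake_fetch)
```

Threads suit this case because downloads are IO-bound; passing the fetch function as a parameter keeps the source (object storage, a model hub, and so on) separate from the concurrency logic.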

### Summary The current runtime provides the ability to download model files from different sources, but lacks management capabilities for model files. ### Motivation This issue aims to provide model...

priority/important-longterm
area/runtime

### 🚀 Feature Description and Motivation StreamLoader can persist models to disk via a bypass path. This avoids the network transmission overhead of loading the model from the current machine...

kind/enhancement
priority/important-soon
area/model-loader

### 🚀 Feature Description and Motivation Multiple types of information are required during the startup process of AI Runtime: - Engine side: inference engine type, access address for engine metrics - Model...

priority/critical-urgent
kind/feature
area/runtime

### 🚀 Feature Description and Motivation ModelAdapter should, like a Pod, display its status in the output of the get command. Displaying the status this way would be more user-friendly. [image](https://github.com/user-attachments/assets/981696c4-8745-437a-8c1a-6ca388f1fa8c)...

good first issue
help wanted
kind/support
area/lora

### 🚀 Feature Description and Motivation The current implementation of multithreaded model downloading from Hugging Face directly uses `max_workers` in `snapshot_download`. > 1 thread = 1 file...

priority/important-soon
area/runtime
area/model-loader

### 🚀 Feature Description and Motivation We need a theoretical basis relating network IO to the number of processes and threads in order to optimize the process and thread settings of StreamLoader....

kind/enhancement
priority/important-soon
area/model-loader