Support device_map in the StreamLoader library to allocate tensors to different devices
🚀 Feature Description and Motivation
In the tensor-loading API of the StreamLoader library, the `device` parameter should be generalized to `device_map`, and the StreamLoader library should support `device_map` to place tensors on different devices.
device_map (Dict[str, Union[int, str, torch.device]], optional) — A map that specifies where each submodule should go. It doesn't need to be refined to each parameter/buffer name; once a given module name is inside, every submodule of it will be sent to the same device.
Refs: https://huggingface.co/docs/accelerate/v1.0.1/en/package_reference/utilities#accelerate.utils.load_state_dict.device_map
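A minimal sketch of the proposed change, assuming a `load_tensors` entry point with a `device` parameter; the function and parameter names below are placeholders for illustration, not the actual StreamLoader API:

```python
import torch

# A device_map maps module-name prefixes to target devices; every
# submodule under a listed prefix is placed on the same device.
device_map = {
    "model.embed_tokens": 0,   # embeddings on GPU 0
    "model.layers": 1,         # all decoder layers on GPU 1
    "lm_head": "cpu",          # offload the output head to CPU
}

# Before (hypothetical current API, single device for all tensors):
# loader.load_tensors(checkpoint_path, device=torch.device("cuda:0"))

# After (proposed, per-module placement):
# loader.load_tensors(checkpoint_path, device_map=device_map)
```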
Use Case
No response
Proposed Solution
The `infer_auto_device_map` function in accelerate could be helpful for constructing the device_map automatically, as sketched below.
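A sketch of how `infer_auto_device_map` could produce a device_map to hand to StreamLoader; the model name, memory budget, and `no_split_module_classes` value are example assumptions:

```python
from accelerate import infer_auto_device_map, init_empty_weights
from transformers import AutoConfig, AutoModelForCausalLM

# Build the model skeleton without allocating real weights.
config = AutoConfig.from_pretrained("facebook/opt-1.3b")
with init_empty_weights():
    model = AutoModelForCausalLM.from_config(config)

# Let accelerate assign module prefixes to devices under a memory budget.
device_map = infer_auto_device_map(
    model,
    max_memory={0: "10GiB", 1: "10GiB", "cpu": "30GiB"},
    no_split_module_classes=["OPTDecoderLayer"],
)
# device_map is a dict like {"model.decoder.embed_tokens": 0, ..., "lm_head": 1},
# which StreamLoader could then use to place each tensor as it streams weights.
```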
This task may be a prerequisite for https://github.com/aibrix/aibrix/issues/403
Let's defer the issues about StreamLoader and performance optimization to later versions, such as v0.3.0.