transformerlab-app

Add GPU Split Strategy for Inference similar to LM Studio

Open deep1401 opened this issue 1 month ago • 0 comments

A user Frank talked to mentioned that GPU splitting is the feature they like most when running inference on models in LM Studio. We should add it as well. This may require work in transformerlab-inference. We could start by supporting it in fastchat_server and then propagate it to the other loader plugins.
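To make the request more concrete, here is a minimal sketch of what a split setting could translate to on the loader side. It assumes the split is expressed as a per-GPU fraction of memory that the model may use; the issue does not specify LM Studio's exact strategy options, and `build_max_memory` is a hypothetical helper, not an existing function in transformerlab-inference or fastchat_server.

```python
from typing import Dict, List


def build_max_memory(gpu_capacity_gib: List[float], fractions: List[float]) -> Dict[int, str]:
    """Turn per-GPU usage fractions into a {gpu_index: "NGiB"} memory map.

    Hypothetical helper for illustration only. A fraction of 0 leaves
    that GPU out of the map entirely.
    """
    if len(gpu_capacity_gib) != len(fractions):
        raise ValueError("need exactly one fraction per GPU")
    if any(f < 0.0 or f > 1.0 for f in fractions):
        raise ValueError("fractions must be between 0 and 1")

    max_memory: Dict[int, str] = {}
    for index, (capacity, fraction) in enumerate(zip(gpu_capacity_gib, fractions)):
        if fraction == 0:
            continue  # user chose not to use this GPU
        # Round down to whole GiB so the budget never exceeds the request.
        budget = int(capacity * fraction)
        max_memory[index] = f"{budget}GiB"
    return max_memory


if __name__ == "__main__":
    # Two 24 GiB cards: use all of GPU 0 and half of GPU 1.
    print(build_max_memory([24, 24], [1.0, 0.5]))
```

A map in this shape matches the `max_memory` argument that Hugging Face `from_pretrained` accepts together with `device_map="auto"`, and FastChat's model worker has `--num-gpus` and `--max-gpu-memory` options that cover a simpler version of the same idea. Whether the fastchat_server plugin can pass these through as-is is an open question for this issue.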

Reference:

Image

deep1401, Nov 14 '25 20:11