
[Feature request] Support CPU+GPU mixed execution

Open malfet opened this issue 10 months ago • 1 comment

The current assumption is that this is only needed when there is not enough GPU memory, but perhaps it is sometimes simply faster this way.

Right now we only do tokenization on the CPU, and inference can run on either the CPU or the GPU.
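To make the "not enough GPU memory" case concrete, here is a rough sketch of how a placement plan could be computed. This is not torchchat code: `Layer`, `plan_placement`, and the byte sizes are hypothetical names and numbers made up for illustration, and a real implementation would also need to move activations between devices at the boundary.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    # Hypothetical description of one model layer (not a torchchat type).
    name: str
    size_bytes: int  # memory needed for this layer's weights


def plan_placement(layers, gpu_budget_bytes):
    """Assign layers to "cuda" in order until the budget runs out, then to "cpu".

    Once one layer spills to the CPU, every later layer stays on the CPU too,
    so activations cross the device boundary only once per forward pass.
    """
    placement = {}
    remaining = gpu_budget_bytes
    spilled = False
    for layer in layers:
        if not spilled and layer.size_bytes <= remaining:
            placement[layer.name] = "cuda"
            remaining -= layer.size_bytes
        else:
            spilled = True
            placement[layer.name] = "cpu"
    return placement


if __name__ == "__main__":
    gib = 1024 ** 3
    # Four 4 GiB blocks against a 10 GiB budget: only the first two fit on the GPU.
    blocks = [Layer(f"block{i}", 4 * gib) for i in range(4)]
    print(plan_placement(blocks, gpu_budget_bytes=10 * gib))
```

A greedy prefix split like this only addresses the memory-limited case. Whether a mixed split is ever faster than running everything on one device would have to be measured.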

malfet, Mar 28 '24 22:03