AMDMIGraphX

Published 20 hours ago

ROCmSoftwarePlatform


Add weight streaming at runtime

Open eddieliao opened this issue 8 months ago • 7 comments

Figure out a way to support weight streaming at runtime, i.e. to fit large models on the GPU without needing to know the literal size ahead of time.

[x] Define/determine an allocation of literals to be streamed
[x] Move copy instructions to separate stream
[x] Investigate why @literal instructions take up so much time
[x] Decrease time spent on @literal instruction
[ ] Move from naive allocation to "smart" allocation (figure out the best way to mask the time taken by copies)
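
To make the first checklist item concrete, here is a minimal sketch of what a naive allocation of literals to be streamed could look like. It is not taken from MIGraphX: the `Literal` class, the `naive_stream_allocation` function, and the first-fit-in-program-order policy are all assumptions made for illustration. The idea is only that literals which fit in a fixed GPU memory budget stay resident, and everything else is copied from the host at runtime.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    """Hypothetical stand-in for a model constant (e.g. a weight tensor)."""
    name: str
    size_bytes: int


def naive_stream_allocation(literals, budget_bytes):
    """Split literals into (resident, streamed) under a GPU memory budget.

    Assumed naive policy: walk the literals in program order and keep each
    one on the GPU if it still fits in the remaining budget; otherwise mark
    it to be streamed from host memory when it is needed.
    """
    resident, streamed = [], []
    remaining = budget_bytes
    for lit in literals:
        if lit.size_bytes <= remaining:
            resident.append(lit)
            remaining -= lit.size_bytes
        else:
            streamed.append(lit)
    return resident, streamed


if __name__ == "__main__":
    lits = [Literal("w0", 400), Literal("w1", 300),
            Literal("w2", 500), Literal("w3", 100)]
    resident, streamed = naive_stream_allocation(lits, budget_bytes=800)
    print("resident:", [l.name for l in resident])
    print("streamed:", [l.name for l in streamed])
```

A "smart" allocation, as described in the last checklist item, would presumably replace the first-fit rule with one that accounts for when each literal is used, so that copies on the separate stream can overlap with compute.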

Jun 05 '24 19:06 eddieliao

Labels

enhancement

Windows

Ubuntu

UAI
