AMDMIGraphX

Add weight streaming at runtime

Open eddieliao opened this issue 8 months ago • 7 comments

Figure out a way to support weight streaming at runtime, i.e. to fit large models on the GPU without needing to know the literal sizes ahead of time.

  • [x] Define/determine an allocation of literals to be streamed
  • [x] Move copy instructions to separate stream
  • [x] Investigate why @literal instructions take up so much time
  • [x] Decrease time spent on @literal instruction
  • [ ] Move from naive allocation to "smart" allocation (figure out best way to mask time taken to copy)
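
For illustration only, here is a minimal sketch of what a "naive" allocation from the first checklist item could look like: given each literal's size and a GPU memory budget, mark the largest literals for streaming until the remainder fits. This is an assumption about the general idea, not MIGraphX's actual implementation; `choose_streamed_literals`, `literal_sizes`, and `budget_bytes` are hypothetical names, and the sketch ignores copy/compute overlap, which is what the "smart" allocation item is about.

```python
from typing import Dict, Set


def choose_streamed_literals(literal_sizes: Dict[str, int],
                             budget_bytes: int) -> Set[str]:
    """Hypothetical naive allocation: stream the largest literals first.

    literal_sizes maps a literal's name to its size in bytes.
    budget_bytes is the GPU memory available for resident literals.
    Returns the names of the literals that would be streamed (copied in
    at runtime) instead of being kept resident on the GPU.
    """
    resident_bytes = sum(literal_sizes.values())
    streamed: Set[str] = set()

    # Visit literals from largest to smallest; break ties by name so the
    # result is deterministic.
    ordered = sorted(literal_sizes.items(), key=lambda kv: (-kv[1], kv[0]))
    for name, size in ordered:
        if resident_bytes <= budget_bytes:
            break  # everything still resident now fits in the budget
        streamed.add(name)
        resident_bytes -= size

    return streamed


if __name__ == "__main__":
    sizes = {"w0": 400, "w1": 300, "w2": 200, "w3": 100}
    print(sorted(choose_streamed_literals(sizes, budget_bytes=350)))
```

Streaming the largest literals first keeps the number of streamed literals small, but it does nothing to hide the copy time behind compute; a smarter policy would also need to account for when each literal is first used.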

eddieliao, Jun 05 '24 19:06