v0.5 Performance Improvement on Long Max-Seqlen
Tracking items to improve performance for long max-seqlen workloads, including training and generation performance at long context.
@youngeunkwon0405 you can create sub-issues once you have identified concrete items. If they are general items that are not async-specific, you can assign them to me.
Sure, will do.
There are actually no items planned for v0.5. We will plan for v0.6.
For the record, here is what I have found so far: https://docs.google.com/presentation/d/1U-Gd6gMmH6QX3KPzqwcyAFbK5Uir7XiDXe5j59EsP1s/edit?usp=sharing
@guyueh1 you moved this one to "no action necessary". Is there anything remaining to do on this issue?
@terrykong No features or fixes will be ready for v0.5; we will plan for v0.6 if that's alright.