liuqianchao
Is there an ETA for finishing this PR? Sequence parallelism is quite important for many long-context LLM training tasks.
> The training code has been released. @hongyanz Can you help update the `cnets.py` code to make it compatible with the Qwen model?
@aoxy Hi, any update on the merge work? We've recently run into low training efficiency when doing RL training with gpt-oss because sink attention support is inconsistent between training and...
What is the progress on gpt-oss support?