Arup De
Results
We investigated a critical compatibility issue where `flash_attention_2` does not support the gpt-oss attention sinks, causing gradient-norm spikes during training. The existing VERL codebase lacked the ability to override the `attn_implementation`...
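A minimal sketch of what such an override might look like when loading the model through Hugging Face `transformers`; the model id and the choice of fallback backend here are illustrative assumptions, not VERL's actual configuration surface:

```python
# Sketch: force a specific attention backend at model-load time so that
# flash_attention_2 (which lacks attention-sink support) is not selected.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "openai/gpt-oss-20b",         # assumed model id, for illustration only
    torch_dtype="auto",
    attn_implementation="eager",  # fall back to the eager attention path,
                                  # which handles attention sinks correctly
)
```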
@yinzhangyue `flash_attention_2` didn't implement the backward pass with attention-sink support.