aghyad-deeb

Results: 4 comments by aghyad-deeb

I want to point out that, both empirically and from looking at the code, the HF implementation of Flex Attention seems to support sinks for both the forward and backward passes,...
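
To illustrate what sink support means here, below is a generic sketch of sink attention in plain PyTorch — not HF's or Flex Attention's actual code. The idea is that a learnable sink logit joins the softmax normalization without contributing a value, so probability mass can drain into it; because every op is differentiable, gradients flow in the backward pass too:

```python
import torch

def attention_with_sink(q, k, v, sink_logit):
    """Toy single-head attention with a learnable sink logit.

    q, k, v: (seq_len, head_dim); sink_logit: scalar tensor.
    The sink contributes to the softmax denominator but carries no value,
    so some attention mass can "drain" into it. Causal masking is omitted
    for brevity.
    """
    scores = q @ k.T / q.shape[-1] ** 0.5            # (seq, seq)
    sink = sink_logit.expand(scores.shape[0], 1)     # (seq, 1) sink column
    probs = torch.softmax(torch.cat([scores, sink], dim=-1), dim=-1)
    return probs[:, :-1] @ v                         # drop the sink column
```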

I also want to point out that the latest released version of vLLM doesn't support LoRA for GPT OSS; it applies LoRA only to the attention layers, which causes a discrepancy between...
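
For context, which layers a LoRA adapter touches is typically set by the adapter's target modules, and serving an adapter with fewer targets than it was trained with silently drops the missing deltas. A minimal PEFT sketch of the contrast — the module names are illustrative and depend on the architecture; this is not vLLM's code:

```python
from peft import LoraConfig

# Hypothetical module names; actual names depend on the model definition.
attn_only = LoraConfig(
    r=8,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

# An adapter trained on attention *and* MLP projections: serving it with
# the attention-only set above drops the MLP deltas, so the served logits
# diverge from the training-time model.
attn_and_mlp = LoraConfig(
    r=8,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
```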

I was getting a similar error when doing GRPO training with async reward computation enabled in the config. The following edits fixed it: Add this ```python reward_fn = load_reward_manager( config,...
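
The excerpt is cut off, but the general pattern — build the reward function once up front and hand it to the async path, rather than letting each async task reconstruct it — can be sketched generically. `load_reward_manager` and `config` come from the excerpt; the wrapper below is an illustrative stand-in, not the project's actual code:

```python
from concurrent.futures import ThreadPoolExecutor

# Illustrative stand-in; in the excerpt, reward_fn instead comes from
# load_reward_manager(config, ...), whose arguments are truncated above.
def make_reward_fn(config):
    def reward_fn(sample):
        return float(len(sample))  # placeholder scoring
    return reward_fn

def compute_rewards_async(samples, reward_fn, max_workers=8):
    # Share one pre-built reward function across workers instead of
    # constructing it inside each async task.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(reward_fn, samples))

rewards = compute_rewards_async(["a", "bb", "ccc"], make_reward_fn({}))
```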

@MichaelRipa Thanks for your reply! In case this is helpful, here's why I think it's particularly useful to have support for `StoppingCriteria`. When using the `remote=True` option, there seems to...
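
For reference, HF transformers `StoppingCriteria` is subclassed with a `__call__` that inspects the generated ids and returns whether to stop. A minimal example — the stop string is arbitrary, and how this would interact with the `remote=True` path is exactly what the comment is asking about:

```python
import torch
from transformers import StoppingCriteria, StoppingCriteriaList

class StopOnSubstring(StoppingCriteria):
    """Stop generation once a given substring appears in the decoded text."""

    def __init__(self, tokenizer, stop_string: str):
        self.tokenizer = tokenizer
        self.stop_string = stop_string

    def __call__(self, input_ids: torch.LongTensor,
                 scores: torch.FloatTensor, **kwargs) -> bool:
        text = self.tokenizer.decode(input_ids[0], skip_special_tokens=True)
        return self.stop_string in text

# Usage with model.generate (model/tokenizer assumed already loaded):
# criteria = StoppingCriteriaList([StopOnSubstring(tokenizer, "\nObservation:")])
# model.generate(**inputs, stopping_criteria=criteria)
```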