aghyad-deeb
I want to point out that, both empirically and from inspecting the code, the HF implementation of Flex Attention appears to support sinks for both the forward and backward passes,...
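For context, here is a minimal standalone sketch of the sink mechanism itself (my own re-implementation for illustration, not the HF kernel): each head carries a learnable sink logit that joins the softmax but contributes no value, so attention weights over real tokens sum to less than one. Because it is plain tensor ops, autograd gives the backward pass for free.

```python
import torch

def attention_with_sinks(q, k, v, sinks):
    # q, k, v: (batch, heads, seq, dim); sinks: (heads,) learnable logits.
    scores = (q @ k.transpose(-2, -1)) / q.shape[-1] ** 0.5  # (B, H, S, S)
    # Append the per-head sink logit as an extra "column" in the softmax.
    sink = sinks.view(1, -1, 1, 1).expand(*scores.shape[:-1], 1)
    probs = torch.softmax(torch.cat([scores, sink], dim=-1), dim=-1)
    # Drop the sink column: it absorbs probability mass but emits no value.
    return probs[..., :-1] @ v
```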
I also want to point out that the latest released version of vLLM doesn't support LoRA for GPT OSS; it applies LoRA only to the attention layers, which causes a discrepancy between...
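To make the discrepancy concrete, here is a hedged sketch of the mismatch (the module names are illustrative, not taken from vLLM or the GPT OSS repo): if training targeted both attention and MLP projections but serving only applies adapters to attention, the MLP deltas are silently dropped.

```python
from peft import LoraConfig

# Hypothetical training-time config: LoRA on attention *and* MLP projections.
train_lora = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# If the serving stack only applies adapters to attention projections,
# the MLP deltas learned during training are ignored, so train-time and
# serve-time logits no longer match.
served_modules = {"q_proj", "k_proj", "v_proj", "o_proj"}
dropped = set(train_lora.target_modules) - served_modules
print(f"Adapter weights ignored at serving time: {sorted(dropped)}")
```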
I was getting a similar error when doing GRPO training with async reward computation enabled in the config. The following edit fixed it. Add this:

```python
reward_fn = load_reward_manager(
    config,...
```
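For reference, the full call looked roughly like the sketch below. This assumes verl's `load_reward_manager(config, tokenizer, num_examine, **reward_kwargs)` signature and its `verl.trainer.ppo.reward` location; check your installed version, as the arguments and import path may differ.

```python
# Sketch under the assumptions above; not a verbatim copy of my diff.
from verl.trainer.ppo.reward import load_reward_manager

reward_fn = load_reward_manager(
    config,
    tokenizer,
    num_examine=0,  # how many decoded samples to print for inspection
    **config.reward_model.get("reward_kwargs", {}),
)
```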
@MichaelRipa Thanks for your reply! In case this is helpful, here's why I think it's particularly useful to have support for `StoppingCriteria`. When using the `remote=True` option, there seems to...
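For concreteness, here is a minimal example of the HF `StoppingCriteria` API being argued for; the stop condition itself is just illustrative:

```python
import torch
from transformers import StoppingCriteria, StoppingCriteriaList

class StopOnToken(StoppingCriteria):
    """Stop generation once a chosen token id appears (illustrative)."""
    def __init__(self, stop_token_id: int):
        self.stop_token_id = stop_token_id

    def __call__(self, input_ids: torch.LongTensor,
                 scores: torch.FloatTensor, **kwargs) -> bool:
        return bool((input_ids[0, -1] == self.stop_token_id).item())

# Passed to generate() as usual; the point above is that a remote=True
# code path would need to forward this object too.
# outputs = model.generate(**inputs,
#     stopping_criteria=StoppingCriteriaList([StopOnToken(stop_id)]))
```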