Josh Meyer

Results 152 issues of Josh Meyer

It would be helpful if the README said whether CPU usage is possible and, if so, how.

## Summary

This PR enables proper GRPO training with importance sampling when using offline trajectory data (e.g., from vLLM traces). It includes three complementary fixes:

### 1. Extract logprobs from...