will brown comments

Results 14 comments of


                                            will brown

[Question] Is vLLMRollout.generate_sequences the right place to implement tool calling?

if there's interest, would love to try and get this working with the [verifiers](https://github.com/willccbb/verifiers) repo i'm building, mostly focused on TRL so far (https://github.com/huggingface/trl/pull/2810) but hopefully the Environments can be...

Batch Processing Feature

Sorry, just saw this -- will take a swing when [#53](https://github.com/Blaizzy/mlx-vlm/pull/53) is merged.

feat(mlx_lm): support batch input in `generate()`

the code to implement batch inference is pretty simple, you can just copy the code from mlx_parallm (or probably better: https://github.com/N8python/gsm-mlx) and add it to your codebase. the hard part...

Added functionality in vf-eval to include or exclude specific IDs from dataset

@ParamThakkar123 this doesn't seem like it's doing any actual filtering at the dataset level, just adding fields?

Add Vision / VLM models to environments and GRPO trainer

nice! absolutely agree that we want to add VLMs eventually, just hasn't been at top of our priority list yet, though this implementation does look like a pretty nice starting...

feat(eval): add support for grouped reward summaries and reports

Ah we deprecated the report feature, sorry!

feat: add response format handling in judge rubric

@lakshyaag Can this not already be done by passing response_format via sampling_args? Would rather have that be the all-in-one route for expressing additional configurations rather than explicitly adding each one.

Plan to improve typing

ah, I should update the contributor guidance -- developing on Mac without the trainer-only extras is totally fine (I often do this myself), but the trainer-related imports won't be resolved...

Add support for saving intermediate results during vf-eval

nice! looks pretty good, updated to merge with latest main -- probably will make some other edits before merging, our logic for vf-eval outputs json saving has drifted a bit...

add GymEnv

nice! would you wanna do the PR on top of the trajectories branch? https://github.com/PrimeIntellect-ai/verifiers/pull/549 there's a bunch of new things which should make native gym-style rollouts much easier, especially for...