Matthias Gerstgrasser
I got a possibly similar error just now on a distributed `tune.run()` / RLlib run. Is this the same issue? Any workaround? @matthewdeng ``` Traceback (most recent call last): File...
Ah, you mean `remove_padding_in_sequences()`? Wouldn't that still work with only right-padding?
Ah, no, to be clear, what I mean is the following: Right now, the padding is done like this ('promp' - a prompt token, 'respo' - a response token): ```...
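To make the layout above concrete, here is a minimal sketch (a toy example of my own, not OpenRLHF code) of a batch where a shorter prompt leaves pad tokens in the middle of its row, and what the compacted version without those pads would look like:

```python
import torch

PAD = 0
# Toy batch in the layout sketched above: prompt tokens (11, 12, 13) padded
# to a common prompt length, response tokens (21, 22) appended after, so a
# shorter prompt leaves a pad token in the middle of its row.
batch = torch.tensor([
    [11, 12, 13, 21, 22],   # 3 prompt tokens + 2 response tokens
    [11, 12, PAD, 21, 22],  # 2 prompt tokens + middle pad + 2 response tokens
])

def drop_pads(row: torch.Tensor) -> torch.Tensor:
    # Compact a single row by dropping pad positions; purely illustrative.
    return row[row != PAD]

print([drop_pads(r).tolist() for r in batch])
# [[11, 12, 13, 21, 22], [11, 12, 21, 22]]
```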
That's not what I am proposing though! What I mean is, if I return it without the pads in the middle from `_generate_vllm()`, would that break anything? (No worries if...
Ahhhh, got it, that makes sense. I think that's probably broken with local generation then! I just verified that the generated output doesn't have an EOS if max_tokens is reached. Also, would taking...
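For reference, one possible workaround along the lines discussed here would be to force a trailing EOS onto truncated rows. A minimal sketch, assuming PyTorch tensors of token ids (`force_final_eos` is a hypothetical helper, not existing OpenRLHF code):

```python
import torch

def force_final_eos(sequences: torch.Tensor, eos_token_id: int) -> torch.Tensor:
    # If a row contains no EOS at all (i.e. generation stopped because
    # max_tokens was reached), overwrite its last position with EOS so
    # downstream code that keys off the EOS token still finds one.
    sequences = sequences.clone()
    has_eos = (sequences == eos_token_id).any(dim=1)
    sequences[~has_eos, -1] = eos_token_id
    return sequences
```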
> I am not sure which approach to take at the moment, but our current implementation is heavily dependent on EOS tokens.

You mean specifically for the RM? Or more...
Oh, for local generation, does `actor.process_sequences()` do the same thing? https://github.com/OpenLLMAI/OpenRLHF/blob/bed10e115523a9eca419cb058ede8e531d23c182/openrlhf/models/actor.py#L159 If so, then doing this in `RemoteExperienceMaker` seems unnecessary, since it also calls `actor.process_sequences()` later anyway, i.e. this is...
@hijkzzz Could I ask a quick related question: In `actor.process_sequences()` I also see that `attention_mask` is set to False on all EOS tokens, except the final EOS token in each...
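For anyone reading along, this is my understanding of the masking behavior in question, re-implemented as a standalone sketch (the function below is my own illustration, not the actual `actor.process_sequences()` code):

```python
import torch

def mask_non_final_eos(sequences: torch.Tensor, eos_token_id: int) -> torch.Tensor:
    # Attend to everything, then mask out every EOS position except the
    # last EOS in each row.
    is_eos = sequences == eos_token_id
    mask = ~is_eos
    # Index of the last EOS per row: flip, find the first True, map back.
    last_eos = sequences.size(1) - 1 - is_eos.flip(dims=[1]).int().argmax(dim=1)
    rows = torch.arange(sequences.size(0))
    has_eos = is_eos.any(dim=1)
    mask[rows[has_eos], last_eos[has_eos]] = True
    return mask

seqs = torch.tensor([[5, 2, 7, 2, 0],    # 2 = EOS, appears twice
                     [9, 9, 9, 9, 9]])   # no EOS at all
print(mask_non_final_eos(seqs, eos_token_id=2))
# tensor([[ True, False,  True,  True,  True],
#         [ True,  True,  True,  True,  True]])
```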
> I would like to know what kind of `ExperienceMaker` you need at first

I had multiple different ones in mind, actually, for different projects. For instance:

* reward...
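To sketch what I mean by different variants: something like a base class with an overridable reward hook would cover several of these. All names below (`BaseExperienceMaker`, `compute_reward`) are hypothetical, not the actual OpenRLHF API:

```python
import torch

class BaseExperienceMaker:
    # Hypothetical customization point; the real ExperienceMaker API differs.
    def compute_reward(self, sequences: torch.Tensor) -> torch.Tensor:
        raise NotImplementedError

class HeuristicRewardExperienceMaker(BaseExperienceMaker):
    # Variant where the reward comes from a custom function instead of an RM.
    def compute_reward(self, sequences: torch.Tensor) -> torch.Tensor:
        # Toy heuristic for illustration: reward non-pad length.
        return (sequences != 0).float().sum(dim=1)
```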
OK! I'll test this on my side for a bit to make sure it covers all the use cases I have in mind. I'll open a PR in a couple...