RL4LMs
A modular RL library to fine-tune language models to human preferences
Hey, first of all, thank you for this amazing repo! I am trying to use this repo with a model that does not have the `parallelize()` function (led -...
Hello, I would like to implement self-play dialogue training. For that, I guess I need to modify the episode rollout process by adding formatting like a speaker id at the start of...
I have tried using BART as a seq2seq-type model, from huggingface facebook/bart-large. This however throws an error saying that `.parallelize` doesn't exist. Has anyone been able to finetune bart...
Logging with the root logger, like `logging.info`, removes the possibility of controlling the log level of submodules separately. `logging.getLogger(__name__)` enables this (and is the recommended practice), by doing something like...
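A minimal sketch of the pattern the issue describes (the `rl4lms.envs` logger name below is illustrative, not necessarily the module path the repo would produce from `__name__`):

```python
import logging

# Inside a module, `__name__` resolves to its dotted path (e.g. "rl4lms.envs"),
# so each module gets its own named logger instead of the root logger.
logger = logging.getLogger("rl4lms.envs")

def rollout():
    logger.info("starting rollout")

# A consumer of the library can now tune just this subtree of loggers,
# without touching the root logger or other packages:
logging.getLogger("rl4lms").setLevel(logging.WARNING)

# Child loggers inherit the level from their nearest configured ancestor.
print(logger.isEnabledFor(logging.INFO))  # → False
```

Because levels propagate down the dotted hierarchy, setting a level on `"rl4lms"` silences `"rl4lms.envs"`, `"rl4lms.envs.text_generation"`, and so on, while loggers in other packages keep their own configuration.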
Looks like `generation_beam_constraints` doesn't exist or has been moved?
Has this library been tested with larger models such as GPT-J-6B and GPT-NeoX-20B? Are there plans to support larger models like these? Thanks.
https://github.com/allenai/RL4LMs/blob/main/rl4lms/envs/text_generation/policy/seq2seq_policy.py#L263 ![Screen Shot 2022-11-29 at 6 05 29 PM](https://user-images.githubusercontent.com/3231217/204667819-409cb407-726f-40d9-9d43-8eb0ef9617f5.png)
Hi, thanks for your great work! I have a question about the sampling process. When both top-K and top-p are enabled (e.g., https://github.com/allenai/RL4LMs/blob/main/scripts/training/task_configs/common_gen/t5_nlpo.yml#L44-L51), isn't top-p just ignored because the K...
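A toy sketch of the interaction being asked about, in plain Python (this mimics how HF-style logits warpers apply top-k and then top-p sequentially; it is not RL4LMs' actual sampling code, and the exact tie-breaking in the real implementation may differ):

```python
def top_k_then_top_p(probs, k, p):
    """Return the indices of tokens that survive top-k followed by top-p.

    probs: a full probability distribution over the vocabulary.
    k:     top-k keeps only the k most likely tokens.
    p:     top-p (nucleus) then keeps the smallest prefix of the
           surviving tokens whose renormalized cumulative mass >= p.
    """
    # Sort token indices by probability, most likely first.
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    kept = order[:k]                              # top-k filter
    # Renormalize over the survivors, as happens implicitly when the
    # masked logits are re-softmaxed before the top-p step.
    total = sum(probs[i] for i in kept)
    out, cum = [], 0.0
    for i in kept:                                # top-p filter
        out.append(i)
        cum += probs[i] / total
        if cum >= p:
            break
    return out

probs = [0.40, 0.30, 0.15, 0.10, 0.05]
print(top_k_then_top_p(probs, k=4, p=0.8))  # → [0, 1, 2]
print(top_k_then_top_p(probs, k=2, p=0.8))  # → [0, 1]
```

With `k=4`, top-p still removes a fourth token (renormalized cumulative mass reaches 0.8 after three tokens), so it is not a no-op in general. With a small `k=2`, however, the renormalized mass of the survivors hits `p` only at the last token, so top-p effectively changes nothing, which seems to be the scenario the question is pointing at.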