trlx issues

Use tiny models for the tests

### 🚀 The feature, motivation, and pitch Using tiny models for the tests may speed up the tests by a factor of 2 or 3, while still effectively verifying the...

glerzing

feature request

Support for CodeGenForCausalLM

3

### 🚀 The feature, motivation, and pitch I'm trying to apply RL to a code generation LM (https://huggingface.co/docs/transformers/model_doc/codegen), but I get the error below: ``` ValueError: Unsupported architecture: `CodeGenForCausalLM`. The following...

ZhengYang

feature request

Support logging for non-scalar metrics

2

### 🚀 The feature, motivation, and pitch AccelerateRLTrainer.evaluate() logs a table of generated eval outputs and metrics to the metrics tracker. If I understand correctly, only scalar metrics are currently...

g-simmons

feature request
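As an illustration of the distinction this request draws, the sketch below separates scalar metrics from non-scalar ones so that each group could be routed to a different logging call. It is not trlX code: `split_metrics` and the metric names are hypothetical, and how a tracker would display the non-scalar group is left open.

```python
from typing import Any, Dict, Tuple


def split_metrics(metrics: Dict[str, Any]) -> Tuple[Dict[str, float], Dict[str, Any]]:
    """Separate plain numbers from everything else (hypothetical helper)."""
    scalars: Dict[str, float] = {}
    non_scalars: Dict[str, Any] = {}
    for name, value in metrics.items():
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            scalars[name] = float(value)
        else:
            # Lists, dicts, strings, etc. need a tracker-specific representation.
            non_scalars[name] = value
    return scalars, non_scalars


# Made-up metric names and values, for illustration only.
metrics = {"reward/mean": 0.42, "length": 17, "rewards_per_sample": [0.1, 0.7, 0.46]}
scalars, non_scalars = split_metrics(metrics)
```

Keeping the split in one small function means the evaluation loop would not need to know which value types a given tracker accepts.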

Add Stable Vicuna Training

PhungVanDuy

Release examples of training stablevicuna

3

### 🚀 The feature, motivation, and pitch It is very exciting to see your remarkable effort in open-sourcing this repo! I read through the stablevicuna blog and noticed that the...

REIGN12

feature request

Migrate to `peft` from `opendelta` for parameter efficient tuning methods

6

### 🚀 The feature, motivation, and pitch Let's migrate to [`peft`](https://github.com/huggingface/peft). ##### Tasks Doing so will require the following updates: 1. Replace the `opendelta` setup in the `AccelerateBaseTrainer` with a...

jon-tow

feature request

Issues in experience rollout generation with FLAN-T5

2

### 🐛 Describe the bug When the `trlX` trainer makes a call to `model.generate` in the rollout phase, the process errors out with the following message: ```RuntimeError: probability tensor contains...

abarbet

bug

TPU Integration

3

### 🚀 The feature, motivation, and pitch trlX uses HuggingFace accelerate under the hood. Accelerate has the capability to leverage Google's TPUs for faster training. I'm interested in supporting trlX...

steventk-g

feature request

missing pad_token error when using GPT2Chinese

2

### 🐛 Describe the bug When I run the code in summarization-rlhf using GPT2Chinese, the following error occurs. I have checked "special_tokens_map.json", and it does have the "[PAD]" token....

yxk9810

bug
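The check the reporter describes, confirming that the special-tokens file declares a pad token, can be sketched with the standard library alone. `read_pad_token` is a hypothetical helper and the JSON payload is made up. The sketch only inspects the file contents. Whether the tokenizer loaded at training time actually picks the token up is a separate question that this does not answer.

```python
import json
from typing import Optional


def read_pad_token(special_tokens_map_text: str) -> Optional[str]:
    """Return the pad token declared in a special_tokens_map.json payload, if any."""
    entry = json.loads(special_tokens_map_text).get("pad_token")
    if isinstance(entry, dict):
        # Assumption: the token may be serialized as an object with a "content" field.
        return entry.get("content")
    return entry


# Made-up payload for illustration; not the reporter's actual file.
example = '{"unk_token": "[UNK]", "pad_token": "[PAD]"}'
pad = read_pad_token(example)
```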

Precompute logprobs, values

https://github.com/CarperAI/trlx/blob/9fdd0d757e8f7a3d48e7edb060ddb7517da13d2d/trlx/trainer/accelerate_ppo_trainer.py#L399 I hit that error when precomputing logprobs and values, due to the concatenation of prompt_tensor and output. My question is why we concatenate the prompt_tensor and output in the...

weizhenzhao
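For context on the question, here is a toy sketch of one common reason to concatenate prompt and output when scoring a decoder-only model: the response tokens are scored conditioned on the prompt, so the forward pass needs the full sequence. It uses plain lists rather than tensors, assumes a non-empty prompt, and is not taken from the linked trlX source; `concat_and_locate` is a hypothetical name.

```python
from typing import List, Tuple


def concat_and_locate(prompt_ids: List[int], output_ids: List[int]) -> Tuple[List[int], List[int]]:
    """Join prompt and response, and list the positions whose predictions score the response."""
    if not prompt_ids:
        raise ValueError("this sketch assumes a non-empty prompt")
    all_tokens = prompt_ids + output_ids
    first_response = len(prompt_ids)
    # In a causal LM, the prediction made at position t scores the token at t + 1,
    # so the response tokens are scored by positions first_response - 1 .. len - 2.
    scoring_positions = list(range(first_response - 1, len(all_tokens) - 1))
    return all_tokens, scoring_positions


# Made-up token ids for illustration.
all_tokens, scoring_positions = concat_and_locate([5, 6, 7], [8, 9])
```

In this toy case the tokens scored by `scoring_positions` are exactly the two response tokens. Encoder-decoder models are handled differently and are outside this sketch.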
