Error while running base script available in README.md
Hi,
I am trying to run the base script from the README:
import torch
from transformers import AutoTokenizer
from trl import PPOTrainer, PPOConfig, AutoModelForCausalLMWithValueHead, create_reference_model
from trl.core import respond_to_batch
# get models
model = AutoModelForCausalLMWithValueHead.from_pretrained('gpt2')
model_ref = create_reference_model(model)
tokenizer = AutoTokenizer.from_pretrained('gpt2')
tokenizer.pad_token = tokenizer.eos_token
# initialize trainer
ppo_config = PPOConfig(
    batch_size=1,
)
# encode a query
query_txt = "This morning I went to the "
query_tensor = tokenizer.encode(query_txt, return_tensors="pt")
# get model response
response_tensor = respond_to_batch(model_ref, query_tensor)
# create a ppo trainer
ppo_trainer = PPOTrainer(ppo_config, model, model_ref, tokenizer)
# define a reward for response
# (this could be any reward such as human feedback or output from another model)
reward = [torch.tensor(1.0)]
# train model for one step with ppo
train_stats = ppo_trainer.step([query_tensor[0]], [response_tensor[0]], reward)
Error:
You're using a GPT2TokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
Traceback (most recent call last):
  File "PPO.py", line 34, in <module>
    train_stats = ppo_trainer.step([query_tensor[0]], [response_tensor[0]], reward)
  File "/<PATH_TO_FOLDER>/lib/python3.8/site-packages/trl/trainer/ppo_trainer.py", line 476, in step
    train_stats = self.train_minibatch(
  File "/<PATH_TO_FOLDER>/lib/python3.8/site-packages/trl/trainer/ppo_trainer.py", line 681, in train_minibatch
    loss_p, loss_v, train_stats = self.loss(old_logprobs, values, rewards, logits, vpreds, logprobs, mask)
  File "/<PATH_TO_FOLDER>/lib/python3.8/site-packages/trl/trainer/ppo_trainer.py", line 767, in loss
    advantages = masked_whiten(advantages, mask)
  File "/<PATH_TO_FOLDER>/lib/python3.8/site-packages/trl/core.py", line 125, in masked_whiten
    mean, var = masked_mean(values, mask), masked_var(values, mask)
  File "/<PATH_TO_FOLDER>/lib/python3.8/site-packages/trl/core.py", line 109, in masked_mean
    return (values * mask).sum(axis=axis) / mask.sum(axis=axis)
RuntimeError: Please look up dimensions by name, got: name = None.
How do I resolve it?
Hi @prakamya-mishra
Thanks for the issue
Can you please update trl with `pip install --upgrade trl`? This should be solved in https://github.com/lvwerra/trl/pull/190
Closing this for now. Feel free to reopen if the issue persists.
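To confirm that the upgrade actually took effect in the environment that runs the script, it can help to print the installed trl version before re-running. A minimal sketch using only the standard library; `installed_version` is a hypothetical helper name, not part of trl, and the thread does not state which release contains the fix, so the printed version still has to be checked against the release notes:

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of `package`, or None if it is not installed.

    Hypothetical helper for illustration; it only reads package metadata
    and does not import the package itself.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Run this with the same interpreter that runs PPO.py.
    print("trl version:", installed_version("trl"))
```

If this prints `None` or the same version as before the upgrade, pip most likely installed into a different environment than the one the script uses.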