trl
Wandb - Selected runs are not logging media for the key objective/*, but instead are logging values of type list.
I get this error in wandb when running either of:
./examples/sentiment/scripts/gpt2-sentiment.py
./examples/sentiment/scripts/t5-sentiment.py
It looks like the logged data are lists of tensors. Judging by the TRL Showcase, these keys should be displayed as histograms?
EDIT: I'm using accelerate across 4 GPUs, but the lists appear to be of length 256, which I'd guess is the batch size?
Yes, lists of tensors are usually converted to histograms automatically (see rewards or ratios). I'm not sure what happened there, but if you want to take a look, that would be much appreciated.
I'm not too familiar with Wandb, so I had wondered if it could handle a list of tensors. env/reward_dist is rendering correctly as a histogram, which is what made me suspect that tensors weren't supported.
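For anyone who wants to experiment, here is a minimal sketch of one possible workaround: flatten each list-valued stat into plain floats and wrap it in `wandb.Histogram` before logging. This is not how TRL does it internally. The helper names `flatten_to_floats` and `prepare_for_logging` are made up for illustration, and I'm assuming the values are scalars, nested lists, or objects with a `.tolist()` method (as `torch.Tensor` and `numpy.ndarray` have).

```python
from typing import Any, Dict, Iterable, List


def flatten_to_floats(values: Iterable[Any]) -> List[float]:
    """Flatten a possibly nested list of scalars/tensors into a flat list of floats."""
    flat: List[float] = []
    for v in values:
        # torch.Tensor and numpy.ndarray both expose .tolist();
        # a 0-d tensor returns a plain Python scalar here.
        if hasattr(v, "tolist"):
            v = v.tolist()
        if isinstance(v, (list, tuple)):
            flat.extend(flatten_to_floats(v))
        else:
            flat.append(float(v))
    return flat


def prepare_for_logging(stats: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: turn list-valued entries into wandb.Histogram objects."""
    try:
        import wandb  # optional; falls back to plain lists if not installed
    except ImportError:
        wandb = None

    out: Dict[str, Any] = {}
    for key, value in stats.items():
        if isinstance(value, (list, tuple)):
            flat = flatten_to_floats(value)
            if wandb is not None and flat:
                out[key] = wandb.Histogram(flat)
            else:
                out[key] = flat
        else:
            out[key] = value
    return out
```

The result could then be passed to `wandb.log(prepare_for_logging(stats))` in place of the raw stats dict. Whether this matches what the Showcase runs display is something I haven't verified.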
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.