trl

Update in DPO raises several problems...

Open holarissun opened this issue 1 year ago • 2 comments

  1. get_hh was removed. There is no dataset now.

  2. from trl.commands.cli_utils import DpoScriptArguments, init_zero_verbose, TrlParser fails with: No module named 'trl.commands'

holarissun, Mar 18 '24 21:03

Those changes are currently only on main. Did you install TRL from source?

lvwerra, Mar 20 '24 08:03
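
For anyone unsure which TRL they are running, here is a minimal sketch for checking the installed version and whether a given submodule is present. It uses only the standard library. The helper name describe_install is hypothetical and not part of TRL; the distribution name "trl" and the module path "trl.commands" are taken from this thread.

```python
import importlib.util
from importlib import metadata


def describe_install(dist_name, module_path):
    """Return (installed_version_or_None, module_is_importable).

    Hypothetical helper for this thread, not a TRL API.
    """
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in this environment.
        version = None
    try:
        # find_spec imports parent packages, so a missing parent raises ImportError.
        has_module = importlib.util.find_spec(module_path) is not None
    except (ImportError, ValueError):
        has_module = False
    return version, has_module


# Example call with the names from this thread (output depends on your environment):
# print(describe_install("trl", "trl.commands"))
```

If the version printed is a release but the module you need only exists on main, that would match the situation described above.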

Would you like to give #1456 a try?

python examples/scripts/dpo.py \
    --dataset_name=trl-internal-testing/hh-rlhf-trl-style \
    --model_name_or_path=gpt2 \
    --per_device_train_batch_size 4 \
    --max_steps 1000 \
    --learning_rate 1e-3 \
    --gradient_accumulation_steps 1 \
    --logging_steps 10 \
    --eval_steps 500 \
    --output_dir="dpo_anthropic_hh" \
    --warmup_steps 150 \
    --report_to wandb \
    --bf16 \
    --logging_first_step \
    --no_remove_unused_columns

vwxyzjn, Mar 20 '24 14:03

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

github-actions[bot], Apr 18 '24 15:04

With trl v0.13.0:

from trl.commands.cli_utils import TrlParser

Error:

ModuleNotFoundError: No module named 'trl.commands'

steveepreston, Jan 04 '25 08:01

Solution:

from trl.scripts.utils import TrlParser

steveepreston, Jan 04 '25 10:01
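
If a script has to run against both layouts mentioned in this thread, one option is to try the import locations in order. This is a rough sketch, not something TRL provides: import_first_available and TRL_PARSER_LOCATIONS are hypothetical names, and the two module paths are simply the ones quoted above.

```python
import importlib

# Module paths quoted in this thread, newer location first.
TRL_PARSER_LOCATIONS = ("trl.scripts.utils", "trl.commands.cli_utils")


def import_first_available(name, module_paths):
    """Return attribute `name` from the first module in `module_paths` that has it.

    Hypothetical helper for this thread, not a TRL API.
    """
    errors = []
    for path in module_paths:
        try:
            module = importlib.import_module(path)
            return getattr(module, name)
        except (ImportError, AttributeError) as exc:
            # Remember why this location failed, then try the next one.
            errors.append(f"{path}: {exc}")
    raise ImportError(f"could not import {name!r}; tried: " + "; ".join(errors))


# Example call (requires trl to be installed):
# TrlParser = import_first_available("TrlParser", TRL_PARSER_LOCATIONS)
```

Pinning a single TRL version and using the matching import is simpler if you control the environment; the fallback only helps when you do not.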