
Add basic PEFT support to train script + record module

Open githubnemo opened this issue 5 months ago • 4 comments

Hello from PEFT :wave:

What this does

This change introduces basic support for PEFT adapter methods that have a rank parameter (LoRA, BONE, LoHa, LoKr, etc.). If someone has a smarter idea for handling methods that don't have a rank parameter, such as (B)OFT or IA3, I'd be happy to implement that instead, but I think this is a good first pass in terms of the number of methods it covers.

Another benefit is that remixing models becomes much easier, since you only download/upload small adapters (~10MB) instead of whole models (~1GB), which is good for iteration, I suppose.
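To give a rough feel for why adapters are so much smaller than full checkpoints, here is a minimal sketch of the parameter arithmetic for LoRA specifically (other rank-based methods count parameters differently). The layer dimensions are hypothetical and chosen only for illustration; they are not taken from SmolVLA or ACT.

```python
def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters LoRA adds to one linear layer.

    LoRA learns two low-rank matrices: A with shape (r, d_in) and
    B with shape (d_out, r), so the total is r * (d_in + d_out).
    """
    return r * (d_in + d_out)


def full_param_count(d_in: int, d_out: int) -> int:
    """Weight parameters of the dense layer itself (bias ignored)."""
    return d_in * d_out


# Hypothetical layer size, for illustration only.
d_in, d_out, r = 1024, 1024, 32
adapter = lora_param_count(d_in, d_out, r)
full = full_param_count(d_in, d_out)
print(f"adapter: {adapter} params, full: {full} params, ratio: {adapter / full:.4f}")
```

For this made-up layer the adapter holds 65,536 parameters against 1,048,576 for the dense weight, so the ratio grows linearly with the rank r.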

How it was tested

My focus was on enabling easy training with PEFT, so I've only tested training runs on a small, custom dataset. I tested SmolVLA and ACT, and in both cases I was able to more than double the batch size and use 10x higher learning rates compared to full fine-tuning, achieving much faster convergence.

Because the tests were small, I'm sure the defaults are sub-optimal, but they can be updated once more thorough benchmarking has been done.

How to check out & try?

To train an adapter of a policy you can supply the --use_peft parameter (and possibly other, optional parameters):

$ python src/lerobot/scripts/train.py --policy.type=smolvla [...] --use_peft=true

Some PEFT parameters can be tuned (those that are most common among adapter methods):

$ python src/lerobot/scripts/train.py [...] --peft.r=32 --peft.target_modules='all-linear'

To use the model with the record module you supply the policy as usual but pass the --policy.use_peft=true flag as well:

$ python -m lerobot.record --policy.path=myrepo/mymodel --policy.use_peft=true

(Technically, for local folders you don't need to supply policy.use_peft, but currently we don't check for adapter_config.json in the remote repo, so PEFT adapters in remote repos are not auto-detected. Would we want this?)
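As a minimal sketch of what local auto-detection could look like, the check below simply tests whether a folder contains adapter_config.json, the file PEFT saves alongside adapter weights. The helper name `looks_like_peft_adapter` is hypothetical and is not claimed to be the function used in this PR.

```python
import tempfile
from pathlib import Path

# File name PEFT writes next to the adapter weights.
ADAPTER_CONFIG_NAME = "adapter_config.json"


def looks_like_peft_adapter(folder) -> bool:
    """Hypothetical helper: True if a local folder contains a PEFT adapter config."""
    return (Path(folder) / ADAPTER_CONFIG_NAME).is_file()


# Demonstrate on a throwaway directory: first empty, then with the config file.
with tempfile.TemporaryDirectory() as tmp:
    print(looks_like_peft_adapter(tmp))
    (Path(tmp) / ADAPTER_CONFIG_NAME).write_text("{}")
    print(looks_like_peft_adapter(tmp))
```

For remote repos, a similar check would presumably need a Hub lookup (for example something like `huggingface_hub.file_exists`), which costs an extra network request per load; whether that trade-off is wanted is the open question above.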

githubnemo avatar Jun 30 '25 13:06 githubnemo

I think I've addressed all review comments.

Note that I've removed the predefined targets for ACT since I don't think that it makes sense to provide defaults for a policy that is supposed to be trained from scratch. If the user has a pretrained ACT policy they can select the target modules via CLI anyway.

Regarding the CLI, I've added a new test file that supports testing CLI commands. If you think this is a good idea I would also add a test checking whether something like peft.target_modules=foo is applied correctly. If we don't want to proceed with these tests that's OK as well :)

For now I think we're still OK with targeting q_proj|v_proj by default instead of mlp.*. If we find that there are better defaults it is easy to change.
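To illustrate the difference between the two kinds of defaults, here is a small sketch of regex-based target selection. It assumes the pattern is matched against each module's full dotted name; the module names and the exact patterns are hypothetical and are not the defaults shipped in this PR.

```python
import re

# Hypothetical module names; real names depend on the policy architecture.
module_names = [
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.self_attn.k_proj",
    "model.layers.0.self_attn.v_proj",
    "model.layers.0.mlp.up_proj",
    "model.layers.0.mlp.down_proj",
]


def select_targets(pattern: str, names: list[str]) -> list[str]:
    """Return the names whose full dotted path matches the regex."""
    return [name for name in names if re.fullmatch(pattern, name)]


# Attention-projection style default vs. an MLP-wide alternative.
attention_targets = select_targets(r".*\.(q_proj|v_proj)", module_names)
mlp_targets = select_targets(r".*\.mlp\..*", module_names)
print(attention_targets)
print(mlp_targets)
```

Running a pattern against the real module names of a policy before training is a cheap way to confirm that a custom peft.target_modules value selects what was intended.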

githubnemo avatar Oct 16 '25 14:10 githubnemo

I'm interested in using LoRA to fine-tune, so I'm trying this out. The first thing I hit was that pip install -e . doesn't seem to install peft as a dependency. When I installed it manually, I got this error:

ai@autonomyai:/autonomyai/lerobot$ lerobot-train   --dataset.repo_id=test/v3   --policy.type=smolvla   --output_dir=outputs/train/smolvla_so101_test   --job_name=smolvla_so101_test   --policy.device=cuda   --wandb.enable=true   --policy.repo_id=test/policy_box_smol --policy.use_peft=True  --peft.method=lora --peft.r=32 --peft.target_module='all-linear'
INFO 2025-10-18 18:57:49 ot_train.py:228 {'batch_size': 8,
 'dataset': {'episodes': None,
             'image_transforms': {'enable': False,
                                  'max_num_transforms': 3,
                                  'random_order': False,
                                  'tfs': {'brightness': {'kwargs': {'brightness': [0.8,
                                                                                   1.2]},
                                                         'type': 'ColorJitter',
                                                         'weight': 1.0},
                                          'contrast': {'kwargs': {'contrast': [0.8,
                                                                               1.2]},
                                                       'type': 'ColorJitter',
                                                       'weight': 1.0},
                                          'hue': {'kwargs': {'hue': [-0.05,
                                                                     0.05]},
                                                  'type': 'ColorJitter',
                                                  'weight': 1.0},
                                          'saturation': {'kwargs': {'saturation': [0.5,
                                                                                   1.5]},
                                                         'type': 'ColorJitter',
                                                         'weight': 1.0},
                                          'sharpness': {'kwargs': {'sharpness': [0.5,
                                                                                 1.5]},
                                                        'type': 'SharpnessJitter',
                                                        'weight': 1.0}}},
             'repo_id': 'test/v3',
             'revision': None,
             'root': None,
             'streaming': False,
             'use_imagenet_stats': True,
             'video_backend': 'torchcodec'},
 'env': None,
 'eval': {'batch_size': 50, 'n_episodes': 50, 'use_async_envs': False},
 'eval_freq': 20000,
 'job_name': 'smolvla_so101_test',
 'log_freq': 200,
 'num_workers': 4,
 'optimizer': {'betas': [0.9, 0.95],
               'eps': 1e-08,
               'grad_clip_norm': 10,
               'lr': 0.0001,
               'type': 'adamw',
               'weight_decay': 1e-10},
 'output_dir': 'outputs/train/smolvla_so101_test',
 'peft': {'full_training_modules': None,
          'init_type': None,
          'method_type': 'lora',
          'r': 32,
          'target_modules': 'all-linear'},
 'policy': {'adapt_to_pi_aloha': False,
            'add_image_special_tokens': False,
            'attention_mode': 'cross_attn',
            'chunk_size': 50,
            'device': 'cuda',
            'empty_cameras': 0,
            'expert_width_multiplier': 0.75,
            'freeze_vision_encoder': True,
            'input_features': {},
            'license': None,
            'load_vlm_weights': False,
            'max_action_dim': 32,
            'max_period': 4.0,
            'max_state_dim': 32,
            'min_period': 0.004,
            'n_action_steps': 50,
            'n_obs_steps': 1,
            'normalization_mapping': {'ACTION': <NormalizationMode.MEAN_STD: 'MEAN_STD'>,
                                      'STATE': <NormalizationMode.MEAN_STD: 'MEAN_STD'>,
                                      'VISUAL': <NormalizationMode.IDENTITY: 'IDENTITY'>},
            'num_expert_layers': -1,
            'num_steps': 10,
            'num_vlm_layers': 16,
            'optimizer_betas': [0.9, 0.95],
            'optimizer_eps': 1e-08,
            'optimizer_grad_clip_norm': 10,
            'optimizer_lr': 0.0001,
            'optimizer_weight_decay': 1e-10,
            'output_features': {},
            'pad_language_to': 'longest',
            'prefix_length': -1,
            'pretrained_path': None,
            'private': None,
            'push_to_hub': True,
            'repo_id': 'test/policy_box_smol',
            'resize_imgs_with_padding': [512, 512],
            'scheduler_decay_lr': 2.5e-06,
            'scheduler_decay_steps': 30000,
            'scheduler_warmup_steps': 1000,
            'self_attn_every_n_layers': 2,
            'tags': None,
            'tokenizer_max_length': 48,
            'train_expert_only': True,
            'train_state_proj': True,
            'type': 'smolvla',
            'use_amp': False,
            'use_cache': True,
            'use_delta_joint_actions_aloha': False,
            'use_peft': True,
            'vlm_model_name': 'HuggingFaceTB/SmolVLM2-500M-Video-Instruct'},
 'resume': False,
 'save_checkpoint': True,
 'save_freq': 20000,
 'scheduler': {'decay_lr': 2.5e-06,
               'num_decay_steps': 30000,
               'num_warmup_steps': 1000,
               'peak_lr': 0.0001,
               'type': 'cosine_decay_with_warmup'},
 'seed': 1000,
 'steps': 100000,
 'use_policy_training_preset': True,
 'wandb': {'disable_artifact': False,
           'enable': True,
           'entity': None,
           'mode': None,
           'notes': None,
           'project': 'lerobot',
           'run_id': None}}
/home/ai/.local/lib/python3.10/site-packages/pydantic/_internal/_generate_schema.py:2249: UnsupportedFieldAttributeWarning: The 'repr' attribute with value False was provided to the `Field()` function, which has no effect in the context it was used. 'repr' is field-specific metadata, and can only be attached to a model field using `Annotated` metadata or by assignment. This may have happened because an `Annotated` type alias using the `type` statement was used, or if the `Field()` function was attached to a single member of a union type.
  warnings.warn(
/home/ai/.local/lib/python3.10/site-packages/pydantic/_internal/_generate_schema.py:2249: UnsupportedFieldAttributeWarning: The 'frozen' attribute with value True was provided to the `Field()` function, which has no effect in the context it was used. 'frozen' is field-specific metadata, and can only be attached to a model field using `Annotated` metadata or by assignment. This may have happened because an `Annotated` type alias using the `type` statement was used, or if the `Field()` function was attached to a single member of a union type.
  warnings.warn(
Logs will be synced with wandb.
INFO 2025-10-18 18:57:50 db_utils.py:103 Track this run --> https://wandb.ai/jnmacdnld/lerobot/runs/wxxwvoje
INFO 2025-10-18 18:57:50 ot_train.py:244 Creating dataset
INFO 2025-10-18 18:57:51 ot_train.py:255 Creating policy
config.json: 3.77kB [00:00, 474kB/s]
processor_config.json: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 67.0/67.0 [00:00<00:00, 608kB/s]
chat_template.json: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 430/430 [00:00<00:00, 3.59MB/s]
preprocessor_config.json: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 599/599 [00:00<00:00, 5.95MB/s]
tokenizer_config.json: 28.6kB [00:00, 79.3MB/s]
vocab.json: 801kB [00:00, 7.10MB/s]
merges.txt: 466kB [00:00, 3.68MB/s]
tokenizer.json: 3.55MB [00:00, 33.3MB/s]
added_tokens.json: 4.74kB [00:00, 12.9MB/s]
special_tokens_map.json: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 868/868 [00:00<00:00, 7.75MB/s]
You have video processor config saved in `preprocessor.json` file which is deprecated. Video processor configs should be saved in their own `video_preprocessor.json` file. You can rename the file or load and save the processor back which renames it automatically. Loading from `preprocessor.json` will be removed in v5.0.
Reducing the number of VLM layers to 16 ...
INFO 2025-10-18 18:58:10 ot_train.py:262 Using PEFT! Wrapping model.
Traceback (most recent call last):
  File "/home/ai/.local/bin/lerobot-train", line 7, in <module>
    sys.exit(main())
  File "/autonomyai/lerobot/src/lerobot/scripts/lerobot_train.py", line 455, in main
    train()
  File "/autonomyai/lerobot/src/lerobot/configs/parser.py", line 225, in wrapper_inner
    response = fn(cfg, *args, **kwargs)
  File "/autonomyai/lerobot/src/lerobot/scripts/lerobot_train.py", line 263, in train
    policy = wrap_policy_in_peft_model(cfg, policy)
  File "/autonomyai/lerobot/src/lerobot/scripts/lerobot_train.py", line 203, in wrap_policy_in_peft_model
    policy = get_peft_model(
  File "/home/ai/.local/lib/python3.10/site-packages/peft/mapping_func.py", line 115, in get_peft_model
    return PeftModel(
  File "/home/ai/.local/lib/python3.10/site-packages/peft/peft_model.py", line 130, in __init__
    self.base_model = cls(model, {adapter_name: peft_config}, adapter_name)
  File "/home/ai/.local/lib/python3.10/site-packages/peft/tuners/tuners_utils.py", line 209, in __init__
    self.inject_adapter(self.model, adapter_name, low_cpu_mem_usage=low_cpu_mem_usage, state_dict=state_dict)
  File "/home/ai/.local/lib/python3.10/site-packages/peft/tuners/tuners_utils.py", line 691, in inject_adapter
    tied_target_modules = self._get_tied_target_modules(model=model)
  File "/home/ai/.local/lib/python3.10/site-packages/peft/tuners/tuners_utils.py", line 868, in _get_tied_target_modules
    if model_config.get("tie_word_embeddings"):
AttributeError: 'SmolVLAConfig' object has no attribute 'get'

it looks like peft is trying to use the config object as a dict, but it is not a dict. is there a specific version of peft that needs to be installed?
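The failure mode in the traceback can be reproduced in isolation. The sketch below uses a hypothetical dataclass, `ToyPolicyConfig`, as a stand-in for an attribute-style config, and a hypothetical helper, `read_option`, that tolerates both dicts and plain objects. It illustrates the mismatch only and is not necessarily how PEFT handles it.

```python
from dataclasses import dataclass


@dataclass
class ToyPolicyConfig:
    """Hypothetical stand-in for a dataclass-style policy config."""

    tie_word_embeddings: bool = False


def read_option(config, key, default=None):
    """Read a key from either a dict-like or an attribute-style config."""
    if isinstance(config, dict):
        return config.get(key, default)
    return getattr(config, key, default)


cfg = ToyPolicyConfig()
try:
    cfg.get("tie_word_embeddings")  # dict-style access on a non-dict object
except AttributeError as err:
    print(err)  # same kind of error as in the traceback

# The tolerant accessor works for both config styles.
print(read_option(cfg, "tie_word_embeddings"))
print(read_option({"tie_word_embeddings": True}, "tie_word_embeddings"))
```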

jnmacdnld avatar Oct 18 '25 19:10 jnmacdnld

> it looks like peft is trying to use the config object as a dict, but it is not a dict. is there a specific version of peft that needs to be installed?

@githubnemo Does this require the latest PEFT installed from source?

BenjaminBossan avatar Oct 23 '25 10:10 BenjaminBossan

> I'm interested in using lora to fine-tune, so I'm trying this out. the first thing I hit was that peft doesn't seem to be installed as a dependency when I install via pip install -e .. when I installed it manually, I get an error:
>
> ...
>
> it looks like peft is trying to use the config object as a dict, but it is not a dict. is there a specific version of peft that needs to be installed?

Yeah, this change depends on https://github.com/huggingface/peft/pull/2778, which is not yet released, so you need to install the latest PEFT version from source. See here: https://huggingface.co/docs/peft/en/install#source
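Since peft is not pulled in automatically and the required change is unreleased, it can help to check what is actually installed before training. The sketch below only reports the installed version, because the first release that will contain the fix is not stated here; the helper name `installed_version` is hypothetical.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("peft")
if version is None:
    print("peft is not installed")
else:
    print(f"peft {version} is installed")
```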

githubnemo avatar Oct 23 '25 16:10 githubnemo