Add basic PEFT support to train script + record module
Hello from PEFT :wave:
What this does
This change introduces basic support for PEFT adapter methods (i.e., every method that has a rank parameter: LoRA, BONE, LoHa, LoKr, etc.). If someone has a smarter idea for handling the methods that don't have a rank parameter (such as (B)OFT or IA3), I'd be happy to implement that instead, but I think this is a good first pass in terms of the number of supported methods.
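To make the "rank parameter" point concrete, here's a minimal sketch (illustrative only, not the PR's actual code) of why these methods are easy to handle uniformly: the corresponding PEFT config classes all expose an `r` field, so a single CLI option can drive any of them.

```python
from peft import BoneConfig, LoHaConfig, LoKrConfig, LoraConfig

# Illustrative mapping, not the PR's actual code.
METHOD_CONFIGS = {
    "lora": LoraConfig,
    "bone": BoneConfig,
    "loha": LoHaConfig,
    "lokr": LoKrConfig,
}

def make_peft_config(method: str, r: int, target_modules):
    # (B)OFT and IA3 would not fit this shape since they have no rank.
    return METHOD_CONFIGS[method](r=r, target_modules=target_modules)
```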
Another benefit is that remixing models becomes much easier, since you only download/upload small adapters (~10MB) instead of whole models (~1GB), which is good for iteration.
How it was tested
My focus was on enabling easy training with PEFT, so I've only tested training runs on a small, custom dataset. I tested SmolVLA and ACT, and in both cases I was able to more than double the batch size and use a 10x higher learning rate compared to full fine-tuning, achieving much faster convergence.
Given the small scale of these tests, I'm sure the defaults are sub-optimal, but they can be updated once more thorough benchmarking has been done.
How to checkout & try?
To train an adapter for a policy you can supply the --use_peft parameter (and, optionally, other parameters):
$ python src/lerobot/scripts/train.py --policy.type=smolvla [...] --use_peft=true
Some PEFT parameters can be tuned (those that are most common among adapter methods):
$ python src/lerobot/scripts/train.py [...] --peft.r=32 --peft.target_modules='all-linear'
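Under the hood this boils down to wrapping the policy with PEFT's get_peft_model; a simplified sketch of the idea (not the exact code from the script):

```python
from peft import LoraConfig, get_peft_model

def wrap_policy_in_peft_model(cfg, policy):
    # Simplified: the real helper also handles method_type, init_type, etc.
    peft_config = LoraConfig(
        r=cfg.peft.r,                            # --peft.r, e.g. 32
        target_modules=cfg.peft.target_modules,  # e.g. 'all-linear'
    )
    # get_peft_model freezes the base weights and injects trainable adapters.
    return get_peft_model(policy, peft_config)
```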
To use the model with the record module, supply the policy as usual but pass the --policy.use_peft=true flag as well:
$ python -m lerobot.record --policy.path=myrepo/mymodel --policy.use_peft=true
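Roughly, the flag makes loading attach the saved adapter to the base policy, along these lines (load_base_policy is a hypothetical stand-in, not lerobot's actual API):

```python
from peft import PeftModel

base_policy = load_base_policy("myrepo/mymodel")  # hypothetical helper
policy = PeftModel.from_pretrained(base_policy, "myrepo/mymodel")
policy.eval()
```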
(Technically, for local folders you don't need to supply --policy.use_peft, but we currently don't check for adapter_config.json in the remote repo, so auto-detection of a PEFT adapter doesn't happen for remote repos. Would we want this?)
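If we did want that, one option (just an idea, not implemented) would be to check the Hub for the file PEFT writes next to its adapter weights:

```python
from huggingface_hub import file_exists

def is_peft_adapter(repo_id: str) -> bool:
    # PEFT saves adapter_config.json alongside the adapter weights.
    return file_exists(repo_id, "adapter_config.json")
```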
I think I've addressed all review comments.
Note that I've removed the predefined targets for ACT since I don't think it makes sense to provide defaults for a policy that is supposed to be trained from scratch. If the user has a pretrained ACT policy, they can select the target modules via the CLI anyway.
Regarding the CLI, I've added a new test file that supports testing CLI commands. If you think this is a good idea, I would also add a test checking whether something like peft.target_modules=foo is applied correctly. If we don't want to proceed with these tests, that's OK as well :)
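For illustration, the kind of test I have in mind would launch the script and check that the override shows up in the config dump logged at startup. The flags here are placeholders, and a real test would need a tiny fixture dataset and an early exit:

```python
import subprocess
import sys

def test_peft_target_modules_is_applied(tmp_path):
    proc = subprocess.run(
        [sys.executable, "src/lerobot/scripts/train.py",
         "--policy.type=smolvla", "--use_peft=true",
         "--peft.target_modules=foo", f"--output_dir={tmp_path}"],
        capture_output=True, text=True, timeout=120,
    )
    # The train script prints its resolved config before training starts.
    assert "'target_modules': 'foo'" in proc.stdout + proc.stderr
```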
For now I think we're still OK with targeting q_proj|v_proj by default instead of mlp.*. If we find better defaults, it's easy to change.
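For reference, the two choices expressed as PEFT configs (in list form PEFT matches entries against module-name suffixes; 'all-linear' is a special value targeting every linear layer):

```python
from peft import LoraConfig

qv_config = LoraConfig(target_modules=["q_proj", "v_proj"])  # current default
all_linear_config = LoraConfig(target_modules="all-linear")  # broader option
```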
I'm interested in using LoRA to fine-tune, so I'm trying this out. The first thing I hit was that peft doesn't seem to be installed as a dependency when I install via "pip install -e .". When I installed it manually, I got an error:
ai@autonomyai:/autonomyai/lerobot$ lerobot-train --dataset.repo_id=test/v3 --policy.type=smolvla --output_dir=outputs/train/smolvla_so101_test --job_name=smolvla_so101_test --policy.device=cuda --wandb.enable=true --policy.repo_id=test/policy_box_smol --policy.use_peft=True --peft.method=lora --peft.r=32 --peft.target_module='all-linear'
INFO 2025-10-18 18:57:49 ot_train.py:228 {'batch_size': 8,
'dataset': {'episodes': None,
'image_transforms': {'enable': False,
'max_num_transforms': 3,
'random_order': False,
'tfs': {'brightness': {'kwargs': {'brightness': [0.8,
1.2]},
'type': 'ColorJitter',
'weight': 1.0},
'contrast': {'kwargs': {'contrast': [0.8,
1.2]},
'type': 'ColorJitter',
'weight': 1.0},
'hue': {'kwargs': {'hue': [-0.05,
0.05]},
'type': 'ColorJitter',
'weight': 1.0},
'saturation': {'kwargs': {'saturation': [0.5,
1.5]},
'type': 'ColorJitter',
'weight': 1.0},
'sharpness': {'kwargs': {'sharpness': [0.5,
1.5]},
'type': 'SharpnessJitter',
'weight': 1.0}}},
'repo_id': 'test/v3',
'revision': None,
'root': None,
'streaming': False,
'use_imagenet_stats': True,
'video_backend': 'torchcodec'},
'env': None,
'eval': {'batch_size': 50, 'n_episodes': 50, 'use_async_envs': False},
'eval_freq': 20000,
'job_name': 'smolvla_so101_test',
'log_freq': 200,
'num_workers': 4,
'optimizer': {'betas': [0.9, 0.95],
'eps': 1e-08,
'grad_clip_norm': 10,
'lr': 0.0001,
'type': 'adamw',
'weight_decay': 1e-10},
'output_dir': 'outputs/train/smolvla_so101_test',
'peft': {'full_training_modules': None,
'init_type': None,
'method_type': 'lora',
'r': 32,
'target_modules': 'all-linear'},
'policy': {'adapt_to_pi_aloha': False,
'add_image_special_tokens': False,
'attention_mode': 'cross_attn',
'chunk_size': 50,
'device': 'cuda',
'empty_cameras': 0,
'expert_width_multiplier': 0.75,
'freeze_vision_encoder': True,
'input_features': {},
'license': None,
'load_vlm_weights': False,
'max_action_dim': 32,
'max_period': 4.0,
'max_state_dim': 32,
'min_period': 0.004,
'n_action_steps': 50,
'n_obs_steps': 1,
'normalization_mapping': {'ACTION': <NormalizationMode.MEAN_STD: 'MEAN_STD'>,
'STATE': <NormalizationMode.MEAN_STD: 'MEAN_STD'>,
'VISUAL': <NormalizationMode.IDENTITY: 'IDENTITY'>},
'num_expert_layers': -1,
'num_steps': 10,
'num_vlm_layers': 16,
'optimizer_betas': [0.9, 0.95],
'optimizer_eps': 1e-08,
'optimizer_grad_clip_norm': 10,
'optimizer_lr': 0.0001,
'optimizer_weight_decay': 1e-10,
'output_features': {},
'pad_language_to': 'longest',
'prefix_length': -1,
'pretrained_path': None,
'private': None,
'push_to_hub': True,
'repo_id': 'test/policy_box_smol',
'resize_imgs_with_padding': [512, 512],
'scheduler_decay_lr': 2.5e-06,
'scheduler_decay_steps': 30000,
'scheduler_warmup_steps': 1000,
'self_attn_every_n_layers': 2,
'tags': None,
'tokenizer_max_length': 48,
'train_expert_only': True,
'train_state_proj': True,
'type': 'smolvla',
'use_amp': False,
'use_cache': True,
'use_delta_joint_actions_aloha': False,
'use_peft': True,
'vlm_model_name': 'HuggingFaceTB/SmolVLM2-500M-Video-Instruct'},
'resume': False,
'save_checkpoint': True,
'save_freq': 20000,
'scheduler': {'decay_lr': 2.5e-06,
'num_decay_steps': 30000,
'num_warmup_steps': 1000,
'peak_lr': 0.0001,
'type': 'cosine_decay_with_warmup'},
'seed': 1000,
'steps': 100000,
'use_policy_training_preset': True,
'wandb': {'disable_artifact': False,
'enable': True,
'entity': None,
'mode': None,
'notes': None,
'project': 'lerobot',
'run_id': None}}
/home/ai/.local/lib/python3.10/site-packages/pydantic/_internal/_generate_schema.py:2249: UnsupportedFieldAttributeWarning: The 'repr' attribute with value False was provided to the `Field()` function, which has no effect in the context it was used. 'repr' is field-specific metadata, and can only be attached to a model field using `Annotated` metadata or by assignment. This may have happened because an `Annotated` type alias using the `type` statement was used, or if the `Field()` function was attached to a single member of a union type.
warnings.warn(
/home/ai/.local/lib/python3.10/site-packages/pydantic/_internal/_generate_schema.py:2249: UnsupportedFieldAttributeWarning: The 'frozen' attribute with value True was provided to the `Field()` function, which has no effect in the context it was used. 'frozen' is field-specific metadata, and can only be attached to a model field using `Annotated` metadata or by assignment. This may have happened because an `Annotated` type alias using the `type` statement was used, or if the `Field()` function was attached to a single member of a union type.
warnings.warn(
Logs will be synced with wandb.
INFO 2025-10-18 18:57:50 db_utils.py:103 Track this run --> https://wandb.ai/jnmacdnld/lerobot/runs/wxxwvoje
INFO 2025-10-18 18:57:50 ot_train.py:244 Creating dataset
INFO 2025-10-18 18:57:51 ot_train.py:255 Creating policy
config.json: 3.77kB [00:00, 474kB/s]
processor_config.json: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 67.0/67.0 [00:00<00:00, 608kB/s]
chat_template.json: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 430/430 [00:00<00:00, 3.59MB/s]
preprocessor_config.json: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 599/599 [00:00<00:00, 5.95MB/s]
tokenizer_config.json: 28.6kB [00:00, 79.3MB/s]
vocab.json: 801kB [00:00, 7.10MB/s]
merges.txt: 466kB [00:00, 3.68MB/s]
tokenizer.json: 3.55MB [00:00, 33.3MB/s]
added_tokens.json: 4.74kB [00:00, 12.9MB/s]
special_tokens_map.json: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 868/868 [00:00<00:00, 7.75MB/s]
You have video processor config saved in `preprocessor.json` file which is deprecated. Video processor configs should be saved in their own `video_preprocessor.json` file. You can rename the file or load and save the processor back which renames it automatically. Loading from `preprocessor.json` will be removed in v5.0.
Reducing the number of VLM layers to 16 ...
INFO 2025-10-18 18:58:10 ot_train.py:262 Using PEFT! Wrapping model.
Traceback (most recent call last):
File "/home/ai/.local/bin/lerobot-train", line 7, in <module>
sys.exit(main())
File "/autonomyai/lerobot/src/lerobot/scripts/lerobot_train.py", line 455, in main
train()
File "/autonomyai/lerobot/src/lerobot/configs/parser.py", line 225, in wrapper_inner
response = fn(cfg, *args, **kwargs)
File "/autonomyai/lerobot/src/lerobot/scripts/lerobot_train.py", line 263, in train
policy = wrap_policy_in_peft_model(cfg, policy)
File "/autonomyai/lerobot/src/lerobot/scripts/lerobot_train.py", line 203, in wrap_policy_in_peft_model
policy = get_peft_model(
File "/home/ai/.local/lib/python3.10/site-packages/peft/mapping_func.py", line 115, in get_peft_model
return PeftModel(
File "/home/ai/.local/lib/python3.10/site-packages/peft/peft_model.py", line 130, in __init__
self.base_model = cls(model, {adapter_name: peft_config}, adapter_name)
File "/home/ai/.local/lib/python3.10/site-packages/peft/tuners/tuners_utils.py", line 209, in __init__
self.inject_adapter(self.model, adapter_name, low_cpu_mem_usage=low_cpu_mem_usage, state_dict=state_dict)
File "/home/ai/.local/lib/python3.10/site-packages/peft/tuners/tuners_utils.py", line 691, in inject_adapter
tied_target_modules = self._get_tied_target_modules(model=model)
File "/home/ai/.local/lib/python3.10/site-packages/peft/tuners/tuners_utils.py", line 868, in _get_tied_target_modules
if model_config.get("tie_word_embeddings"):
AttributeError: 'SmolVLAConfig' object has no attribute 'get'
It looks like PEFT is trying to use the config object as a dict, but it is not a dict. Is there a specific version of peft that needs to be installed?
> It looks like PEFT is trying to use the config object as a dict, but it is not a dict. Is there a specific version of peft that needs to be installed?
@githubnemo Does this require the latest PEFT installed from source?
> I'm interested in using LoRA to fine-tune, so I'm trying this out. The first thing I hit was that peft doesn't seem to be installed as a dependency when I install via "pip install -e .". When I installed it manually, I got an error:
> ...
> It looks like PEFT is trying to use the config object as a dict, but it is not a dict. Is there a specific version of peft that needs to be installed?
Yeah, this change depends on https://github.com/huggingface/peft/pull/2778, which is not yet released, so you need to install the latest PEFT version from source. See here: https://huggingface.co/docs/peft/en/install#source
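For reference, that boils down to:
$ python -m pip install git+https://github.com/huggingface/peft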