transformers
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
### System Info

acc_cfg.yml:

```yaml
compute_environment: LOCAL_MACHINE
debug: false
distributed_type: FSDP
downcast_bf16: 'no'
enable_cpu_affinity: true
fsdp_config:
  fsdp_activation_checkpointing: true
  fsdp_auto_wrap_policy: TRANSFORMER_BASED_WRAP
  fsdp_backward_prefetch: NO_PREFETCH
  fsdp_cpu_ram_efficient_loading: true
  fsdp_forward_prefetch: true
  fsdp_offload_params: true
  fsdp_sharding_strategy: FULL_SHARD
  fsdp_state_dict_type: ...
```
I have finetuned a LLAMA-7b-chat-hf model and saved the adapter weights. After loading and merging the adapter weights into the old model, both the new model and the old model were giving me...
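For context on why a merged model can behave exactly like the base model: a merged LoRA adapter only shifts the base weights by W' = W + (alpha/r) · B · A, and B is zero-initialized, so an adapter whose update stayed at zero merges into an unchanged model. A pure-Python sketch of the merge rule (hypothetical shapes and values, not the PEFT implementation):

```python
# Sketch of the LoRA merge rule W' = W + (alpha / r) * (B @ A), in pure Python.
def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]

r, alpha = 2, 4
W = [[1.0, 2.0], [3.0, 4.0]]   # frozen base weight (2x2)
A = [[0.5, -0.5], [1.0, 0.0]]  # LoRA down-projection (r x d_in)
B = [[0.0, 0.0], [0.0, 0.0]]   # LoRA up-projection, zero-initialized

delta = matmul(B, A)
W_merged = [[w + (alpha / r) * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

# With B still zero, the merged weight is numerically identical to the base
# weight -- one reason "old" and "new" models can produce identical outputs.
print(W_merged == W)  # True
```

If the merged and base models agree, it is worth confirming that the adapter weights actually changed during training before suspecting the merge itself.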
# What does this PR do? This PR adds ProPainter, a Video Inpainting model with 5.4k stars and 635 forks ([repo](https://github.com/sczhou/ProPainter)). It fixes #26360 and resolves stale PR #26391 for...
# What does this PR do? Add GGUF support for Mamba ## Before submitting - [ ] This PR fixes a typo or improves the docs (you can dismiss the...
# What does this PR do? Add [ColPali](https://doi.org/10.48550/arXiv.2407.01449) support in 🤗 `transformers`. ## Who can review? @yonigozlan 😉 ## Additional details - This PR uses the new [Modular 🤗 transformers](https://huggingface.co/docs/transformers/main/en/modular_transformers#modular-transformers)...
# What does this PR do? https://github.com/huggingface/transformers/issues/32308 As stated in that issue, this PR makes SAM2 compatible with transformers. cc @zinccat @RUFFY-369 Fixes # (issue) ## Before submitting -...
# What does this PR do? In [#34198](https://github.com/huggingface/transformers/commit/6ba31a8a94bf7cfeaf59ffc3bc9e0b0cd3e25788#diff-ed55888e6665791fe92cc8fc0c499da54f4ace6738551cd9a2591881cda076deR3629), the line `loss *= self.args.gradient_accumulation_steps` was introduced due to `Negate accelerate grad accum div`. This change was made to correct errors encountered...
### System Info 8xH100 ### Who can help? _No response_ ### Information - [ ] The official example scripts - [ ] My own modified scripts ### Tasks - [...
### Feature request The `.generate()` function has many parameters, for example `length_penalty` and `diversity_penalty`. However, the [documentation](https://huggingface.co/docs/transformers/v4.45.1/en/main_classes/text_generation#transformers.GenerationMixin.generate) of this function does not list all of them,...
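Until the reference page enumerates everything, `GenerationConfig` already collects the full set of generation knobs; constructing one is a quick way to check parameter names without loading a model. A minimal sketch (the values are illustrative, not recommendations):

```python
from transformers import GenerationConfig

# GenerationConfig gathers the parameters that `.generate()` accepts, so its
# docstring is one place the full list is currently documented.
cfg = GenerationConfig(
    max_new_tokens=64,
    num_beams=4,
    num_beam_groups=2,      # group beam search
    diversity_penalty=0.5,  # only meaningful when num_beam_groups > 1
    length_penalty=0.8,     # < 1.0 favors shorter beam-search outputs
)
print(cfg.length_penalty)  # 0.8
```

A config built this way can be passed to `.generate(generation_config=cfg)`, which keeps the many parameters out of the call site.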
# What does this PR do? PR https://github.com/huggingface/transformers/pull/33514 added group_by_length support for evaluation, but that code path uses self.eval_dataset instead of the eval_dataset argument, so it fails when eval_dataset is a dictionary. ##...
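The failure mode is a generic one: a method ignores its own argument in favor of the attribute set at construction time. A minimal sketch of the pattern (hypothetical class, not the Trainer code):

```python
# Hypothetical illustration of the bug pattern described above.
class Evaluator:
    def __init__(self, eval_dataset):
        # May be a dict mapping split names to datasets.
        self.eval_dataset = eval_dataset

    def lengths_buggy(self, eval_dataset):
        # Bug: reads self.eval_dataset, so the dict stored at __init__
        # leaks in even when a single split is passed here.
        return [len(x) for x in self.eval_dataset]

    def lengths_fixed(self, eval_dataset):
        # Fix: use the argument that was actually passed in.
        return [len(x) for x in eval_dataset]

ev = Evaluator({"val": [[1, 2], [3]]})
split = [[1, 2], [3]]
print(ev.lengths_fixed(split))  # [2, 1]
print(ev.lengths_buggy(split))  # iterates the dict's keys: [3] (len("val"))
```

The buggy variant silently measures the dict's keys rather than the samples, which is why the failure only surfaces when a dictionary of eval datasets is used.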