transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
# What does this PR do? When working on https://github.com/huggingface/transformers/pull/38635, I found that there are some models which have `past_key_values` in their signature, even though they cannot generate. The reason...
# What does this PR do? Replace typing.{List,Tuple,Dict} with {list,tuple,dict}. Because of the large number of changes, they are split into smaller PRs.
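A minimal sketch of the kind of annotation change this describes, assuming Python 3.9+ where the built-in containers can be subscripted directly (PEP 585). The function names `token_spans_old` and `token_spans_new` are hypothetical and are not taken from the PR; only the annotations differ between them.

```python
from typing import Dict, List, Tuple


# Before: generic aliases imported from `typing`.
def token_spans_old(tokens: List[str]) -> Dict[str, Tuple[int, int]]:
    spans = {}
    start = 0
    for token in tokens:
        spans[token] = (start, start + len(token))
        start += len(token) + 1  # +1 for the separating space
    return spans


# After: built-in generics, no `typing` import needed for these annotations.
def token_spans_new(tokens: list[str]) -> dict[str, tuple[int, int]]:
    spans = {}
    start = 0
    for token in tokens:
        spans[token] = (start, start + len(token))
        start += len(token) + 1  # +1 for the separating space
    return spans
```

Runtime behavior is identical in both versions, which is why a change like this can be split across several PRs without affecting behavior.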
Hi @ArthurZucker and the HF team, I've been exploring the transformers library implementation of GPT-2 and noticed something interesting regarding the embedding dropout mechanism. I wanted to share an observation...
# What does this PR do? Add a profiler to the trainer. Related issue: https://github.com/huggingface/transformers/issues/36360#issuecomment-2844195069
### System Info transformers version: 4.52.4 pytorch version: 2.6 ### Who can help? transformers version: 4.52.4 pytorch version: 2.6 When running Llama4 with tensor parallel, [torch.nn.Unfold used in llama4](https://github.com/huggingface/transformers/blob/v4.52.4/src/transformers/models/llama4/modeling_llama4.py#L1320)...
### System Info File "/anaconda/envs/openrlhf/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 767, in __getitem__ model_type = self._reverse_config_mapping[key.__name__] KeyError: 'Qwen2RMConfig' transformers version is 4.51.3 ### Who can help? _No response_ ### Information - [ ] The...
## Summary This PR adds support for the Arcee model architecture, laying the groundwork for the upcoming Arcee Foundation Model (AFM) release. Arcee is a decoder-only transformer model based on...
# What does this PR do? This PR fixes a bug in beam search generation where the early stopping heuristics (when `early_stopping=False`) were incorrectly applied across the entire batch, instead of...
# What does this PR do? `datasets` introduced `trust_remote_code` at some point (probably 2024/09), but RAG's modeling code doesn't handle it, and we get ``` ValueError: The repository for wiki_dpr...
Does it support Conv2d? Here's my complete script: ``` import os import argparse import torch from torch.utils.data import Dataset, DataLoader from torchvision import transforms from PIL import Image from transformers...