Add missing type hints
This issue is part of our Great Code Cleanup 2022. If you're interested in helping out, take a look at this thread, or come join us on Discord and talk with other contributors!
🚀 Add missing type hints
Type hints are used inconsistently in the `transformers` repo across both TF and PT models, and it'd be nice to make them complete and consistent for the core models, especially because we want to develop features that depend on them!
Guide to contributing:
- Ensure you've read our contributing guidelines 📜
- Claim your architecture(s) in this thread (ensure no one is working on it). It's 100% okay to only take the TensorFlow or PyTorch version of a model, if you're not familiar with both frameworks! It's also okay to claim multiple models and group those changes into a single PR! 🎯
- Implement the changes as in https://github.com/huggingface/transformers/pull/16057 or https://github.com/huggingface/transformers/pull/16074 (see the diff on the model architectures for a few examples) 💪
- Open the PR and tag me in it. You should run `make fixup` at the end to do a code quality check before your final commit!
Tips for making your PR
- The files you need to edit will be in `src/transformers/models/[model_name]/`. For TensorFlow, you want the `modeling_tf_[model_name].py` file. For PyTorch, you want the `modeling_[model_name].py` file.
- Remember, you do not have to cover every class in that file! The main thing we want to cover is the `call` (for TF) or `forward` (for PT) method for user-facing classes like `TFRobertaForMaskedLM` or `RobertaForSequenceClassification`. It's not necessary to add type hints to layers or base classes like `RobertaModel` or `TFRobertaPreTrainedModel` - these are trickier to write, and generally people do not use those classes as standalone models.
- If you're unfamiliar with how type hints work, you can read the Python library documentation on them, but it's probably even easier to just look at another PR that added them. Take a look at the list of changes in the pull requests linked above!
- The types will usually be obvious - most inputs are `Optional[Union[np.ndarray, tf.Tensor]]` for TF models and `Optional[torch.Tensor]` for PyTorch models, and boolean inputs are `Optional[bool]` (see the sketch after this list). Pay attention to the first input of TF models, though, which is usually `TFModelInputType` - this is because Keras handles that first input in a special way! Other inputs to pay attention to are `past_key_values`, which can vary between models, and also the model output type. For the base model classes like `RobertaModel`, you may have to look at the corresponding `MainLayer` to figure out the right output type! Also, note that the output type may be a tuple if `return_dict` is False, in which case you should specify `Union[Tuple, ...]`. Finally, note that in TF models, `training` is never `None`, so it should be `training: bool` and not `training: Optional[bool]`.
- Note that some code is copied across our codebase. If you see a line like `# Copied from transformers.models.bert...`, it means the code is copied from that source, and our scripts will automatically keep the copies in sync. If you see that comment, you should not edit the copied method! Instead, edit the original method it's copied from, and run `make fixup` to propagate your change to all the copies. Be sure you've installed the development dependencies with `pip install -e ".[dev]"`, as described in the contributor guidelines above, so that the code quality tools in `make fixup` can run.
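For concreteness, here's a minimal sketch of the signature style described above. It uses a hypothetical toy module, not a real `transformers` class - the real models return model-specific output types such as `SequenceClassifierOutput` rather than a bare tensor - but the annotation pattern is the same:

```python
from typing import Optional, Tuple, Union

import torch
from torch import nn


class ToyForSequenceClassification(nn.Module):
    """Hypothetical module - only the annotation style mirrors the real models."""

    def __init__(self, hidden_size: int = 16, num_labels: int = 2) -> None:
        super().__init__()
        self.classifier = nn.Linear(hidden_size, num_labels)

    def forward(
        self,
        inputs_embeds: Optional[torch.Tensor] = None,  # tensor inputs: Optional[torch.Tensor]
        attention_mask: Optional[torch.Tensor] = None,  # unused here; included to show the style
        output_attentions: Optional[bool] = None,  # boolean flags: Optional[bool]
        return_dict: Optional[bool] = None,
    ) -> Union[Tuple, torch.Tensor]:  # may return a tuple when return_dict is False
        logits = self.classifier(inputs_embeds)
        if return_dict is False:
            return (logits,)
        return logits


model = ToyForSequenceClassification()
print(model(inputs_embeds=torch.randn(1, 16)).shape)  # torch.Size([1, 2])
```

The equivalent TF `call` method would annotate its tensor inputs as `Optional[Union[np.ndarray, tf.Tensor]]` (or `TFModelInputType` for the first input) and `training` as a plain `bool`, as noted above.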
How can I find models that need type hints?
I used to maintain a list here, but it got out of date - I'm sorry! Instead, you can use this Colab notebook. If you run it, it will show you models in PyTorch or TF that are still missing type hints. Unlike my manually curated lists, it's guaranteed to be up to date - but do double-check that someone else in the thread hasn't claimed a model before you start, because the Colab code only registers type hints after the PR containing them is merged!
I would love to work on PyTorch Albert 🚀
Hi, I would like to work on PyTorch ImageGPT
Hi, I would like to work on CamemBERT for PT & TF.
I will take a look at LayoutLMv2 after the first one :smiley:
Edit: Because CamemBERT depends on Roberta, I will take PyTorch Roberta :+1:
Hello! I'd like to take Hubert & Wav2Vec2 for PyTorch. Cheers!
I'll try PyTorch BERT to start!
@johnryan465 I just did it as an example, I'm sorry! I'm marking off the completed models now.
@Rocketknight1 no worries, I'll try and do DistilBERT instead
I'd like to work on GPT2 (TF).
@Rocketknight1 I'm switching to PyTorch Roberta, because CamemBERT depends on the Roberta modeling code
Awesome! Hey @Rocketknight1 – I'd like to work on Longformer for both PyTorch & TF!
I'd like to work on BigBird
I would like to work on CLIP for PyTorch.
Also, I will work on BEiT, DeiT and ViT (PyTorch)
I can work on ImageGPT.
I can work on Swin (PyTorch)
I'd like to work on XLM (Tensorflow)
I'll take T5 (Tensorflow)!
I'd like to claim GPT-2 (PyTorch).
Hi @Rocketknight1, I would like to work on BART for both TF and PyTorch
ELECTRA TF - https://github.com/huggingface/transformers/pull/16104
ELECTRA PT - https://github.com/huggingface/transformers/pull/16103
DeBERTa PT - https://github.com/huggingface/transformers/pull/16105
XLMRobertaXL (PyTorch)
SegFormer (PyTorch)
I'll take OpenAIGPT!
> Hi @Rocketknight1, I would like to work on BART for both TF and PyTorch
Can you please confirm with an emoji whether I am eligible to take these or not? @Rocketknight1
I will work on XLM (PyTorch)
@robotjellyzone You can! Please note that we accepted a PR yesterday to add the TF decorator to BART, so make sure you're working on the most recent version of the library before you start your PR!
I'll take DistilBERT (TensorFlow)
Happy to take T5 (PyTorch)
@Rocketknight1 isn't the list missing ConvNext? If so, I'm happy to take care of that one too :ok_hand:
I'll work on GPTJ
> @robotjellyzone You can! Please note that we accepted a PR yesterday to add the TF decorator to BART, so make sure you're working on the most recent version of the library before you start your PR!
OK sure! I will keep this in mind 😊👍...