
[RFC] Prototype pretrained models in torchtext

Open · zhangguanheng66 opened this issue Jan 31 '21 · 1 comment

This is a prototype pretrained XLM-R model based on the RoBERTa encoder. There are a few features we would like to highlight and collect feedback on:

  • The basic nn modules (e.g. TransformerEncoderLayer, PositionalEmbedding) live in the modules folder. Explicit args are passed to the constructor of the module class - see here (a sketch of this convention follows the list).
  • A factory func xlmr_base is a simple interface that loads the pretrained model + preprocessing transform + args. The pretrained model (model.pt), preprocessing transform (sentencepiece.bpe.model and vocab.txt), and args (args.json) are saved as separate files and will be downloaded automatically (via a tar.gz file) if they are not found. We should always keep this factory func an easy interface for user adoption (see the usage sketch after this list).
  • The pretrained models are modularized. For example, in the sentence classification task, xlmr_base_sentence_classifier loads the pretrained transformer encoder via xlmr_base_model and a pretrained classification head via _load_sentence_classifier (sketched below). Similarly, the preprocessing transform pipeline is returned together with the pretrained model.
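
A minimal sketch of the explicit-args convention from the first bullet, written against plain torch.nn; the actual layer in torchtext/experimental/modules may differ in signature and internals:

```python
import torch
import torch.nn as nn

class TransformerEncoderLayer(nn.Module):
    """Every hyperparameter is an explicit constructor arg; no opaque config object."""

    def __init__(self, embed_dim, num_heads, ffn_dim, dropout=0.1):
        super().__init__()
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, dropout=dropout)
        self.ffn = nn.Sequential(
            nn.Linear(embed_dim, ffn_dim),
            nn.GELU(),
            nn.Linear(ffn_dim, embed_dim),
        )
        self.norm1 = nn.LayerNorm(embed_dim)
        self.norm2 = nn.LayerNorm(embed_dim)
        self.dropout = nn.Dropout(dropout)

    def forward(self, src):
        # Residual blocks: self-attention, then feed-forward.
        attn_out, _ = self.attn(src, src, src)
        src = self.norm1(src + self.dropout(attn_out))
        return self.norm2(src + self.dropout(self.ffn(src)))

# XLM-R base-sized hyperparameters, spelled out at the call site.
layer = TransformerEncoderLayer(embed_dim=768, num_heads=12, ffn_dim=3072)
x = torch.randn(16, 2, 768)  # (sequence, batch, embed) for nn.MultiheadAttention
y = layer(x)
```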
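For the factory func in the second bullet, a hedged usage sketch: xlmr_base is named in the RFC, but the exact signature, return order, and transform output type are assumptions here, not the final API.

```python
import torch
from torchtext.experimental.models import xlmr_base

# The factory returns the pretrained encoder together with its
# preprocessing transform; the model.pt / sentencepiece.bpe.model /
# vocab.txt / args.json bundle is downloaded (tar.gz) on first use.
model, transform = xlmr_base()

# Assumed transform behavior: sentencepiece tokenization followed by
# vocab lookup, yielding token ids.
ids = torch.tensor(transform("Hello world"), dtype=torch.long)
features = model(ids.unsqueeze(0))  # contextual token representations
```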
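And for the modular composition in the last bullet, a sketch under the same caveats; the call shape and the idea that the transform is returned alongside the classifier follow the RFC text, but the details are assumptions.

```python
import torch
from torchtext.experimental.models import xlmr_base_sentence_classifier

# Composes the pretrained encoder (via xlmr_base_model) with a
# pretrained classification head (via _load_sentence_classifier);
# the matching preprocessing transform is returned as well.
classifier, transform = xlmr_base_sentence_classifier()
ids = torch.tensor(transform("A great movie!"), dtype=torch.long)
logits = classifier(ids.unsqueeze(0))
```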

zhangguanheng66 · Jan 31 '21

Codecov Report

Merging #1136 (f07f80c) into master (4b2dfb0) will decrease coverage by 2.79%. The diff coverage is 38.09%.

[Impacted file tree graph]

@@            Coverage Diff             @@
##           master    #1136      +/-   ##
==========================================
- Coverage   79.63%   76.84%   -2.80%     
==========================================
  Files          47       52       +5     
  Lines        3201     3247      +46     
==========================================
- Hits         2549     2495      -54     
- Misses        652      752     +100     
| Impacted Files | Coverage Δ |
|---|---|
| torchtext/experimental/models/utils.py | 47.05% <25.00%> (-52.95%) ⬇️ |
| torchtext/experimental/modules/transformer.py | 28.00% <28.00%> (ø) |
| torchtext/experimental/models/xlmr_model.py | 35.63% <35.63%> (ø) |
| torchtext/experimental/modules/embedding.py | 39.13% <39.13%> (ø) |
| torchtext/experimental/models/xlmr_transform.py | 56.25% <56.25%> (ø) |
| torchtext/experimental/models/__init__.py | 100.00% <100.00%> (ø) |
| torchtext/experimental/modules/__init__.py | 100.00% <100.00%> (ø) |
| ...ext/experimental/datasets/raw/language_modeling.py | 80.43% <0.00%> (-12.75%) ⬇️ |
| ...t/experimental/datasets/raw/text_classification.py | 83.63% <0.00%> (-4.21%) ⬇️ |
| torchtext/experimental/datasets/raw/translation.py | 91.57% <0.00%> (-0.58%) ⬇️ |
| ... and 19 more | |

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 791260f...e72ce94.

codecov[bot] · Feb 05 '21