Use consistent data_select default
Right now the default is sometimes "train", "test", "valid" and sometimes (more commonly) "train", "valid", "test". We should pick a single convention (this PR opts for the latter) to avoid confusion. This also affects one of the tests, which incorrectly assumed the latter was always the case.
NOTE: This also affects the BERT examples, which were built on the assumption of train, valid, test.
NOTE: This also shows that WikiText103 isn't covered by tests. It's very large, but we should find a way to test it using a subset of the data.
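To illustrate the convention, here is a minimal sketch of a single shared default. The names `DEFAULT_DATA_SELECT` and `select_splits` and the file paths are hypothetical, made up for this example; this is not torchtext's actual implementation. The point is that when every dataset reads its default from one place, callers can unpack the result in one order everywhere.

```python
from typing import Dict, Sequence, Tuple, Union

# Hypothetical shared constant: the order this PR opts for.
DEFAULT_DATA_SELECT: Tuple[str, ...] = ("train", "valid", "test")


def select_splits(
    available: Dict[str, str],
    data_select: Union[str, Sequence[str]] = DEFAULT_DATA_SELECT,
) -> Tuple[str, ...]:
    """Return the entries of `available` in the order given by `data_select`."""
    # Accept a single split name as well as a sequence of names.
    if isinstance(data_select, str):
        data_select = (data_select,)
    unknown = [name for name in data_select if name not in DEFAULT_DATA_SELECT]
    if unknown:
        raise ValueError(
            f"Unknown split(s) {unknown}; expected a subset of {DEFAULT_DATA_SELECT}"
        )
    # The output order follows data_select, not the dict's insertion order.
    return tuple(available[name] for name in data_select)


# Illustrative paths, deliberately stored in a different order than the default.
paths = {"train": "train.txt", "test": "test.txt", "valid": "valid.txt"}
train, valid, test = select_splits(paths)
```

Because the returned order follows `data_select`, a test that unpacks `train, valid, test` stays correct for every dataset that uses the shared default.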
Codecov Report
Merging #995 into master will decrease coverage by 0.45%. The diff coverage is 100.00%.
@@ Coverage Diff @@
## master #995 +/- ##
==========================================
- Coverage 78.27% 77.82% -0.46%
==========================================
Files 44 44
Lines 3126 3084 -42
==========================================
- Hits 2447 2400 -47
- Misses 679 684 +5
| Impacted Files | Coverage Δ | |
|---|---|---|
| ...rchtext/experimental/datasets/language_modeling.py | 81.96% <ø> (ø) | |
| torchtext/experimental/datasets/translation.py | 76.81% <ø> (ø) | |
| ...ext/experimental/datasets/raw/language_modeling.py | 80.00% <100.00%> (ø) | |
| torchtext/experimental/transforms.py | 85.52% <0.00%> (-10.17%) | :arrow_down: |
| ...htext/experimental/datasets/text_classification.py | 76.47% <0.00%> (+0.75%) | :arrow_up: |
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 7e267d2...2426c5c.