text issues

find_match utility function to unify a few lines of code

1

find_match searches a list of strings and returns the first entry that partially or fully contains the given match string.

cpuhrsch

cla signed
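The behaviour described for find_match above can be sketched roughly as follows. This is a minimal illustration and not the code from the PR: the argument order, the return value of `None` when nothing matches, and the sample file names are all assumptions.

```python
from typing import List, Optional


def find_match(match: str, lst: List[str]) -> Optional[str]:
    """Return the first entry of `lst` that contains `match` as a substring.

    Returning None when no entry matches is an assumption of this sketch.
    """
    for element in lst:
        if match in element:
            return element
    return None


# Hypothetical file names, used only to illustrate the lookup.
extracted_files = ["data/train.csv", "data/test.csv", "README.md"]
test_file = find_match("test", extracted_files)
missing = find_match("valid", extracted_files)
```

A helper like this replaces repeated loops of the form "iterate over the extracted files and pick the one whose name contains a keyword".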

Use _PATHS for IMDB

1

This PR changes the IMDB download to use the filename stored in _PATHS to detect whether the data has already been downloaded. This avoids unnecessary queries to Google Drive.

cpuhrsch

cla signed

[WIP] Build docs in parallel

1

This can reduce the docs build time.

cpuhrsch

cla signed

[RFC] Prototype pretrained models in torchtext

1

This is a prototype pretrained XLM-R model based on the RoBERTa encoder. There are a few features that we would like to highlight and collect feedback on: - The basic nn...

zhangguanheng66

cla signed

Add sentencepiece to BERT model

1

In the [XLM-R](https://arxiv.org/pdf/1911.02116.pdf) model, SentencePiece is used to tokenize the strings. We enable the sentencepiece processing pipeline here for the BERT workflow.

zhangguanheng66

cla signed

Add setitem func to torchtext.experimental.vocab.Vocab

1

Builds on https://github.com/pytorch/text/pull/1027. Adds a `__setitem__` func to torchtext.experimental.vocab.Vocab. A `__delitem__` func is added as well. [RuntimeError] If the token already exists, an error message is raised asking users...

zhangguanheng66

cla signed
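To make the `__setitem__` / `__delitem__` entry above concrete, here is a toy stand-in. It is not torchtext.experimental.vocab.Vocab: the class `ToyVocab`, its list-backed storage, and the exact error messages are assumptions for illustration. Only the rule that assigning an existing token raises a RuntimeError is taken from the description.

```python
class ToyVocab:
    """List-backed toy vocabulary; a token's index is its list position."""

    def __init__(self, tokens):
        self.itos = list(tokens)

    def __len__(self):
        return len(self.itos)

    def __contains__(self, token):
        return token in self.itos

    def __getitem__(self, token):
        return self.itos.index(token)

    def __setitem__(self, token, index):
        # Inserting a token that is already present is an error.
        if token in self.itos:
            raise RuntimeError(f"Token {token!r} already exists in the vocab")
        if not 0 <= index <= len(self.itos):
            raise RuntimeError(f"Index {index} is out of range")
        self.itos.insert(index, token)

    def __delitem__(self, token):
        if token not in self.itos:
            raise RuntimeError(f"Token {token!r} is not in the vocab")
        self.itos.remove(token)


vocab = ToyVocab(["hello", "world"])
vocab["<unk>"] = 0   # insert a new token at index 0
del vocab["world"]   # remove an existing token
```

In this sketch, inserting at an index shifts later tokens up by one, so indices stay contiguous.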

Update experimental vectors without unk tensor

1

Remove the unk tensor and allow users to add one if necessary.

zhangguanheng66

cla signed

Remove <unk> token and index from experimental Vocab

6

This PR removes the default `'<unk>'` token along with its index from `experimental.vocab`. Fixes https://github.com/pytorch/text/issues/1016. In the experimental vocabulary, there will be no special symbols or user reserved...

zhangguanheng66

cla signed

Switch back to base as the default directory

1

zhangguanheng66

cla signed

Start removing to_ivalue() and implementing the jit __prepare_scriptable__ interface.

2

Begin trying to use the new interface in https://github.com/pytorch/pytorch/pull/45645

dongreenberg

cla signed
