
Models, data loaders and abstractions for language processing, powered by PyTorch

Results 221 text issues

Edit raw.translation dataset to return a RawTextIterableDataset, which uses worker information to restrict the underlying iterator to a subset such that DataLoader won't return duplicate entries, if given an instance...

cla signed
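The idea of restricting each worker to its own subset can be sketched in plain Python. This is a minimal illustration, not the code from the PR: `shard_for_worker` is a hypothetical helper, and `worker_id` and `num_workers` are passed in by hand here. In PyTorch those two values would typically be read from `torch.utils.data.get_worker_info()`, which returns `None` in the main process and an object with `id` and `num_workers` attributes inside a worker.

```python
from itertools import islice


def shard_for_worker(lines, worker_id, num_workers):
    """Yield every num_workers-th item, starting at offset worker_id.

    Hypothetical helper for illustration: each worker sees a disjoint
    slice of the underlying iterator, so no entry is produced twice.
    """
    return islice(lines, worker_id, None, num_workers)


data = [f"line {i}" for i in range(7)]

# Simulate three workers, each iterating over its own copy of the data.
shards = [list(shard_for_worker(iter(data), w, 3)) for w in range(3)]
```

Each simulated worker gets a round-robin slice, and the slices together cover the data exactly once. Every worker still walks the full iterator and discards the items it does not own, which is simple but not free for large files.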

Update docstrings and clarify that the file objects supported by the functions are files opened in text mode. A follow-up task is to add the support of...

cla signed

This PR typedefs strings that are meant to be constant, i.e. read-only. They can then be optionally replaced by std::string_view. Also adds the ever so important "#pragma once" to the...

cla signed

Create a single union regular expression that uses a lambda to query a dictionary of patterns for the correct replacement. This yields a significant speedup; however, it is different from the...

cla signed
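A single-pass union substitution of this kind might look like the sketch below, which uses Python's `re` module. The rules in `_RULES` and the function name `normalize` are made up for illustration and are not the patterns used in the PR. Each pattern is wrapped in a named group so the lambda can use `match.lastgroup` to look up the replacement; this assumes the individual patterns contain no capturing groups of their own.

```python
import re

# Hypothetical (pattern, replacement) rules, for illustration only.
_RULES = [
    (r"<br\s*/?>", " "),  # HTML line breaks become a space
    (r"&amp;", "&"),      # unescape ampersands
    (r"\s{2,}", " "),     # collapse runs of whitespace
]

# Map each group name to its replacement string.
_TABLE = {f"r{i}": repl for i, (_, repl) in enumerate(_RULES)}

# One alternation of named groups: (?P<r0>...)|(?P<r1>...)|...
_UNION = re.compile(
    "|".join(f"(?P<r{i}>{pat})" for i, (pat, _) in enumerate(_RULES))
)


def normalize(text):
    """Apply all rules in one pass over the text."""
    return _UNION.sub(lambda m: _TABLE[m.lastgroup], text)


result = normalize("a<br />b &amp; c")
```

One general property of this approach is that the text is scanned once and replaced text is not rescanned, so the result can differ from applying the same rules one after another with separate `re.sub` calls.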

Right now the default is sometimes "train", "test", "valid" and sometimes (more commonly) "train", "valid", "test". We should pick a single convention (this PR opts for the latter) to...

cla signed

Language modeling datasets construct *all* datasets even if only a subset is requested. They also store the fully numericalized version of the dataset if it's stored as "a single line"...

cla signed

a) The documentation doesn't clearly state that one factory function is meant to be used to construct a Vocabulary from a dataset (e.g. AG_NEWS) and another is meant to be...

cla signed

Since torchtext is built against specific torch versions, those versions need to be specified in setup.py. Otherwise pip will install incompatible versions. Fixes #902