Guanheng George Zhang comments

Results: 42 comments of Guanheng George Zhang

Helper function for Vocab object

You can use the `itos` attribute of the vocab object. Just FYI, this vocab will be retired and replaced with `torchtext.experimental.vocab`. cc @parmeet
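The lookup mentioned above can be sketched as follows. This is a minimal illustration only: `ToyVocab` and `ids_to_tokens` are hypothetical names invented here, not torchtext classes or helpers. Only the `itos` (index to string) and `stoi` (string to index) attribute names are taken from the legacy vocab object.

```python
class ToyVocab:
    """Hypothetical stand-in that exposes `itos` and `stoi` like the legacy vocab."""

    def __init__(self, tokens):
        # Keep the first occurrence of each token, preserving order.
        self.itos = list(dict.fromkeys(tokens))
        # Reverse mapping: token string -> integer index.
        self.stoi = {tok: idx for idx, tok in enumerate(self.itos)}


def ids_to_tokens(vocab, ids):
    """Map a sequence of integer ids back to token strings via `vocab.itos`."""
    return [vocab.itos[i] for i in ids]


vocab = ToyVocab(["<unk>", "<pad>", "hello", "world", "hello"])
tokens = ids_to_tokens(vocab, [2, 3])
```

With a real vocab object the helper would be used the same way, since it only relies on `itos` being indexable by integer id.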

Starting removing to_ivalue() and implementing jit __prepare_scriptable__ interface.

Ran a quick test against the two PRs:
```
import torch
import torchtext
from torchtext.experimental.transforms import PRETRAINED_SP_MODEL
from torchtext.experimental.transforms import sentencepiece_tokenizer
sp_model_path = torchtext.utils.download_from_url(PRETRAINED_SP_MODEL['text_unigram_25000'])
spm_tokenizer = sentencepiece_tokenizer(sp_model_path)
jit_spm_tokenizer = torch.jit.script(spm_tokenizer)
...
```

Starting removing to_ivalue() and implementing jit __prepare_scriptable__ interface.

It seems that calling `__prepare_scriptable__` inside the `torch.jit._script.script()` function works for the building blocks in torchtext (i.e., the call to `__prepare_scriptable__` on L897 of `torch/jit/_script.py`).
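The hook pattern described above can be illustrated in plain Python. This is a toy emulation under stated assumptions, not the code in `torch/jit/_script.py`: `prepare_then_compile`, `EagerTokenizer`, and `ScriptableTokenizer` are hypothetical names, and `compile_fn` stands in for the real compilation step. Only the `__prepare_scriptable__` method name comes from the comment.

```python
def prepare_then_compile(obj, compile_fn):
    """Toy dispatch: if the object defines the hook, compile what the hook returns."""
    if hasattr(obj, "__prepare_scriptable__"):
        obj = obj.__prepare_scriptable__()
    return compile_fn(obj)


class ScriptableTokenizer:
    """Hypothetical scriptable form: holds the vocabulary as an ordered tuple."""

    def __init__(self, vocab):
        self.vocab = tuple(vocab)


class EagerTokenizer:
    """Hypothetical eager form: holds the vocabulary in a Python set."""

    def __init__(self, vocab):
        self.vocab = set(vocab)

    def __prepare_scriptable__(self):
        # Convert to the representation the compile step is assumed to accept.
        return ScriptableTokenizer(sorted(self.vocab))


# The identity function stands in for the real compile step.
compiled = prepare_then_compile(EagerTokenizer({"b", "a"}), compile_fn=lambda o: o)
```

The point of the pattern is that callers pass the eager object directly and never have to convert it by hand, which is the role `to_ivalue()` played before.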

[DO NOT DELETE OR MERGE] Nightly build

@mthrok Is it OK for the CI tests to fail for this PR?

Experimental bucket by sequence length sampler

> @zhangguanheng66 I'm proposing a sampler class with similar functionality as the [BucketIterator](https://github.com/pytorch/text/blob/bcb9104680eb9dc978a6bbcc2b9ca46cf2bdbed9/torchtext/data/iterator.py#L241). Let me know what you think of this. Thanks!

Thanks @akurniawan. Since there have been several...

Experimental bucket by sequence length sampler

> By that you mean with a real dataset? Because I have already put a dummy dataset with DataLoader on the test file

Right. I missed that. Then, we are...

Experimental bucket by sequence length sampler

> @zhangguanheng66 are we still interested in continuing this PR?

Since we are going to remove `Field` class and associated legacy code in v0.8.0 release, I'm wondering if we can...

Experimental bucket by sequence length sampler

> @zhangguanheng66 Thank you for the idea. If you don't mind, could you help to elaborate more on the `groups the examples with similar lengths together`? I would like to...

Experimental bucket by sequence length sampler

> > They are very similar. One difference is that the current implementation in this PR requires a boundary or "bucket". What happens if users don't provide the buckets? > >...

How do I load data from a csv file

It's probably similar to the text classification datasets [here](https://github.com/pytorch/text/blob/master/torchtext/datasets/text_classification.py). @mttk Do you know if the current library supports loading a CSV file?
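Independently of what the library offers, a minimal sketch with Python's standard `csv` module might look like this. The column layout is an assumption made for illustration (integer label in the first column, text in the remaining columns), and `iter_labeled_rows` is a hypothetical helper, not a torchtext function.

```python
import csv
import io


def iter_labeled_rows(csv_file):
    """Yield (label, text) pairs from an open CSV file object.

    Assumes column 0 holds an integer label and the other columns hold text.
    """
    for row in csv.reader(csv_file):
        if not row:
            continue  # skip blank lines
        yield int(row[0]), " ".join(row[1:])


# In-memory sample so the sketch runs without a file on disk.
sample = io.StringIO('1,"Good movie","Loved it, twice"\n2,"Bad plot","Too long"\n')
examples = list(iter_labeled_rows(sample))
```

For a file on disk, the same helper can be fed `open(path, newline="", encoding="utf-8")`; the resulting pairs can then be tokenized and wrapped in whatever dataset class is in use.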