Anmol Joshi
Thanks for getting back to me. This is a fantastic library!
@zhangguanheng66 I added the 2011 dataset per your instructions, but I was unable to find the validation and test sets. What are your thoughts on this?
@zhangguanheng66 I noticed an issue while writing this PR: zip and tar files are handled differently. Assuming both .zip and .tar files are stored in .data, the output of filenames...
@zhangguanheng66 as a note, this PR downloads the files correctly and sets up the dataset just fine. However, it takes a very long time to create the dataset given...
@zhangguanheng66 this PR should be good to go. Let me know if you have any comments!
@zhangguanheng66 any thoughts on overloading the __iter__ method for language modeling?
The main reason for the changes is the code below. I can revert all my code to the original and include an if/else in the [_setup_datasets](https://github.com/pytorch/text/blob/master/torchtext/experimental/datasets/language_modeling.py#L75-L125) function for non zip...
@zhangguanheng66 @cpuhrsch - I'll push a branch up later today fixing this tar/zip issue. And we can move forward with the WMT dataset after. Thanks for your review!
@zhangguanheng66 @cpuhrsch - quick question before I proceed: are the filenames returned from zip or tar correct, i.e. should the root folder be prepended to the path? I think the...
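To make the filename question concrete, here is a minimal sketch using only the Python standard library. It is not torchtext's extraction code: `extracted_paths` and `demo` are hypothetical names, and the archive layout (`corpus/sample.txt`) is invented for illustration. It shows that `zipfile` and `tarfile` both report member names relative to the archive, so one way to get consistent output is to prepend the extraction root (e.g. `.data`) in a single place for both formats.

```python
import os
import tarfile
import tempfile
import zipfile


def extracted_paths(archive_path, root):
    """Hypothetical helper: extract a zip or tar archive into `root`
    and return the extracted file paths with `root` prepended."""
    os.makedirs(root, exist_ok=True)
    if zipfile.is_zipfile(archive_path):
        with zipfile.ZipFile(archive_path) as zf:
            zf.extractall(root)
            # namelist() returns archive-relative names; skip directory entries.
            names = [n for n in zf.namelist() if not n.endswith("/")]
    elif tarfile.is_tarfile(archive_path):
        with tarfile.open(archive_path) as tf:
            tf.extractall(root)
            # Member names are also archive-relative; keep regular files only.
            names = [m.name for m in tf.getmembers() if m.isfile()]
    else:
        raise ValueError("unsupported archive: " + archive_path)
    # Prepend the root in one place so both formats return the same shape.
    return sorted(os.path.join(root, n) for n in names)


def demo():
    """Build a tiny zip and tar with the same layout and compare results."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "sample.txt")
        with open(src, "w") as f:
            f.write("hello\n")

        zip_path = os.path.join(tmp, "a.zip")
        with zipfile.ZipFile(zip_path, "w") as zf:
            zf.write(src, arcname="corpus/sample.txt")

        tar_path = os.path.join(tmp, "a.tar")
        with tarfile.open(tar_path, "w") as tf:
            tf.add(src, arcname="corpus/sample.txt")

        zip_root = os.path.join(tmp, "zip_out")
        tar_root = os.path.join(tmp, "tar_out")
        zip_files = extracted_paths(zip_path, zip_root)
        tar_files = extracted_paths(tar_path, tar_root)

        # Compare paths relative to each root, and check the files exist.
        rel_zip = [os.path.relpath(p, zip_root) for p in zip_files]
        rel_tar = [os.path.relpath(p, tar_root) for p in tar_files]
        exists = all(os.path.isfile(p) for p in zip_files + tar_files)
        return rel_zip, rel_tar, exists
```

With the root joined in the helper, callers get the same kind of path for either archive type and do not need a separate branch per format.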
~Closing in favor of #700~