pytorch-original-transformer

Error when running "python training_script.py --batch_size 100 --dataset_name IWSLT --language_direction G2E"

Open minertom opened this issue 4 years ago • 3 comments

I'm not sure what is going on here, but as far as I can tell, a gzip file seems to be missing.

Thank you, Tom

Traceback (most recent call last):
  File "/home/tom/anaconda3/envs/pytorch-transformer/lib/python3.8/tarfile.py", line 1670, in gzopen
    t = cls.taropen(name, mode, fileobj, **kwargs)
  File "/home/tom/anaconda3/envs/pytorch-transformer/lib/python3.8/tarfile.py", line 1647, in taropen
    return cls(name, mode, fileobj, **kwargs)
  File "/home/tom/anaconda3/envs/pytorch-transformer/lib/python3.8/tarfile.py", line 1510, in __init__
    self.firstmember = self.next()
  File "/home/tom/anaconda3/envs/pytorch-transformer/lib/python3.8/tarfile.py", line 2311, in next
    tarinfo = self.tarinfo.fromtarfile(self)
  File "/home/tom/anaconda3/envs/pytorch-transformer/lib/python3.8/tarfile.py", line 1102, in fromtarfile
    buf = tarfile.fileobj.read(BLOCKSIZE)
  File "/home/tom/anaconda3/envs/pytorch-transformer/lib/python3.8/gzip.py", line 292, in read
    return self._buffer.read(size)
  File "/home/tom/anaconda3/envs/pytorch-transformer/lib/python3.8/_compression.py", line 68, in readinto
    data = self.read(len(byte_view))
  File "/home/tom/anaconda3/envs/pytorch-transformer/lib/python3.8/gzip.py", line 479, in read
    if not self._read_gzip_header():
  File "/home/tom/anaconda3/envs/pytorch-transformer/lib/python3.8/gzip.py", line 427, in _read_gzip_header
    raise BadGzipFile('Not a gzipped file (%r)' % magic)
gzip.BadGzipFile: Not a gzipped file (b'<!')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "training_script.py", line 192, in <module>
    train_transformer(training_config)
  File "training_script.py", line 103, in train_transformer
    train_token_ids_loader, val_token_ids_loader, src_field_processor, trg_field_processor = get_data_loaders(
  File "/home/tom/Downloads/pytorch-original-transformer/utils/data_utils.py", line 223, in get_data_loaders
    train_dataset, val_dataset, src_field_processor, trg_field_processor = get_datasets_and_vocabs(dataset_path, language_direction, dataset_name == DatasetType.IWSLT.name)
  File "/home/tom/Downloads/pytorch-original-transformer/utils/data_utils.py", line 151, in get_datasets_and_vocabs
    train_dataset, val_dataset, test_dataset = dataset_split_fn(
  File "/home/tom/anaconda3/envs/pytorch-transformer/lib/python3.8/site-packages/torchtext/datasets/translation.py", line 144, in splits
    path = cls.download(root, check=check)
  File "/home/tom/anaconda3/envs/pytorch-transformer/lib/python3.8/site-packages/torchtext/data/dataset.py", line 191, in download
    with tarfile.open(zpath, 'r:gz') as tar:
  File "/home/tom/anaconda3/envs/pytorch-transformer/lib/python3.8/tarfile.py", line 1617, in open
    return func(name, filemode, fileobj, **kwargs)
  File "/home/tom/anaconda3/envs/pytorch-transformer/lib/python3.8/tarfile.py", line 1674, in gzopen
    raise ReadError("not a gzip file")
tarfile.ReadError: not a gzip file

minertom • Nov 29 '21 19:11

I'm getting the same bug now. How can I solve it?

Lyttonkeepfoing • Dec 12 '21 09:12

Same problem here. Are there any solutions?

dejangrubisic • Feb 09 '23 03:02
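
A note on reading the first traceback: `BadGzipFile` prints the first two bytes it found in the file, here `b'<!'`. A gzip stream always starts with `b'\x1f\x8b'`, so a file does exist but is not a gzip archive. `<!` is how an HTML page typically begins (for example `<!DOCTYPE html>`), which suggests, though the thread does not confirm it, that the download saved a web page instead of the dataset. Below is a minimal sketch for checking a downloaded file by hand. The function name `describe_archive` is hypothetical, and the demo writes its own temporary files because the thread does not say where torchtext stored the download.

```python
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"  # every gzip stream starts with these two bytes


def describe_archive(path):
    """Classify a file by its leading bytes (hypothetical helper)."""
    with open(path, "rb") as f:
        head = f.read(16)
    if head.startswith(GZIP_MAGIC):
        return "gzip"
    # A leading '<' usually means HTML or XML, e.g. an error page
    if head.lstrip().startswith(b"<"):
        return "html-or-xml"
    return "unknown"


if __name__ == "__main__":
    # Demo on two temporary files: a real gzip payload and a fake HTML page
    with tempfile.TemporaryDirectory() as tmp:
        good = os.path.join(tmp, "good.tgz")
        bad = os.path.join(tmp, "bad.tgz")
        with open(good, "wb") as f:
            f.write(gzip.compress(b"example payload"))
        with open(bad, "wb") as f:
            f.write(b"<!DOCTYPE html><html><body>Not found</body></html>")
        print(good, "->", describe_archive(good))
        print(bad, "->", describe_archive(bad))
```

If the file torchtext downloaded is classified as `html-or-xml`, opening it in a text editor should show what the server actually returned. Deleting that cached file before retrying is probably necessary, since an existing file may otherwise be reused.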