
98 transformer-xl issues

Hi, thanks for the great code. Since LMShuffledIterator puts different samples (unrelated sentences) into different batches, how is the memory still useful? Thanks

This is how the positional embeddings matrix is constructed in the code:
```
sinusoid_inp = torch.ger(pos_len, self.inv_freq)
pos_emb = torch.cat([sinusoid_inp.sin(), sinusoid_inp.cos()], dim=-1)
```
This basically creates a matrix of [sin | cos]...
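For reference, a minimal, self-contained sketch of that [sin | cos] construction; the sequence length and model size below are illustrative, not the repository's defaults:

```python
import torch

# Illustrative sketch of the sinusoidal positional-embedding construction
# described above; d_model and klen are made-up values for demonstration.
d_model, klen = 8, 5
inv_freq = 1.0 / (10000 ** (torch.arange(0.0, d_model, 2.0) / d_model))  # [d_model/2]
pos_seq = torch.arange(klen - 1, -1, -1.0)                               # [klen], descending
sinusoid_inp = torch.ger(pos_seq, inv_freq)                              # outer product -> [klen, d_model/2]
pos_emb = torch.cat([sinusoid_inp.sin(), sinusoid_inp.cos()], dim=-1)    # [klen, d_model], i.e. [sin | cos]
print(pos_emb.shape)  # torch.Size([5, 8])
```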

I have this issue with Positional Embedding at this line: `sinusoid_inp = torch.ger(pos_seq, self.inv_freq)`. The error message is as follows:
```
/pytorch/aten/src/THC/THCTensorIndex.cu:272: indexSelectLargeIndex: block: [170,0,0], thread: [32,0,0] Assertion `srcIndex...
```
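For what it's worth, this assertion usually means an index_select/embedding lookup somewhere received an out-of-range index; because CUDA kernels run asynchronously, the error can surface at an unrelated line such as the torch.ger call. A minimal sketch of the same class of error on CPU, where the message is easier to read (the vocabulary size and token ids are made up for illustration):

```python
import torch
import torch.nn as nn

# Illustrative only: an out-of-vocabulary token id triggers the same class of
# index error that the CUDA indexSelectLargeIndex assertion reports.
vocab_size = 100
emb = nn.Embedding(vocab_size, 16)
bad_ids = torch.tensor([0, 5, 100])   # 100 is out of range for a 100-entry table
try:
    emb(bad_ids)                      # raises IndexError on CPU
except IndexError as e:
    print("index out of range:", e)
```

Running once with `CUDA_LAUNCH_BLOCKING=1` (or on CPU) usually points to the operation that actually produced the bad index.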

With this config: `python train.py --cuda --data ../data/one-billion-words/ --dataset lm1b --adaptive --n_layer 18 --d_model 1024 --div_val 4 --n_head 8 --d_head 128 --d_inner 4096 --dropout 0.0 --dropatt 0.0 --optim adam --log-interval 5`...

Is there an implementation based on TF 2.x and Python 3.x?

Hi, I was trying to reproduce the WT_103 LM results with the TF models. I can run the PyTorch ones fine with batch size 60 on 4x V100 16G GPUs. However, if I change to...

In transformer-xl/pytorch/mem_transformer.py, I found that the argument order of the _update_mems function (a member of class MemTransformerLM) is inconsistent! See the difference: line 619: `def _update_mems(self, hids, mems, qlen, mlen):`, line 733...
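A small, self-contained sketch of the kind of mismatch being reported; the function body, tensor shapes, and call sites below are illustrative, not the repository's exact code:

```python
import torch

# Hedged sketch: a function defined with (..., qlen, mlen) but called at some
# site with the two lengths in the opposite positional order.
def _update_mems(hids, mems, qlen, mlen):
    with torch.no_grad():
        # keep at most mlen + qlen steps of history per layer
        return [torch.cat([m, h], dim=0)[-(mlen + qlen):].detach()
                for m, h in zip(mems, hids)]

hids = [torch.randn(4, 2, 8)]   # qlen=4, batch=2, d_model=8 (illustrative)
mems = [torch.randn(6, 2, 8)]   # mlen=6

ok      = _update_mems(hids, mems, qlen=4, mlen=6)   # explicit keywords
swapped = _update_mems(hids, mems, 6, 4)             # positional, lengths swapped
print(ok[0].shape, swapped[0].shape)
```

Because both lengths are plain ints, a swapped positional call runs silently; passing them as keyword arguments makes such call sites easier to audit.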

Hi, thanks for releasing the TensorFlow and PyTorch code for your Transformer-XL :heart: I would like to ask if you plan to provide some pre-trained models for the PyTorch...

I use torch 1.4.0 and python 3.6.4, running the sh script on only one GPU (our hardware situation is limited) -.- It reports: `/pytorch/aten/src/ATen/native/cuda/LegacyDefinitions.cpp:19: UserWarning: masked_fill_ received a mask with dtype torch.uint8, this behavior...`
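In case it helps, this warning typically comes from passing a torch.uint8 mask to masked_fill_; building or casting the mask as torch.bool silences it. A minimal sketch, with illustrative tensor shapes rather than the repository's actual attention code:

```python
import torch

# The UserWarning comes from masked_fill_ receiving a uint8 (byte) mask;
# converting the mask to bool avoids it.
scores = torch.randn(3, 3)
mask_u8 = torch.triu(torch.ones(3, 3), diagonal=1).to(torch.uint8)  # old-style byte mask
scores.masked_fill_(mask_u8.bool(), float('-inf'))                  # bool mask: no warning
print(scores)
```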