
98 transformer-xl issues

Hi, thanks for the great code. Since LMShuffledIterator puts different samples (unrelated sentences) into different batches, how is the memory still useful? Thanks

This is how the positional embeddings matrix is constructed in the code:
```
sinusoid_inp = torch.ger(pos_len, self.inv_freq)
pos_emb = torch.cat([sinusoid_inp.sin(), sinusoid_inp.cos()], dim=-1)
```
This basically creates a matrix of [sin | cos]...
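For reference, a minimal, self-contained sketch of that [sin | cos] construction; the sequence length and model size below are illustrative, not the repository's defaults:

```python
import torch

# Illustrative sketch of the sinusoidal positional-embedding construction
# described above; d_model and klen are made-up values for demonstration.
d_model, klen = 8, 5
inv_freq = 1.0 / (10000 ** (torch.arange(0.0, d_model, 2.0) / d_model))  # [d_model/2]
pos_seq = torch.arange(klen - 1, -1, -1.0)                               # [klen], descending
sinusoid_inp = torch.ger(pos_seq, inv_freq)                              # outer product -> [klen, d_model/2]
pos_emb = torch.cat([sinusoid_inp.sin(), sinusoid_inp.cos()], dim=-1)    # [klen, d_model], i.e. [sin | cos]
print(pos_emb.shape)  # torch.Size([5, 8])
```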

I have this issue with Positional Embedding at this line: `sinusoid_inp = torch.ger(pos_seq, self.inv_freq)`. The error message is as follows:
```
/pytorch/aten/src/THC/THCTensorIndex.cu:272: indexSelectLargeIndex: block: [170,0,0], thread: [32,0,0] Assertion `srcIndex...
```
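For what it's worth, this assertion usually means an index_select/embedding lookup somewhere received an out-of-range index; because CUDA kernels run asynchronously, the error can surface at an unrelated line such as the torch.ger call. A minimal sketch of the same class of error on CPU, where the message is easier to read (the vocabulary size and token ids are made up for illustration):

```python
import torch
import torch.nn as nn

# Illustrative only: an out-of-vocabulary token id triggers the same class of
# index error that the CUDA indexSelectLargeIndex assertion reports.
vocab_size = 100
emb = nn.Embedding(vocab_size, 16)
bad_ids = torch.tensor([0, 5, 100])   # 100 is out of range for a 100-entry table
try:
    emb(bad_ids)                      # raises IndexError on CPU
except IndexError as e:
    print("index out of range:", e)
```

Running once with `CUDA_LAUNCH_BLOCKING=1` (or on CPU) usually points to the operation that actually produced the bad index.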

With this config: `python train.py --cuda --data ../data/one-billion-words/ --dataset lm1b --adaptive --n_layer 18 --d_model 1024 --div_val 4 --n_head 8 --d_head 128 --d_inner 4096 --dropout 0.0 --dropatt 0.0 --optim adam --log-interval 5`...

Is there an implementation based on TF 2.x and Python 3.x?

Hi, I was trying to reproduce the WT_103 LM results with the TF models. I can run the PyTorch ones fine with batch size 60 on 4x V100 16G GPUs. However, if I change to...

In transformer-xl/pytorch/mem_transformer.py, I found that the argument order of the _update_mems function (a member of class MemTransformerLM) is inconsistent! See the difference: line 619: `def _update_mems(self, hids, mems, qlen, mlen):`, line 733...
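A small, self-contained sketch of the kind of mismatch being reported; the function body, tensor shapes, and call sites below are illustrative, not the repository's exact code:

```python
import torch

# Hedged sketch: a function defined with (..., qlen, mlen) but called at some
# site with the two lengths in the opposite positional order.
def _update_mems(hids, mems, qlen, mlen):
    with torch.no_grad():
        # keep at most mlen + qlen steps of history per layer
        return [torch.cat([m, h], dim=0)[-(mlen + qlen):].detach()
                for m, h in zip(mems, hids)]

hids = [torch.randn(4, 2, 8)]   # qlen=4, batch=2, d_model=8 (illustrative)
mems = [torch.randn(6, 2, 8)]   # mlen=6

ok      = _update_mems(hids, mems, qlen=4, mlen=6)   # explicit keywords
swapped = _update_mems(hids, mems, 6, 4)             # positional, lengths swapped
print(ok[0].shape, swapped[0].shape)
```

Because both lengths are plain ints, a swapped positional call runs silently; passing them as keyword arguments makes such call sites easier to audit.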

Hi, thanks for releasing the TensorFlow and PyTorch code for your Transformer-XL :heart: I would like to ask if you plan to provide some pre-trained models for the PyTorch...

I use torch 1.4.0 and python 3.6.4, running the sh script on only one GPU (our hardware situation is limited) -.- It reports: `/pytorch/aten/src/ATen/native/cuda/LegacyDefinitions.cpp:19: UserWarning: masked_fill_ received a mask with dtype torch.uint8, this behavior...`
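In case it helps, this warning typically comes from passing a torch.uint8 mask to masked_fill_; building or casting the mask as torch.bool silences it. A minimal sketch, with illustrative tensor shapes rather than the repository's actual attention code:

```python
import torch

# The UserWarning comes from masked_fill_ receiving a uint8 (byte) mask;
# converting the mask to bool avoids it.
scores = torch.randn(3, 3)
mask_u8 = torch.triu(torch.ones(3, 3), diagonal=1).to(torch.uint8)  # old-style byte mask
scores.masked_fill_(mask_u8.bool(), float('-inf'))                  # bool mask: no warning
print(scores)
```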