Universal-Transformer-Pytorch

Implementation of Universal Transformer in Pytorch

10 Universal-Transformer-Pytorch issues

`state` passed to `fn` does not seem to be updated by ACT's masks, only `previous_state`? https://github.com/andreamad8/Universal-Transformer-Pytorch/blob/master/models/UTransformer.py#L280 As such, the dynamic halting seems to kick in only once all halting_probabilities...
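
For context, below is a minimal sketch of an ACT step of the kind this issue describes; the names (`act_step`, `p_layer`, `fn`, `threshold`) are illustrative stand-ins, not the repository's exact code. It shows why feeding the masked blend back as the next `state` matters: only then do halted positions stop changing.

```python
import torch

def act_step(state, previous_state, halting_probability, remainders,
             n_updates, fn, p_layer, threshold=0.99):
    """One Adaptive Computation Time step (illustrative sketch)."""
    # Per-position halting probability proposed at this step.
    p = torch.sigmoid(p_layer(state)).squeeze(-1)

    # Positions that were still running before this step.
    still_running = (halting_probability < 1.0).float()
    # Positions that cross the threshold at this step.
    new_halted = ((halting_probability + p * still_running) > threshold).float() * still_running
    # Positions that remain below the threshold after this step.
    still_running = ((halting_probability + p * still_running) <= threshold).float() * still_running

    halting_probability = halting_probability + p * still_running
    remainders = remainders + new_halted * (1.0 - halting_probability)
    halting_probability = halting_probability + new_halted * remainders
    n_updates = n_updates + still_running + new_halted

    # Weight with which the freshly transformed state is mixed in.
    update_weights = (p * still_running + new_halted * remainders).unsqueeze(-1)

    transformed = fn(state)  # one recurrent transformer step
    previous_state = transformed * update_weights + previous_state * (1.0 - update_weights)

    # Returning the blended tensor as the next `state` is what freezes halted
    # positions; passing the raw transformed `state` back into `fn` instead
    # would sidestep the halting mask, which is the behaviour the issue flags.
    next_state = previous_state
    return next_state, previous_state, halting_probability, remainders, n_updates
```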

[Here](https://github.com/andreamad8/Universal-Transformer-Pytorch/blob/e6b06375269e805a23acbb07ef1aa4d6402bce52/models/common_layer.py#L320) `i` is the index into `self.layers`, so it is always less than the length of `self.layers`. You probably meant `if i < len(self.layers) - 1`. Then no...
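
A small sketch of the off-by-one this issue points out follows; the class, the dropout applied between layers, and the names are assumptions for illustration rather than the repository's actual code.

```python
import torch.nn as nn

class LayerStack(nn.Module):
    """Illustrative stack of layers, not the repo's class."""
    def __init__(self, layers, dropout=0.1):
        super().__init__()
        self.layers = nn.ModuleList(layers)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x):
        for i, layer in enumerate(self.layers):
            x = layer(x)
            # `i` is at most len(self.layers) - 1, so `i < len(self.layers)`
            # is always True; the intended "not the last layer" guard is:
            if i < len(self.layers) - 1:
                x = self.dropout(x)
        return x
```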

ImportError: cannot import name 'babi' from 'torchtext.data.metrics' (torchtext version 0.10.0)

Hi, I noticed you implemented a function to calculate the position embedding. However, I could not find anywhere it is used. Can you please help me understand how you incorporate the...
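
For reference, here is a minimal sketch of how a sinusoidal position signal is typically applied in a Universal Transformer, added at every recurrent step together with a per-step timing signal; the shapes and function names are assumed, not taken from the repository.

```python
import math
import torch

def sinusoid_signal(length, channels):
    """Standard sinusoidal signal; assumes an even number of channels."""
    position = torch.arange(length, dtype=torch.float).unsqueeze(1)
    div_term = torch.exp(torch.arange(0, channels, 2, dtype=torch.float)
                         * (-math.log(10000.0) / channels))
    signal = torch.zeros(length, channels)
    signal[:, 0::2] = torch.sin(position * div_term)
    signal[:, 1::2] = torch.cos(position * div_term)
    return signal  # (length, channels)

def add_position_and_step(x, step, max_steps):
    # x: (batch, seq_len, channels); applied before every recurrent step.
    _, seq_len, channels = x.shape
    pos = sinusoid_signal(seq_len, channels).to(x.device)            # position signal
    time = sinusoid_signal(max_steps, channels).to(x.device)[step]   # per-step signal
    return x + pos.unsqueeze(0) + time.view(1, 1, channels)
```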

Hi, I ran the experiments on the 10K setting, but my results are much worse than the reported ones. I didn't change any of the default parameters except for setting...

Hi, when I run the model, I notice that in the first epoch it can reach the max step of 24, but starting from the second or third epoch, the probability computed by `p = self.sigma(self.p(state)).squeeze(-1)`...

I found that in models/UTransformer.py:110 and 194, you have the following code:

```python
self.proj_flag = False
if(embedding_size == hidden_size):
    self.embedding_proj = nn.Linear(embedding_size, hidden_size, bias=False)
    self.proj_flag = True
```

I'm confused that you...
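
If the intent is to project only when the two sizes differ, the guard presumably should be negated, along the lines of the sketch below (an assumption, not a confirmed fix from the author):

```python
import torch.nn as nn

embedding_size, hidden_size = 300, 512  # example values

proj_flag = False
if embedding_size != hidden_size:  # `!=` rather than `==`
    embedding_proj = nn.Linear(embedding_size, hidden_size, bias=False)
    proj_flag = True
```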

Hi, when running the script on a machine without CUDA support, I'm getting the following error:

> File ".../Universal-Transformer-Pytorch/models/UTransformer.py", line 236, in forward
>     halting_probability = torch.zeros(inputs.shape[0],inputs.shape[1]).cuda()
> RuntimeError: torch.cuda.FloatTensor is not...
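
A device-agnostic way to allocate that tensor is sketched below; `inputs` is assumed to be the same tensor used in the failing line, and this is a suggestion rather than the repository's current code.

```python
# Allocate on whatever device `inputs` lives on, instead of calling .cuda()
# unconditionally; this works on both CPU-only and GPU machines.
halting_probability = torch.zeros(inputs.shape[0], inputs.shape[1],
                                  device=inputs.device)
```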

Hi, currently the `--task` argument is being ignored due to line 153ff in `main.py`, so the script always runs all bAbI tasks in a row.
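
A minimal sketch of honoring the flag might look like the following; `run_babi_task` and the range of task ids are hypothetical stand-ins, since the relevant part of `main.py` is not shown here.

```python
# Run only the requested bAbI task when --task is given; otherwise fall back
# to looping over all 20 tasks (hypothetical helper and defaults).
tasks = [args.task] if args.task is not None else list(range(1, 21))
for task_id in tasks:
    run_babi_task(task_id)
```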

Hi, I found this implementation very interesting. I would like to understand more about the Universal Transformer, since I think it could allow much smaller LLMs with higher performance. P.S. I...