Kirill Mavreshko comments

Results: 18 comments by Kirill Mavreshko

Is there any plan to implement transformer-xl?

Yes, I do have plans to implement Transformer-XL, but cannot promise how soon.

compression window error

Hi! Thanks for the report. I'll look into it.

Plans on implementing an external mask

Hi! Yes, that is a reasonable feature. However, it has a low priority, since I currently don't have much time and a similar result can be achieved by introducing...

Allow Self-Attention in TransformerBlock

Hi! Could you please post an example that utilizes these changes? Perhaps a function that builds a model, similar to [`vanilla_transformer_gpt_model`](https://github.com/kpot/keras-transformer/blob/master/example/models.py#L74).

[PIP] Schema compatibility check

Hi! I was recently thinking about a similar thing. For a while now, Django has had a feature: whenever you change your model, you can simply run its CLI [`django-admin makemigrations`](https://docs.djangoproject.com/en/4.1/ref/django-admin/#django-admin-makemigrations)...

[PIP] Schema compatibility check

@mohs8421 > For example, I can have a table in a first migration. But in the next migration, I want to add a reference to another new table. Which...

Incorrect initialization of pseudoinverse matrix calculation leads to convergence failure

@yyxiongzju Good point, I completely missed the fact (and the comment!) that `iterative_inv` works exclusively with post-softmax matrices. In that case your initialization is perfectly correct. Thanks for the explanation!...
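To illustrate why the initialization matters for post-softmax matrices, here is a minimal sketch of a classical Newton-Schulz iteration for approximating a matrix inverse. This is not the repository's `iterative_inv`, which may use a different (higher-order) update. The names `softmax` and `newton_schulz_inverse`, the diagonal bias on the logits, and the iteration count are all assumptions chosen for the example. The starting point `A^T / (||A||_1 * ||A||_inf)` is one standard choice; for a row-stochastic (post-softmax) matrix the row sums are all 1, so `||A||_inf` is exactly 1.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax: subtract the max before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def newton_schulz_inverse(a, n_iter=30):
    """Approximate the inverse of a square matrix `a` (hypothetical helper).

    Uses the basic update V <- V (2I - A V), not the repository's code.
    """
    identity = np.eye(a.shape[-1])
    # Induced 1-norm (max column sum) and infinity-norm (max row sum).
    norm_1 = np.abs(a).sum(axis=0).max()
    norm_inf = np.abs(a).sum(axis=1).max()
    # Scaled transpose as the starting guess; since ||A||_2^2 <= ||A||_1 * ||A||_inf,
    # the eigenvalues of A @ V0 lie in (0, 1] for a nonsingular A.
    v = a.T / (norm_1 * norm_inf)
    for _ in range(n_iter):
        v = v @ (2.0 * identity - a @ v)
    return v


# Assumed test input: random logits with a strong diagonal, so that the
# resulting post-softmax matrix is well-conditioned.
rng = np.random.default_rng(0)
logits = rng.normal(size=(8, 8)) + 4.0 * np.eye(8)
attn = softmax(logits)

approx_inv = newton_schulz_inverse(attn)
residual = np.linalg.norm(attn @ approx_inv - np.eye(8))
print(f"residual after 30 iterations: {residual:.2e}")
```

With this scaling the iteration starts inside its convergence region for any nonsingular input, though the number of iterations needed grows as the matrix becomes more ill-conditioned.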

Incorrect initialization of pseudoinverse matrix calculation leads to convergence failure

@yyxiongzju On top of the previous question: in the paper you say "For all our experiments, we need to run about 6 iterations in order to achieve a good approximation of...

Incorrect initialization of pseudoinverse matrix calculation leads to convergence failure

@yyxiongzju Thanks for the details! I've thrown together a notebook containing an [alternative version of `iterative_inv`](https://github.com/kpot/Nystromformer/blob/new_inv/notebooks/iterative_inv_convergence_test.ipynb) along with some of my experiments. Could you please take a look? So far...

Incorrect initialization of pseudoinverse matrix calculation leads to convergence failure

@yyxiongzju Thank you for your time, you did a good job explaining all this! I totally agree with you that everything should be practical in such things. If a...