Andrew Fitzgibbon

Results 26 issues of Andrew Fitzgibbon

One of my dreams for this package was to turn code like this ```python query = linear(head.query, t1) # L x Dk key = linear(head.key, t1) # L x Dk...

question

The following code passes typechecking, and runs without error ```Python import jax from typeguard import typechecked as typechecker from jaxtyping import f32, u, jaxtyped @jaxtyped @typechecker def standardize(x : f32["N"],...

Good documentation for btime: ``` help?> @btime @btime expression [other parameters...] Similar to the @time macro included with Julia, this executes an expression, printing the time it took to execute...

On today's head, 48b8ac, build.cmd fails to find FAKE: ``` PS C:\dev\GitHub\fsharp-cheatsheet\tools> .\build.cmd Unable to find package 'FAKE'. The system cannot find the path specified. PS C:\dev\GitHub\fsharp-cheatsheet\tools> .nuget\nuget.exe install FAKE...

From https://github.com/awf/functional-transformer/discussions/4#discussioncomment-2834638 See how we might include the `sin(t)` terms, rather than just 'learned' encodings.
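For readers unfamiliar with the `sin(t)` terms, here is a minimal sketch of the fixed sinusoidal position encoding from the original transformer paper ("Attention Is All You Need"). It is not code from the functional-transformer repository; the function name `sinusoidal_encoding` and its parameters are hypothetical, and it uses plain Python lists rather than JAX arrays to stay dependency-free.

```python
import math

def sinusoidal_encoding(seq_len, d_model):
    """Return a seq_len x d_model nested list of fixed sin/cos encodings.

    Hypothetical helper for illustration; assumes d_model is even.
    """
    if d_model % 2 != 0:
        raise ValueError("d_model must be even")
    table = []
    for pos in range(seq_len):
        row = []
        # i walks over the even dimension indices 0, 2, 4, ...
        for i in range(0, d_model, 2):
            # The frequency falls geometrically as the dimension index grows.
            angle = pos / (10000 ** (i / d_model))
            row.append(math.sin(angle))  # even dimension
            row.append(math.cos(angle))  # odd dimension
        table.append(row)
    return table

pe = sinusoidal_encoding(seq_len=4, d_model=6)
```

Unlike a learned encoding, this table has no trainable parameters and can be computed for any sequence length; in a model it would typically be added to the token embeddings.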

WandB loss curves (e.g. [here](https://wandb.ai/awfidius/pure-transformer/runs/ehf0othc)) show a sawtooth form, correlated with batch ID. Batches are [randomized](https://github.com/awf/awf-jaxutils/blob/2590cc78a4ab017e0f6bcd1ccded1f63bbd9fc6a/dataset.py#L67) and this occurs even with [1-bit gradients](https://github.com/awf/functional-transformer/blob/780073081d65df06a5c0c31dc4f9d2c8285625a0/main.py#L178), so it's not Adam... ![image](https://user-images.githubusercontent.com/128119/170712450-87f10f16-99f5-4a22-b689-2bdb0ec5952e.png)

### 🐛 Describe the bug It would be nice to be able to compile torchtext models, e.g. RobertaModel: ```py import torch from torchtext.models import RobertaEncoderConf, RobertaModel, RobertaClassificationHead roberta_encoder_conf = RobertaEncoderConf(vocab_size=250002)...

triaged
oncall: pt2
module: dynamo

Doing this slightly in reverse, as I submitted the PR before the request, but... Sometimes we may index the same repository at two different refs, or otherwise want to customize...

As noted in https://github.com/awf/functional-transformer/discussions/6 the model does not match the original code, or indeed the original transformer paper. I therefore consider this a "transformer variant", but of course it would...

Currently we're using the older DV/DM in DiffSharp. We should add a new tool folder DiffSharp-Tensor, using the dev branch 1.0 API.

enhancement