Birch-san
> rather than one map shared by all str() calls: use the map key to distinguish between instances of str(), and so have a map entry per str() call 676a30c0e9f8b433a5128a0c39efeef78d60b741...
@martinetd thanks for trying out the branch 🙂 yeah, I think this implementation supports everything, and doesn't break anything else. for the platform I was testing on, I think I...
> detailled answer ah, that was something long overdue — I'd left this PR to languish, so wanted to at least document the blockers. > do the same in a...
> This is probably a reasonable behaviour? @tomchristie the problem is that the thing the application may be doing on startup is downloading a giant file. I was launching [ialacol](https://github.com/chenhunghan/ialacol)'s...
btw, I have a speed boost for the decoder here: https://github.com/huggingface/diffusers/pull/1203 eliminates a `sqrt()` and a multiply, simplifies 4D tensor to 3D (num_heads is always 1), uses batch matmul. yes,...
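to illustrate how a `sqrt()` and a multiply can drop out of the attention score computation, here's a minimal pure-Python sketch. it assumes the original code scaled both the query and the key by `dim ** -0.25` before the matmul, which may not match the PR exactly. the helper names (`matmul_t`, `scores_scale_operands`, `scores_scale_once`) are hypothetical, and plain lists stand in for tensors:

```python
import math

def matmul_t(a, b):
    """Return a @ b^T for two row-major lists of rows."""
    return [[sum(x * y for x, y in zip(ra, rb)) for rb in b] for ra in a]

def scores_scale_operands(q, k, dim):
    # Assumed "before": a sqrt of a sqrt, then one multiply per operand.
    s = 1.0 / math.sqrt(math.sqrt(dim))
    q_s = [[x * s for x in row] for row in q]
    k_s = [[x * s for x in row] for row in k]
    return matmul_t(q_s, k_s)

def scores_scale_once(q, k, dim):
    # Assumed "after": a single sqrt, applied once to the matmul result.
    s = 1.0 / math.sqrt(dim)
    return [[x * s for x in row] for row in matmul_t(q, k)]

# Toy inputs: 2 tokens, 4 channels, a single head (so no head axis is needed).
q = [[1.0, 2.0, 3.0, 4.0], [0.5, -1.0, 2.0, 0.0]]
k = [[0.0, 1.0, 0.0, 1.0], [2.0, 2.0, -1.0, 0.5]]

before = scores_scale_operands(q, k, dim=4)
after = scores_scale_once(q, k, dim=4)
```

the two results agree up to floating-point rounding, since `(q * s) @ (k * s)^T == s^2 * (q @ k^T)`. in PyTorch the single scale can be folded into the batch matmul itself via the `alpha` argument of `torch.baddbmm`.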
> Nightly builds of python-3.11 wheels is already published to https://download.pytorch.org/whl/nightly how do I install these? https://conda.anaconda.org/pytorch-nightly/osx-arm64 I only see python 3.10 builds there. conda couldn't find any 3.11-compatible pytorch...
okay, so Linux x86 only? thanks for the link. if an osx-arm64 build were made available: I'd happily try it out.
to aid understanding, here's what the diff looks like: it changes the second-order step. when we compute `r`: we no longer take the ratio "`h_last` over `h`". instead: we compute `r`...
that's fantastic news! thanks; I hope it turns out to be as straightforward as you're anticipating. 🙂
> only one choice: unpacking the nested tensors in the python interface and calling the C++ API on the buffers individually this is _probably_ fine, because for image training the...