matthiasgeihs
I know about the issue. But are there any better alternatives on Ethereum as long as bls12-381 is not supported natively? (see [EIP-2537 discussion thread](https://ethereum-magicians.org/t/eip-2537-bls12-precompile-discussion-thread/4187/39))
@fedealconada I've been resorting to existing libraries such as `ffjavascript`.
Regarding 1: a 2^-64 bias means that the statistical distance between the uniform distribution and the actual distribution (assuming a perfect underlying RNG) is 2^-64 (see the appendix of https://eprint.iacr.org/2023/1254.pdf,...
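To make the statistical-distance statement concrete, here is a minimal sketch. It assumes the biased distribution comes from reducing `n_bits` uniform random bits modulo `q`, which is only one common way such a bias arises and may not be the exact setting discussed in the thread. The function names and the example modulus 251 are hypothetical.

```python
from fractions import Fraction


def statistical_distance_mod(n_bits: int, q: int) -> Fraction:
    """Exact statistical distance between (x mod q), with x uniform in
    [0, 2^n_bits), and the uniform distribution on [0, q).

    Brute-force sum over all q outcomes, so only use it for small q.
    """
    N = 1 << n_bits
    f, r = divmod(N, q)  # N = f*q + r
    uniform = Fraction(1, q)
    total = Fraction(0)
    for v in range(q):
        # Residues below r are hit f+1 times, the others f times.
        p = Fraction(f + 1 if v < r else f, N)
        total += abs(p - uniform)
    return total / 2


def statistical_distance_closed_form(n_bits: int, q: int) -> Fraction:
    """Same quantity in closed form: r*(q-r) / (N*q), with r = N mod q."""
    N = 1 << n_bits
    r = N % q
    return Fraction(r * (q - r), N * q)


# Tiny example: 3 random bits reduced mod 3.
print(statistical_distance_mod(3, 3))

# Sampling 64 extra bits beyond the size of an 8-bit modulus
# keeps the distance below 2^-64.
print(statistical_distance_closed_form(8 + 64, 251) <= Fraction(1, 2**64))
```

Since r*(q-r) is at most q^2/4, the distance is at most q / (4 * 2^n_bits), which is why drawing about 64 bits more than the bit length of `q` pushes the distance below 2^-64 under this assumption.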
Any chance to update the model such that it runs with PyTorch 2?
can somebody fix this please?
I'm not necessarily an expert, but I have some intuition for why this might happen: Let's say you have a very small batch size. This means there are only very few...
You might want to watch https://youtu.be/kCc8FmEb1nY?t=867. This might clarify a few things.
*The different batches don't talk to each other* means that the model parameters are optimized **per batch**. ``` block_size ^= context_length ^= length of a training chunk batch_size ^= number...
I think you might want to look at the point at which the backprop optimization actually changes the parameters. This is done after each batch. (Also note that batch != block.)
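The batch/block distinction can be sketched in plain Python. This is an illustrative toy, not the code from the video or the repository: `get_batch`, the token list, and the sizes are all hypothetical, and the actual forward/backward pass is replaced by a comment.

```python
import random


def get_batch(tokens, block_size, batch_size, rng):
    """Sample batch_size independent chunks, each block_size tokens long.

    x[i] is the input context, y[i] is the same chunk shifted by one token
    (the next-token targets).
    """
    starts = [rng.randrange(len(tokens) - block_size) for _ in range(batch_size)]
    x = [tokens[s:s + block_size] for s in starts]
    y = [tokens[s + 1:s + block_size + 1] for s in starts]
    return x, y


tokens = list(range(100))  # stand-in for an encoded training text
rng = random.Random(0)

num_updates = 0
for step in range(3):
    xb, yb = get_batch(tokens, block_size=8, batch_size=4, rng=rng)
    # A real training loop would run the forward and backward pass over
    # all 4 rows here and then apply one optimizer step.
    num_updates += 1

print(len(xb), len(xb[0]), num_updates)
```

Here `block_size` is the length of one training chunk, `batch_size` is how many such chunks are processed together, and the parameter update counter advances once per batch, not once per block.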
wow, in this case i am also pretty much out of explanations. of course you can try to run with a different seed. or maybe batch size has to be a...