
Does Training support Chunking?

Open gofreelee opened this issue 3 years ago • 7 comments

Hello, I would like to observe the dynamics of the network during training. I noticed that chunking is turned off during training, and there is a related assert in the code. Is chunking not supported during training?

gofreelee avatar Jul 18 '22 03:07 gofreelee

Chunking is disabled during training because it doesn't save any memory when grad is enabled: activations for each chunk are still stored for the backward pass. Furthermore, activation checkpoints can't be inserted between chunk computations, because torch/DeepSpeed checkpointing doesn't handle the same model parameters being used in two different checkpoints.
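For intuition, here is a minimal, hypothetical sketch of chunked evaluation in PyTorch (not OpenFold's actual implementation); it illustrates why chunking only reduces peak memory when gradients are disabled:

```python
import torch

def chunked_forward(fn, x, chunk_size):
    # Apply fn to slices of x along dim 0 and concatenate the results.
    # Under torch.no_grad() (inference), each chunk's intermediate
    # activations are freed as soon as that chunk's output is computed.
    # With grad enabled, autograd retains every chunk's activations for
    # the backward pass, so peak memory matches the unchunked call.
    out = [fn(x[i:i + chunk_size]) for i in range(0, x.shape[0], chunk_size)]
    return torch.cat(out, dim=0)
```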

gahdritz avatar Jul 18 '22 12:07 gahdritz

Is it true that during training, the model always goes through the same computation path, regardless of whether the input size changes? Thanks in advance.

gofreelee avatar Jul 20 '22 09:07 gofreelee

During training, input proteins are randomly cropped to a certain fixed length. Proteins shorter than the crop size are padded.
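A minimal sketch of this crop-or-pad step (the function name and shapes are illustrative, not OpenFold's actual data pipeline):

```python
import torch

def crop_or_pad(seq_feats: torch.Tensor, crop_size: int) -> torch.Tensor:
    # seq_feats has shape [seq_len, feat_dim].
    seq_len = seq_feats.shape[0]
    if seq_len > crop_size:
        # Randomly crop a contiguous window of length crop_size.
        start = torch.randint(0, seq_len - crop_size + 1, (1,)).item()
        return seq_feats[start:start + crop_size]
    # Pad shorter sequences with zeros up to crop_size, so every
    # training example has the same fixed shape.
    pad = torch.zeros(crop_size - seq_len, seq_feats.shape[1],
                      dtype=seq_feats.dtype)
    return torch.cat([seq_feats, pad], dim=0)
```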

gahdritz avatar Jul 20 '22 16:07 gahdritz

So can I assume that during training, all training examples go through the same computational logic?

gofreelee avatar Jul 21 '22 03:07 gofreelee

There are a number of stochastic things that happen: e.g., the number of recycling iterations varies randomly. But within each recycling iteration, I believe that's the case, yes.
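A hypothetical sketch of randomized recycling (names and signatures are illustrative, not OpenFold's API): the iteration count is sampled per training step, but each iteration runs the same fixed-shape computation.

```python
import torch

def forward_with_recycling(model_block, feats, max_recycles=3):
    # Sample how many extra recycling iterations to run this step.
    num_extra = int(torch.randint(0, max_recycles + 1, (1,)).item())
    prev = None
    for _ in range(num_extra + 1):
        # Each iteration executes an identical computation graph;
        # only the recycled embeddings (prev) change between passes.
        prev = model_block(feats, prev)
    return prev
```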

gahdritz avatar Jul 21 '22 18:07 gahdritz

I am wondering whether your code supports batch training, i.e., backpropagating and updating the model based on gradients from multiple proteins?

pengzhangzhi avatar Jul 28 '22 03:07 pengzhangzhi

We do support batch training. See the batch_size option in the config. The peak memory usage of the model is high enough that it's often impractical to increase it past 1, but for different training data or smaller crops than those used during model finetuning it might make sense.
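A sketch of raising the batch size; the exact config path is an assumption and may differ across OpenFold versions:

```python
from openfold.config import model_config

# Assumed config layout; check openfold/config.py for your version.
config = model_config("initial_training", train=True)
config.data.data_module.data_loaders.batch_size = 2  # default is 1
```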

gahdritz avatar Jul 28 '22 18:07 gahdritz