About the memory consumption
Hi, when I read the supplementary information of AlphaFold2, I got confused about section "1.11.8 Reducing the memory consumption". It says that with a technique called gradient checkpointing, the memory consumption during training can be reduced from cubic to quadratic in the number of residues, and that at inference time, chunking the computation inside the layers likewise reduces the memory from cubic to quadratic. I don't understand why this works. Can anyone give me a hand?
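For intuition, here is a toy sketch of what I think chunking does (this is not OpenFold's actual attention module, just a simplified row-wise attention with unprojected queries/keys): the full computation materializes an [N, N, N] logits tensor, while the chunked version only ever holds a [chunk, N, N] slice of it, so peak memory scales roughly quadratically with N instead of cubically.

```python
import torch

def row_attention_full(pair):
    # pair: [N, N, C] pair representation (quadratic in sequence length N).
    # The logits tensor below is [N, N, N] -- cubic in N -- and storing it
    # in full is what dominates memory without chunking.
    logits = torch.einsum("ijc,ikc->ijk", pair, pair)
    weights = logits.softmax(dim=-1)
    return torch.einsum("ijk,ikc->ijc", weights, pair)

def row_attention_chunked(pair, chunk_size=32):
    # Same result, but processes `chunk_size` rows at a time, so only a
    # [chunk_size, N, N] slice of the cubic logits exists at any moment.
    out = []
    for start in range(0, pair.shape[0], chunk_size):
        rows = pair[start:start + chunk_size]            # [chunk, N, C]
        logits = torch.einsum("ijc,ikc->ijk", rows, rows)
        weights = logits.softmax(dim=-1)
        out.append(torch.einsum("ijk,ikc->ijc", weights, rows))
    return torch.cat(out, dim=0)

pair = torch.randn(256, 256, 64)
assert torch.allclose(row_attention_full(pair), row_attention_chunked(pair), atol=1e-5)
```

Gradient checkpointing seems analogous for training: instead of keeping every block's internal (cubic-sized) activations for the backward pass, only the block inputs are stored and the internals are recomputed when needed.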
This all happens automatically. The inference-time "chunking" you described is controlled by the "chunk_size" parameter of the config. Activation checkpointing is controlled by "blocks_per_ckpt".
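A minimal sketch of where those two knobs live, assuming the fields sit under `config.globals` as in the repo's `config.py` and that `"model_1"` is a valid preset name in your checkout; verify the exact paths against your version:

```python
from openfold.config import model_config

config = model_config("model_1")

# Inference: evaluate attention/transition layers over slices of this size
# rather than materializing their full cubic intermediates at once.
config.globals.chunk_size = 4

# Training: checkpoint every group of this many Evoformer blocks, so their
# internal activations are recomputed in the backward pass instead of stored.
config.globals.blocks_per_ckpt = 1
```

Smaller `chunk_size` and `blocks_per_ckpt` trade extra compute (more loop iterations or recomputation) for lower peak memory.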