Keumgang Cha
The layer-wise decay is set to different values in the YAML files. The vitl16 short config uses a layer-wise decay of 0.9, while vitl14 and vitg14 use 1.0. Why do they have different values?...
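To make the question concrete, here is a minimal sketch of how a layer-wise decay value is commonly turned into per-layer learning-rate multipliers. It assumes the convention used in BEiT/MAE-style fine-tuning code, where index 0 is the patch embedding and the last index is the head; the function name `layerwise_lr_scales` and the depth of 24 blocks are illustrative assumptions, not taken from this repository.

```python
# Hypothetical helper, not from the repository: per-layer lr multipliers
# under the common "decay ** (num_layers + 1 - layer_id)" convention.
def layerwise_lr_scales(num_layers: int, decay: float) -> list[float]:
    # layer_id 0            -> patch embedding / tokens
    # layer_id 1..num_layers -> transformer blocks
    # layer_id num_layers+1  -> head (always scaled by decay ** 0 == 1.0)
    return [decay ** (num_layers + 1 - i) for i in range(num_layers + 2)]


# Assumed depth of 24 blocks (a ViT-L-sized model), for illustration only.
scales_09 = layerwise_lr_scales(24, 0.9)  # earlier layers get smaller lr
scales_10 = layerwise_lr_scales(24, 1.0)  # every layer gets the full lr

print("decay=0.9 first/last:", scales_09[0], scales_09[-1])
print("decay=1.0 first/last:", scales_10[0], scales_10[-1])
```

Under this convention a decay of 1.0 makes every multiplier equal to 1.0, so it effectively disables layer-wise decay, while 0.9 trains the early layers with a much smaller learning rate than the head.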
I think the CUDA kernel files in `detection/ops/src/cuda` do not support fp16 or bf16. However, the config enables fp16 for mixed-precision training. I am confused whether...
* MAE and EVA02 are both vision transformers.
* I think the layer-wise lr factor function should be applied to both or to neither.
* However, EVA02 applies the...