kacwin

2 comments by kacwin

Thanks for the info. Using gradient accumulation would be kind of a last resort; there is simply too much data. I will try some LoRA experiments in the near future,...

Hello, we did some experiments with LoRA fine-tuning.

- We started with Caformer_b36_384 and did linear probing (freezing the network apart from the MLP head) on a classification task with...