pytorch-image-models

Csatv2 contribution

Open rwightman opened this issue 1 week ago • 5 comments

Continuation of work in #2624 by @gusdlf93

rwightman avatar Dec 09 '25 22:12 rwightman

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@gusdlf93 hey, I used Claude to make some additional changes so the code fits timm norms a bit better. That did require remapping the checkpoints, though. I verified that the 80.024% accuracy remains.
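For anyone curious what a checkpoint remap like that tends to look like, here is a minimal sketch. The prefix rules and key names are hypothetical, made up for illustration; the actual CSATv2 key names are not shown in this thread. Plain integers stand in for tensors so the snippet runs without torch.

```python
from collections import OrderedDict

# Hypothetical (old prefix -> new prefix) rules. These are NOT the real
# CSATv2 key names; they only illustrate the renaming pattern.
RENAME_RULES = [
    ("backbone.stem.", "stem."),
    ("backbone.layers.", "stages."),
    ("classifier.", "head.fc."),
]


def remap_state_dict(state_dict, rules=RENAME_RULES):
    """Return a copy of state_dict with the first matching prefix rule applied to each key."""
    out = OrderedDict()
    for key, value in state_dict.items():
        new_key = key
        for old, new in rules:
            if key.startswith(old):
                new_key = new + key[len(old):]
                break  # only the first matching rule is applied
        if new_key in out:
            # Two old keys mapping to one new key would silently drop weights.
            raise KeyError(f"remap collision on {new_key!r}")
        out[new_key] = value
    return out


# Stand-in values instead of tensors, so this runs without torch.
old_sd = {"backbone.stem.conv.weight": 1, "classifier.bias": 2, "norm.weight": 3}
new_sd = remap_state_dict(old_sd)
print(list(new_sd))
```

Keys that match no rule pass through unchanged, and values are never touched, which is why re-checking accuracy after the remap is a reasonable sanity test.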

Unfortunately the diff of the model got messed up (you can't see what was changed): your commit was a mix of CRLF and LF line endings, and it got cleaned to LF only, which touched every line.
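A rough sketch of the line-ending problem, assuming nothing about how the cleanup was actually done here: the helper names below are illustrative, and the sample bytes are invented. It counts the mixed endings in a file's bytes and normalizes them to LF.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF and bare LF line endings in raw file bytes."""
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf  # every CRLF also contains one LF
    return {"crlf": crlf, "lf": lf}


def normalize_newlines(data: bytes) -> bytes:
    """Convert CRLF (and any lone CR) line endings to LF."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


# Invented sample with mixed endings: two CRLF lines and one LF line.
mixed = b"import torch\r\nimport timm\nx = 1\r\n"
before = count_line_endings(mixed)
cleaned = normalize_newlines(mixed)
after = count_line_endings(cleaned)
print(before, after)
```

Normalizing in a separate commit before making functional changes keeps the later diff readable. Git's `--ignore-cr-at-eol` diff option can also hide changes that differ only in a trailing carriage return.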

An interesting model for higher resolution.

rwightman avatar Dec 09 '25 22:12 rwightman

I may add a few more small things like grad checkpointing, and then I guess I'll push a remapped checkpoint to the timm org that references the original.

rwightman avatar Dec 09 '25 22:12 rwightman

Thanks a lot for taking over and polishing the implementation. Let me know if you need any additional details about the training setup or checkpoints.

For reproducibility and detailed training recipes, I’ve documented everything in the Hugging Face model card: https://huggingface.co/Hyunil/CSATv2

gusdlf93 avatar Dec 10 '25 10:12 gusdlf93

@gusdlf93 okay, thanks. I'm probably not going to get a chance to merge this for a few more days. I feel it's in a good state, but I have a few days off and wanted to check a few more small things.

rwightman avatar Dec 10 '25 17:12 rwightman

@gusdlf93 ready to merge, just letting the final test run. I've pushed the weights to https://huggingface.co/timm/csatv2 and copied over your model card info. The timm impl of the arch now supports changing model widths/depths via args, so other related models can be defined.

rwightman avatar Dec 12 '25 18:12 rwightman

Also, if you could clarify the license for the model card, that'd be great. Is it Apache 2.0?

rwightman avatar Dec 12 '25 18:12 rwightman

The model is licensed under Apache 2.0, which permits commercial use.

Also, I heard that you are currently on vacation. I truly appreciate you taking the time to review this PR despite your time off. Thank you so much for your hard work and dedication to this project.

gusdlf93 avatar Dec 13 '25 05:12 gusdlf93