Giovanni Puccetti
@rom1504 I am moving forward; if you have time, could you just have a look at how the cross attention and decoder are added to the existing models to see if...
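To make the question concrete, roughly the kind of block I mean: a decoder layer that keeps the usual self-attention and adds cross attention onto the encoded image tokens. This is a minimal sketch; the class and argument names are illustrative, not the ones in this PR.

```python
import torch
from torch import nn

class CrossAttentionBlock(nn.Module):
    """Illustrative decoder block: self-attention over the text tokens,
    then cross-attention from the text tokens onto the encoded image tokens."""

    def __init__(self, width: int, heads: int):
        super().__init__()
        self.ln_1 = nn.LayerNorm(width)
        self.self_attn = nn.MultiheadAttention(width, heads, batch_first=True)
        self.ln_2 = nn.LayerNorm(width)
        self.cross_attn = nn.MultiheadAttention(width, heads, batch_first=True)
        self.ln_3 = nn.LayerNorm(width)
        self.mlp = nn.Sequential(
            nn.Linear(width, width * 4),
            nn.GELU(),
            nn.Linear(width * 4, width),
        )

    def forward(self, text_tokens, image_tokens, attn_mask=None):
        # (causal) self-attention over the text sequence
        y = self.ln_1(text_tokens)
        x = text_tokens + self.self_attn(y, y, y, attn_mask=attn_mask, need_weights=False)[0]
        # cross-attention: text queries attend to image keys/values
        y = self.ln_2(x)
        x = x + self.cross_attn(y, image_tokens, image_tokens, need_weights=False)[0]
        # feed-forward
        x = x + self.mlp(self.ln_3(x))
        return x
```

The idea would be to stack a few of these on top of the existing text tower rather than touching the tower itself.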
@iejMac thanks for the feedback! The reason for those `CoCa(Text/Vision)Transformer` models was that I think the forward of `(Text/Vision)Transformer` needs to be changed a bit; however, after thinking about it,...
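The change I have in mind is roughly this: the captioning decoder needs the full token sequence, not only the pooled embedding, so the forward has to expose both. A minimal sketch with made-up names (not the actual open_clip classes):

```python
import torch
from torch import nn

class TextEncoderWithTokens(nn.Module):
    """Illustrative only: a text tower whose forward returns both the pooled
    embedding (for the contrastive loss) and the per-token features
    (for the captioning decoder), instead of the pooled embedding alone."""

    def __init__(self, vocab_size=49408, width=512, heads=8, layers=12, context_length=77):
        super().__init__()
        self.token_embedding = nn.Embedding(vocab_size, width)
        self.positional_embedding = nn.Parameter(torch.zeros(context_length, width))
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=width, nhead=heads, dim_feedforward=width * 4,
            batch_first=True, norm_first=True,
        )
        self.transformer = nn.TransformerEncoder(encoder_layer, num_layers=layers)
        self.ln_final = nn.LayerNorm(width)

    def forward(self, text):
        x = self.token_embedding(text) + self.positional_embedding[: text.shape[1]]
        tokens = self.ln_final(self.transformer(x))
        # pool at the EOT token position, as CLIP does
        pooled = tokens[torch.arange(tokens.shape[0]), text.argmax(dim=-1)]
        return pooled, tokens
```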
Hi @iejMac, I should have done most of it; if you can have a look, it would be highly appreciated! The one thing I did differently from what you said...
Also, I don't know why, but it looks like pytest is not in the environment used by GitHub
@iejMac One thing: is the idea that either both the tokeniser and the model are HF, or neither of them is? Or is mixing allowed?
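In other words, is the intended coupling something like the sketch below? The `hf_model_name` field and the helper are just how I am picturing it, not necessarily what open_clip exposes.

```python
from transformers import AutoTokenizer
import open_clip

def pick_tokenizer(text_cfg: dict):
    """Hypothetical helper: if the text tower is an HF model, use the matching
    HF tokenizer; otherwise fall back to open_clip's built-in tokenizer."""
    if "hf_model_name" in text_cfg:
        return AutoTokenizer.from_pretrained(text_cfg["hf_model_name"])
    return open_clip.tokenize
```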
I will do all of the above. About it not running: I was changing some things just now; it should run now, though it is not finished yet.
> ok looking at the config further, it seems very different from the clip configs. could you add a config looking as much as possible like clip ViT-B/32 so we could...
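For reference, what I understand is being asked for: a config whose vision/text fields mirror the existing `ViT-B-32.json`, with only the extra multimodal part added on top. A sketch written as a Python dict (the actual file would be a JSON model config; the `multimodal_cfg` keys are a tentative guess, not final):

```python
# vision_cfg / text_cfg copied from the standard ViT-B-32 config;
# multimodal_cfg (the captioning decoder) is a tentative sketch.
coca_vit_b_32 = {
    "embed_dim": 512,
    "vision_cfg": {
        "image_size": 224,
        "layers": 12,
        "width": 768,
        "patch_size": 32,
    },
    "text_cfg": {
        "context_length": 77,
        "vocab_size": 49408,
        "width": 512,
        "heads": 8,
        "layers": 12,
    },
    "multimodal_cfg": {
        "width": 512,
        "heads": 8,
        "layers": 12,
    },
}
```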
> ok yeah it's not about gradient checkpointing. with local batch size of 1, it OOM on 80GB a100

I see it running with the same config as the test,...
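To narrow this down, I could log peak memory per step in both setups and compare where they diverge; something along these lines (a sketch, not wired into the training loop):

```python
import torch

def log_peak_memory(step: int):
    # report and reset the peak GPU memory allocated since the last call
    peak_gib = torch.cuda.max_memory_allocated() / 1024 ** 3
    print(f"step {step}: peak allocated {peak_gib:.1f} GiB")
    torch.cuda.reset_peak_memory_stats()
```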
> ah I ran it in multi-node mode, could there be something wrong with distribution ?

Yeah, that could indeed be; I am not 100% confident about it and I...
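One check I can add to rule distribution in or out (just a sketch): print what each process thinks its rank, world size, and device are at startup, and compare the single-node and multi-node runs.

```python
import os
import torch
import torch.distributed as dist

def report_dist_setup():
    # print the distributed environment as seen by this process
    print(
        f"RANK={os.environ.get('RANK')} "
        f"LOCAL_RANK={os.environ.get('LOCAL_RANK')} "
        f"WORLD_SIZE={os.environ.get('WORLD_SIZE')} "
        f"cuda_device={torch.cuda.current_device() if torch.cuda.is_available() else None} "
        f"dist_initialized={dist.is_initialized()}"
    )
```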
@rom1504 and also in single-node mode I see it running: https://wandb.ai/gpucce/open-clip/runs/1s3zlgri