Ross Wightman
no open_clip_config.json was pushed by whoever uploaded this model, so the hf-hub method won't work, as it sources the model config from the hub instead of open_clip's built-in configs...
So, been thinking about this one. I really don't like the `is_training` flag; it's not done this way elsewhere. The label shift is standard, but why do we need to truncate...
Yeah, I don't like `embed_cls` either. Truncating the text input first, outside of the forward, à la `self.encode_text(text[:, :-1])`, is the 'normal' approach, but I wasn't sure if that would impact the...
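For anyone following along, a minimal sketch of the standard setup being referred to, using plain Python lists in place of tensors (hypothetical example, not the actual CoCa code): the decoder input drops the last token and the labels drop the first, so no training flag is needed inside the forward.

```python
# Sketch of the standard label-shift setup for a captioning decoder,
# with plain lists standing in for token tensors (hypothetical example).

def shift_for_training(tokens):
    """Given a tokenized caption, return (decoder_input, labels).

    The input drops the last token (like text[:, :-1]) and the labels
    drop the first, so position i is trained to predict token i+1.
    """
    return tokens[:-1], tokens[1:]

caption = ["<start>", "a", "cat", "<end>"]
inp, labels = shift_for_training(caption)
# inp    -> ["<start>", "a", "cat"]
# labels -> ["a", "cat", "<end>"]
```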
merged through #877 with minor changes
@gpucce do you have any idea what might be causing it? What's the symptom, and by how much is it 'off'? There are numerical changes across versions of PyTorch, etc...
@gpucce have you run the same random inputs through the different towers and saved the results, to verify closeness within some float eps on the same env but with current main vs. the previous release?...
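A rough sketch of the kind of closeness check meant here, with hypothetical saved outputs flattened to plain Python lists (in practice you'd save tensors and compare with something like `torch.allclose`):

```python
# Compare saved tower outputs from two versions of the code, run on the
# same env with the same random inputs (values below are made up).

def max_abs_diff(a, b):
    """Element-wise max absolute difference between two flat output lists."""
    assert len(a) == len(b)
    return max(abs(x - y) for x, y in zip(a, b))

out_main = [0.1234, -0.5678, 0.9012]      # current main branch
out_release = [0.1234, -0.5679, 0.9012]   # previous release

eps = 1e-3  # tolerance for float noise across versions
assert max_abs_diff(out_main, out_release) < eps
```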
Also, not sure if this is a factor, but HF generate functionality might have changed slightly over transformers versions in a way that impacted how it was being used here... On...
FWIW, using your cat.jpg I get `'a cat sitting on its hind legs looking up . '` for both PT 2.1 w/ transformers 4.34 and latest main branch AND same...
@gpucce I'd avoid using the singleton tokenizer via `open_clip.tokenize()`, and use the factory to get one for your model. But yeah, the CoCa configs say context length is 76...
@kkjh0723 I think it might break with gradient checkpointing? Not sure there is a workaround; possibly using non-reentrant mode?
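For reference, a minimal sketch of the non-reentrant mode mentioned, using the standard `torch.utils.checkpoint` API (whether it actually resolves the incompatibility here is untested; `block` is just a stand-in):

```python
import torch
from torch.utils.checkpoint import checkpoint

def block(x):
    # stand-in for a transformer block
    return torch.relu(x) * 2

x = torch.ones(4, requires_grad=True)

# Non-reentrant mode recomputes activations in the backward without the
# reentrant autograd machinery, which avoids some hook/graph issues that
# the default (reentrant) mode can hit.
y = checkpoint(block, x, use_reentrant=False)
y.sum().backward()
```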