iejMac

Results: 115 comments of iejMac

Hi, current state: we've implemented a PreTrainedTextEncoder class that has an interface compatible with the TextTransformer class from the grad caching PR. This is roughly equivalent to the first...
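For illustration only, a minimal sketch of what such a wrapper could look like: an HF encoder exposed through a TextTransformer-style `forward(text) -> embedding` interface. The class body, pooling choice, and projection here are assumptions, not the actual code in the branch.

```python
import torch
from torch import nn
from transformers import AutoModel


class PreTrainedTextEncoder(nn.Module):
    """Illustrative wrapper: expose a Hugging Face encoder through a
    TextTransformer-like interface (token ids in, CLIP embedding out)."""

    def __init__(self, model_name: str, output_dim: int):
        super().__init__()
        self.transformer = AutoModel.from_pretrained(model_name)
        hidden = self.transformer.config.hidden_size
        # project the pooled hidden state into CLIP's shared embedding space
        self.proj = nn.Linear(hidden, output_dim, bias=False)

    def forward(self, text: torch.Tensor) -> torch.Tensor:
        # text: [batch, seq_len] token ids produced by the matching HF tokenizer
        attn_mask = (text != self.transformer.config.pad_token_id).long()
        out = self.transformer(input_ids=text, attention_mask=attn_mask)
        pooled = out.last_hidden_state[:, 0, :]  # CLS-position pooling
        return self.proj(pooled)
```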

@rom1504 Currently the working version is in [this branch](https://github.com/iejMac/open_clip/tree/dev-integration), where I added the README section. FYI: @arampacha detailed why we branched off [here](https://github.com/mlfoundations/open_clip/pull/93#issuecomment-1142593779). I'm not sure we can merge until...

@rwightman The working version is on the dev-integration branch of my fork. We kept the two separate since that branch is based on your grad caching PR. The few experiments on cc3m...

@rom1504 @rwightman What I currently have in this PR works on the stability cluster with the example files I temporarily added: test-roberta.json (config) and hf_run.sh (training script); still needs testing...

Hey, nice work! I left some very minor comments, but I still need to look at the HF stuff in more detail. I'll do that later.

For the HF stuff, from my quick look I have two concerns: 1. Do all HF models have resize_token_embeddings? 2. What is the point of embed_cls? We already have a CLS...
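For reference on point 1, this is the vocabulary-resizing call in question on a transformers model; the checkpoint name and the added `<new_cls>` token below are only stand-ins for illustration.

```python
from transformers import AutoModel, AutoTokenizer

# roberta-base is just a stand-in checkpoint for this example
tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModel.from_pretrained("roberta-base")

# add a hypothetical special token, then grow the embedding matrix to match
tokenizer.add_special_tokens({"additional_special_tokens": ["<new_cls>"]})
model.resize_token_embeddings(len(tokenizer))
```

The concern is whether every architecture loadable through AutoModel supports this method in the same way.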

@gpucce Re point 2: I think we actually want the CLS tokens to be the same. The idea would be to tune the CLS output embedding so that it's useful...
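To make the CLS-pooling idea concrete, a rough sketch (names and checkpoint are illustrative, not the PR's actual code): the hidden state at the CLS position is taken as the text embedding, and contrastive fine-tuning is what makes that output useful.

```python
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-base")  # stand-in checkpoint
encoder = AutoModel.from_pretrained("roberta-base")

inputs = tokenizer(["a photo of a dog"], return_tensors="pt")
out = encoder(**inputs)

# hidden state at the CLS position (index 0) used as the text embedding,
# shape: [batch, hidden_size]
cls_embedding = out.last_hidden_state[:, 0, :]
```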

Let me know what you think, and also whether you're busy. I can actually get to it myself in case you are.