Rok Novosel
Does it happen consistently every time you run the snippet? I've just run the snippet ~1000 times on similar configurations (Mac + Linux) and it did not segfault. Also...
I tried with python 3.7.7, worked for me: ``` ➜ test python Python 3.7.7 (default, Sep 22 2020, 17:05:00) [GCC 7.5.0] on linux Type "help", "copyright", "credits" or "license" for...
@camdencheek Things didn't improve on the CodeSearchNet dataset. The CSN dataset contains short keyword-like queries that don't overlap with the repo/file names. That's probably why I didn't see any improvement....
> > Maybe we need to do a better job at filtering out such low-value chunks? > > This feels like a situation where we're getting hit by vector normalization....
@camdencheek I merged your PR, removing retries from the methods. Can you check if I managed to do it correctly? 🙂
Thanks, that is very helpful! 1. For gradient checkpointing, I've found this reference implementation for contrastive learning: https://sourcegraph.com/github.com/microsoft/LoRA/-/blob/examples/NLU/examples/research_projects/longform-qa/eli5_utils.py?L128-164 Is this a practical approach, or are there more effective ones? 2....
I'm using a native PyTorch training loop, but I can try to migrate to the Trainer class for the gradient checkpointing and the DeepSpeed integration. Hopefully, that gets me to...
@intfloat One additional question: what is the difference between the e5-{small,base,large} and e5-{small,base,large}-v2 models on Hugging Face?