tokenization issue for code
Is this still a bug for tokenization? I want to use this for code. Thanks!
If you are talking about the fast tokenizer, it was fixed on the main branch of transformers. AFAIK it hasn't been tagged in a release yet.
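For reference, here is a minimal sketch (mine, not from this thread) of how you might verify the fix after installing transformers from main: compare the fast and slow tokenizers on a whitespace-sensitive snippet. The v1 repo id `openlm-research/open_llama_7b` is my assumption here.

```python
# Hypothetical check, assuming the v1 repo id openlm-research/open_llama_7b.
# Install transformers from the main branch first:
#   pip install git+https://github.com/huggingface/transformers
from transformers import AutoTokenizer

code = "def f():\n    return 1  # indentation and double spaces matter"

slow = AutoTokenizer.from_pretrained("openlm-research/open_llama_7b", use_fast=False)
fast = AutoTokenizer.from_pretrained("openlm-research/open_llama_7b", use_fast=True)

# If the fast-tokenizer fix is in, both should round-trip the whitespace identically.
print(slow.decode(slow(code).input_ids, skip_special_tokens=True))
print(fast.decode(fast(code).input_ids, skip_special_tokens=True))
```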
Probably a duplicate of #40?
Check out our OpenLLaMA v2 model, which has a new tokenizer and is pretrained on a lot of code. The official release will happen very soon.
Can we use the old models, or how does this work? Do we just load the old model with the new tokenizer?
@brando90 The v2 model is a completely different one, trained on a new data mixture, so you'll need to load the new weights too.
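In case it helps, a minimal sketch of loading the v2 weights and tokenizer together, with the model id taken from the link above (https://huggingface.co/openlm-research/open_llama_7b_v2); the prompt and generation settings are just illustrative:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "openlm-research/open_llama_7b_v2"  # v2 repo linked above

# The v2 tokenizer and weights ship in the same repo, so load both from it.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```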
Got it, thanks! I will assume v1 OpenLLaMA is basically unusable for code gen (what I want) and use only v2.
@brando90 Yeah, you'll probably want to use v2 in almost all cases, since it is a better model overall.