Samuel Azran
Same issue here
Ha

> **Bug description**
> When running `parlai interactive --model-file zoo:bb3/bb3_3B/model --init-opt gen/r2c2_bb3` for BB3, it gives an attribute error.
>
> **Reproduction steps**
>
> 1. python setup.py develop
> 2. ...
> ParlAI/parlai/scripts/chat_model.py

Can you share your custom script, please?
> Hello, thank you very much for such excellent work. We have conducted some experiments using Llama-Factory, and the results indicate that Galore can significantly reduce memory usage during full...
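(Not from the original thread, but for context: a minimal sketch of how GaLore is typically wired in via the `galore-torch` optimizer. The toy model and the `rank` / `update_proj_gap` / `scale` values are illustrative assumptions, not the Llama-Factory configuration referenced above.)

```python
import torch
import torch.nn as nn
from galore_torch import GaLoreAdamW  # pip install galore-torch

# Toy stand-in for a transformer; in practice this would be the full LLM.
model = nn.Sequential(nn.Linear(512, 2048), nn.GELU(), nn.Linear(2048, 512))

# GaLore projects gradients of 2D weight matrices into a low-rank subspace;
# biases and other 1D parameters keep the ordinary AdamW update.
galore_params = [p for p in model.parameters() if p.dim() == 2]
regular_params = [p for p in model.parameters() if p.dim() != 2]

param_groups = [
    {"params": regular_params},
    {"params": galore_params, "rank": 128, "update_proj_gap": 200,
     "scale": 0.25, "proj_type": "std"},
]
optimizer = GaLoreAdamW(param_groups, lr=1e-4)

# One illustrative optimization step.
x = torch.randn(8, 512)
loss = model(x).pow(2).mean()
loss.backward()
optimizer.step()
optimizer.zero_grad()
```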
@Ph0rk0z any idea what the plan is for the release date of further checkpoints? I think training it on more than 1 trillion tokens could give it an advantage compared to other pre-trained...
> It works. Qwen's tokenizer is based on tiktoken; I added the tokenizer (tokenization_qwen.py) from its Hugging Face repo without any revision. This makes the code a little complicated, so maybe do...
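(For reference, a minimal sketch of loading that tiktoken-based Qwen tokenizer through `transformers`, which fetches `tokenization_qwen.py` from the model repo at load time; the `Qwen/Qwen-7B` repo id is an assumption, not something stated above.)

```python
from transformers import AutoTokenizer

# trust_remote_code=True lets transformers run tokenization_qwen.py
# shipped in the Hugging Face model repo.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen-7B", trust_remote_code=True)

ids = tokenizer.encode("Hello, world!")
print(ids)
print(tokenizer.decode(ids))
```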
Any experience using it for more than 4096 tokens? Any idea when checkpoints trained on more than 1 trillion tokens will be ready?
I'm also wondering whether it is a full-weights fine-tune or a LoRA fine-tune of Gemma. Is there any public info on how it was trained?