gemma_pytorch
The official PyTorch implementation of Google's Gemma models
Hello! In the PyTorch implementation, the MLP uses the exact GeLU as its gating function. In the JAX version, the approximate GeLU is used. Could...
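For context on the question above, here is a minimal pure-Python sketch of the two GeLU variants (helper names are hypothetical): PyTorch's `F.gelu` defaults to the exact erf-based form, while `jax.nn.gelu` defaults to the tanh approximation.

```python
import math

def gelu_exact(x):
    # Exact GeLU: x * Phi(x), where Phi is the standard normal CDF
    # (what torch.nn.functional.gelu computes with approximate="none").
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def gelu_tanh(x):
    # Tanh approximation, the default in jax.nn.gelu (approximate=True).
    c = math.sqrt(2.0 / math.pi)
    return 0.5 * x * (1.0 + math.tanh(c * (x + 0.044715 * x ** 3)))

# The two variants agree closely but not exactly over typical ranges,
# which is why mixing them across implementations can shift outputs.
max_diff = max(abs(gelu_exact(x) - gelu_tanh(x))
               for x in [i / 10 for i in range(-40, 41)])
print(f"max |exact - tanh| on [-4, 4]: {max_diff:.6f}")
```

The difference per activation is tiny, but it compounds across layers, which is presumably why the mismatch between the PyTorch and JAX versions is worth flagging.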
Build the image from the Dockerfile, then run the container. Error: `IsADirectoryError: [Errno 21] Is a directory: '/tmp/ckpt'`. It seems there is no weight file in...
After deploying google/gemma-7b-it, there is always an error response when sending any message. Response: `Of course! Here are some creative ideas for a 10-year-old's birthday party:`
is it possible to convert gemma_pytorch to onnx to tflite?
Hello there 👋 Thanks for the repo. But I have one question: why do we need to scale up (normalize) token embeddings? https://github.com/google/gemma_pytorch/blob/01062c9ef4cf89ac0c985b25a734164ede017d0b/gemma/model.py#L431-L432 Unfortunately, I cannot find an answer anywhere.
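On the embedding-scaling question: the linked lines multiply the token embeddings by `sqrt(hidden_size)`, the same scaling "Attention Is All You Need" applies to its (tied) embedding matrix so embedding magnitudes match the rest of the residual stream. A toy sketch of the effect, with a hypothetical table and sizes rather than the repo's actual code:

```python
import math

hidden_size = 4
# Toy [vocab_size, hidden_size] embedding table (hypothetical values).
table = [[0.1] * hidden_size, [0.2] * hidden_size]

def embed(token_id):
    # After the lookup, scale by sqrt(hidden_size), mirroring the
    # normalizer applied in gemma/model.py after the embedding layer.
    normalizer = math.sqrt(hidden_size)
    return [v * normalizer for v in table[token_id]]

print(embed(0))  # each 0.1 entry becomes 0.1 * sqrt(4) = 0.2
```

One common rationale: with tied input/output embeddings, the same matrix serves as both lookup table and output projection, and the `sqrt(d_model)` factor keeps the embedding scale appropriate for the first layer's input.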
```bash
markusheimerl@t1v-n-a16d1e4e-w-0:~/gimli$ cd ~/gemma_cktp/ && curl -o archive.tar.gz "https://storage.googleapis.com/kaggle-models-data/5305/11357/bundle/archive.tar.gz?X-Goog-Algorithm=GOOG4-RSA-SHA256..." && tar -xf archive.tar.gz && cd ~/gimli
markusheimerl@t1v-n-a16d1e4e-w-0:~/gimli$ cd ../gemma_pytorch/
markusheimerl@t1v-n-a16d1e4e-w-0:~/gemma_pytorch$ VARIANT=2b
markusheimerl@t1v-n-a16d1e4e-w-0:~/gemma_pytorch$ CKPT_PATH=/home/markusheimerl/gemma_ckpt/
markusheimerl@t1v-n-a16d1e4e-w-0:~/gemma_pytorch$ sudo usermod -aG docker $USER...
```
How do I fine-tune Gemma with PyTorch? There seems to be fine-tuning code on Hugging Face, but it cannot be used directly with this repo. Thanks
Just a few more Gemma fixes :) Currently checking for more as well! Related PR: https://github.com/huggingface/transformers/pull/29285, which showed that RoPE must be computed in float32 rather than float16, since float16 causes positional encodings...
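To illustrate the float32-vs-float16 point above, here is a small stdlib-only sketch (the head dim and base are assumed values, and `struct`'s `'e'` format is used to emulate IEEE half precision): RoPE rotates each dimension pair by an angle that grows with position, and at large positions a float16 angle loses enough precision to change the rotation.

```python
import math
import struct

def to_fp16(x):
    # Round-trip a value through IEEE half precision ('e' format),
    # emulating what storing the angle in float16 would do.
    return struct.unpack('e', struct.pack('e', x))[0]

def rope_angle(pos, i, dim=256, base=10000.0):
    # RoPE rotates dimension pair i at position pos by this angle
    # (dim and base are illustrative, not the repo's exact config).
    return pos * base ** (-2 * i / dim)

full = rope_angle(2047, 2)   # ~1772.6 radians in float64
half = to_fp16(full)         # fp16 spacing near 2048 is a full 1.0
print(full, half, abs(full - half))
```

The rounded angle is off by a large fraction of a radian, so the resulting cos/sin rotation no longer matches the float32 one, which is consistent with the positional-encoding drift the PR fixes.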
To ensure that the exception is handled correctly, `raise` should be used instead of `return`.
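A minimal sketch of the difference, with hypothetical function names rather than the repo's actual code: returning the caught exception hands the caller an exception *object* and silently continues, while re-raising propagates it so the caller's own handler runs.

```python
def run_step_bad():
    # Pattern the issue describes: the exception is caught and then
    # returned, so it is swallowed instead of propagated.
    try:
        raise IsADirectoryError(21, "Is a directory", "/tmp/ckpt")
    except IsADirectoryError as e:
        return e  # bug: caller receives a value, no exception is raised

def run_step_good():
    try:
        raise IsADirectoryError(21, "Is a directory", "/tmp/ckpt")
    except IsADirectoryError:
        raise  # correct: re-raise so the caller's except block fires

result = run_step_bad()
print(isinstance(result, IsADirectoryError))  # True, yet nothing was handled

try:
    run_step_good()
except IsADirectoryError as exc:
    print("caller handled:", exc)
```

With `return`, code after the failed call keeps running against a bogus value; with `raise`, the error surfaces at the call site where it can actually be handled.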