
The official PyTorch implementation of Google's Gemma models

50 gemma_pytorch issues

Hello! In the PyTorch implementation, exact GELU is used as the gating function in the MLP. ![image](https://github.com/google/gemma_pytorch/assets/78988918/84e902b1-8de1-4030-af73-f191598aa875) ![image](https://github.com/google/gemma_pytorch/assets/78988918/463c0ed1-5c41-473c-92a8-282ecbce753b) In the JAX version, the approximate GELU is used. ![image](https://github.com/google/gemma_pytorch/assets/78988918/ccb2f80f-f340-4fbd-90a6-c636330c8103) ![image](https://github.com/google/gemma_pytorch/assets/78988918/d8e99898-acd3-47fb-83e8-b134715fc98a) Could...
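
For anyone comparing the two numerically, a minimal PyTorch sketch of the difference, assuming the JAX side uses the tanh approximation (which is `jax.nn.gelu`'s default):

```python
import torch
import torch.nn.functional as F

x = torch.randn(4, 8)

# Exact GELU: x * Phi(x), computed via erf (PyTorch's default).
exact = F.gelu(x)

# Tanh approximation of GELU, which jax.nn.gelu uses by default.
approx = F.gelu(x, approximate="tanh")

# The two differ slightly; with a checkpoint trained against one variant,
# evaluating with the other introduces a small numerical mismatch.
print((exact - approx).abs().max())
```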

Built the image from the Dockerfile, then ran the container. Error: `IsADirectoryError: [Errno 21] Is a directory: '/tmp/ckpt'`. It is likely that there is no weight file in...
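
A minimal guard sketch, assuming the container passes the path straight to `torch.load` (the variable name and message below are illustrative):

```python
import os
import torch

ckpt_path = "/tmp/ckpt"  # the mounted checkpoint location from the report

# IsADirectoryError here usually means the path names the mount directory
# rather than the weight file inside it; fail early with a clearer message.
if os.path.isdir(ckpt_path):
    raise FileNotFoundError(
        f"{ckpt_path} is a directory; point CKPT_PATH at the checkpoint "
        "file inside it instead"
    )
state = torch.load(ckpt_path, map_location="cpu")
```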

type:support
stat:awaiting response

After deploying google/gemma-7b-it, there is always an error response when sending any message. Response: `Of course! Here are some creative ideas for a 10-year-old's birthday party:`

bug

Is it possible to convert gemma_pytorch to ONNX and then to TFLite?
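
The repo has no official converter; below is a rough sketch of the first hop (PyTorch → ONNX) with a toy stand-in model. The real GemmaForCausalLM forward takes positions and KV caches, so a wrapper exposing a plain `forward(input_ids) -> logits` would be needed first; the ONNX → TFLite hop would go through a separate tool such as onnx2tf, which is not shown.

```python
import torch

# Toy stand-in for a causal LM; not the repo's model.
class ToyCausalLM(torch.nn.Module):
    def __init__(self, vocab=100, dim=32):
        super().__init__()
        self.emb = torch.nn.Embedding(vocab, dim)
        self.head = torch.nn.Linear(dim, vocab)

    def forward(self, input_ids):
        return self.head(self.emb(input_ids))  # [batch, seq, vocab]

model = ToyCausalLM().eval()
dummy = torch.zeros(1, 16, dtype=torch.long)
torch.onnx.export(
    model,
    (dummy,),
    "model.onnx",
    input_names=["input_ids"],
    output_names=["logits"],
    dynamic_axes={"input_ids": {0: "batch", 1: "seq"}},
)
```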

type:support
stat:awaiting response

Hello there 👋 Thanks for the repo. But I have one question: why do we need to scale up (normalize) token embeddings? https://github.com/google/gemma_pytorch/blob/01062c9ef4cf89ac0c985b25a734164ede017d0b/gemma/model.py#L431-L432 Unfortunately, I cannot find an answer anywhere.
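
For reference, the linked lines scale the embedding output by sqrt(hidden_size); a standalone illustration of the same step (sizes here are example values, not the repo's config):

```python
import torch

hidden_size = 2048
embedder = torch.nn.Embedding(256_000, hidden_size)

token_ids = torch.tensor([[2, 651, 9456]])
hidden_states = embedder(token_ids)

# Scale the embedding output by sqrt(hidden_size) before the first
# decoder layer, as the referenced lines in model.py do.
normalizer = torch.tensor(hidden_size**0.5, dtype=hidden_states.dtype)
hidden_states = hidden_states * normalizer
```

The usual rationale, following the original Transformer's sqrt(d_model) scaling, is that with embedding weights initialized at small variance (and tied to the output projection), the multiplication brings the initial hidden states up to the scale the rest of the network expects.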

```bash
markusheimerl@t1v-n-a16d1e4e-w-0:~/gimli$ cd ~/gemma_cktp/ && curl -o archive.tar.gz "https://storage.googleapis.com/kaggle-models-data/5305/11357/bundle/archive.tar.gz?X-Goog-Algorithm=GOOG4-RSA-SHA256..." && tar -xf archive.tar.gz && cd ~/gimli
markusheimerl@t1v-n-a16d1e4e-w-0:~/gimli$ cd ../gemma_pytorch/
markusheimerl@t1v-n-a16d1e4e-w-0:~/gemma_pytorch$ VARIANT=2b
markusheimerl@t1v-n-a16d1e4e-w-0:~/gemma_pytorch$ CKPT_PATH=/home/markusheimerl/gemma_ckpt/
markusheimerl@t1v-n-a16d1e4e-w-0:~/gemma_pytorch$ sudo usermod -aG docker $USER...
```

type:support

How do I fine-tune Gemma with PyTorch? There seems to be fine-tuning code on Hugging Face, but it cannot be used directly. Thanks
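
There is no fine-tuning script in this repo; the loop below is a generic, self-contained PyTorch sketch of causal-LM fine-tuning, with a toy model standing in for Gemma (the real model's forward signature differs, taking positions and KV caches):

```python
import torch

# Toy causal LM used only so the sketch runs end to end.
class TinyLM(torch.nn.Module):
    def __init__(self, vocab=100, dim=32):
        super().__init__()
        self.emb = torch.nn.Embedding(vocab, dim)
        self.head = torch.nn.Linear(dim, vocab)

    def forward(self, input_ids):
        return self.head(self.emb(input_ids))  # [batch, seq, vocab]

model = TinyLM()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
loss_fn = torch.nn.CrossEntropyLoss()

model.train()
for _ in range(3):  # stand-in for iterating over a real dataset
    input_ids = torch.randint(0, 100, (2, 16))
    targets = torch.roll(input_ids, shifts=-1, dims=1)  # next-token targets
    logits = model(input_ids)
    loss = loss_fn(logits.reshape(-1, logits.size(-1)), targets.reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```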

Just a few more Gemma fixes :) Currently checking for more as well! Related PR: https://github.com/huggingface/transformers/pull/29285, which showed RoPE must be done in float32 and not float16, causing positional encodings...
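
A minimal sketch of the float32 point the PR makes (the function name and the half-split rotation layout are illustrative, not the repo's code): compute the rotary angles and the rotation in float32, and cast back to the working dtype only at the end.

```python
import torch

def apply_rope(x: torch.Tensor, positions: torch.Tensor, theta: float = 10000.0):
    # x: [batch, seq, heads, head_dim]; positions: [seq]
    head_dim = x.size(-1)
    inv_freq = 1.0 / theta ** (
        torch.arange(0, head_dim, 2, dtype=torch.float32) / head_dim
    )
    # Angles in float32: in float16 these lose precision at large positions.
    angles = positions.to(torch.float32)[:, None] * inv_freq[None, :]
    cos = angles.cos()[:, None, :]  # [seq, 1, half], broadcasts over heads
    sin = angles.sin()[:, None, :]

    x32 = x.to(torch.float32)
    x1, x2 = x32.chunk(2, dim=-1)
    rotated = torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)
    return rotated.to(x.dtype)  # cast back only after the float32 math
```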

To ensure that the exception is handled correctly, `raise` should be used instead of `return`.
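
A hypothetical illustration of the point (function names and messages are invented): returning an exception object hands it to the caller as an ordinary value, so the call appears to succeed, while raising it actually propagates the failure.

```python
def load_config_bad(path: str):
    if not path.endswith(".json"):
        # Bug: the exception is returned as a value, so callers that
        # ignore the return value never see the failure.
        return ValueError(f"unexpected config path: {path}")

def load_config_good(path: str):
    if not path.endswith(".json"):
        # Correct: the error propagates to the caller.
        raise ValueError(f"unexpected config path: {path}")
```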