gemma-2B-10M

Gemma 2B with 10M context length using Infini-attention.

9 gemma-2B-10M issues, sorted by recently updated

I made a Colab (https://colab.research.google.com/drive/1Z3NdoT0WS8KXnSUS3_xxT39NBZD6eGcN?usp=sharing) to test it and ran into an issue: `GemmaModel.forward() got an unexpected keyword argument 'cache_position'`. I had to change some of main.py to...
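
A hedged sketch of one possible workaround: newer transformers releases pass `cache_position` into the model's forward(), so a custom forward() that doesn't declare it will fail with exactly this error. The signature below is illustrative of the change, not copied from the repo's gemma.py; pinning transformers to an older release is the other obvious route.

```python
# Illustrative only: accept (and ignore) the new kwarg so transformers' generate()
# stops failing. The argument list is assumed, not the repo's actual signature.
def forward(
    self,
    input_ids=None,
    attention_mask=None,
    position_ids=None,
    past_key_values=None,
    inputs_embeds=None,
    use_cache=None,
    output_attentions=None,
    output_hidden_states=None,
    return_dict=None,
    cache_position=None,  # new in recent transformers; safe to accept and ignore here
    **kwargs,
):
    ...
```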

My notebook: Windows 11 Pro 23H2, Intel i7-8750H, GeForce GTX 1050 Ti (Mobile), 32 GB RAM (2666 MHz). After I removed the mention of flash_attn in gemma.py, I got the following error: `TypeError:...
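
A minimal sketch of how the flash_attn dependency could be made optional instead of deleted outright, so the model falls back to standard PyTorch attention on GPUs (like a GTX 1050 Ti) that cannot run FlashAttention. This is not the repo's code, just one way to guard the import.

```python
# Hedged sketch: try FlashAttention, otherwise fall back to PyTorch's built-in
# scaled_dot_product_attention (available since PyTorch 2.0).
import torch.nn.functional as F

try:
    from flash_attn import flash_attn_func
    HAS_FLASH_ATTN = True
except ImportError:
    flash_attn_func = None
    HAS_FLASH_ATTN = False

def attention(q, k, v, causal=True):
    """q, k, v shaped (batch, seqlen, nheads, headdim)."""
    if HAS_FLASH_ATTN:
        return flash_attn_func(q, k, v, causal=causal)
    # fallback expects (batch, nheads, seqlen, headdim), so transpose around the call
    return F.scaled_dot_product_attention(
        q.transpose(1, 2), k.transpose(1, 2), v.transpose(1, 2), is_causal=causal
    ).transpose(1, 2)
```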

I don't quite understand how to install and run it. I downloaded this folder from GitHub and downloaded all 13 files from Hugging Face. What's next? In which folder should...
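
A hedged sketch of one way to fetch the checkpoint into a local folder that the loading code can point at, using huggingface_hub. The repo_id below is an assumption inferred from the project name, and the folder main.py actually expects may differ.

```python
# Hypothetical example: download all checkpoint files into ./gemma-2b-10m.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="mustafaaljadery/gemma-2B-10M",  # assumed repo id, not verified
    local_dir="./gemma-2b-10m",              # folder to point the loading code at
)
print("weights downloaded to:", local_dir)
```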

Congratulations on this super-exciting project! It would be awesome to top it off with a live Gradio demo on [Huggingface Spaces](https://huggingface.co/spaces). I think this could help with more community engagement...
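
For reference, a minimal, self-contained sketch of what such a Spaces demo could look like. The generate_text body is a placeholder standing in for the repo's generate() in main.py; only the Gradio wiring is shown.

```python
import gradio as gr

def generate_text(prompt, max_new_tokens=128):
    # placeholder: wire this up to the repo's generate() from main.py
    return f"(model output for a {len(prompt)}-character prompt)"

demo = gr.Interface(
    fn=generate_text,
    inputs=[
        gr.Textbox(lines=8, label="Prompt"),
        gr.Slider(16, 512, value=128, step=16, label="Max new tokens"),
    ],
    outputs=gr.Textbox(label="Completion"),
    title="gemma-2B-10M (Infini-attention)",
)

if __name__ == "__main__":
    demo.launch()
```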

`generate()` in main.py seems to only process the last 2048 tokens of the input prompt? https://github.com/mustafaaljadery/gemma-2B-10M/blob/cb97c2f686a41d4d54c259437dcdcd4f7f8da5f0/src/main.py#L15C9-L15C54 If a prompt longer than 2048 tokens is entered, then generate seems...
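
For context, a hedged sketch of how segment-wise prefill usually looks with Infini-attention: the prompt is fed in fixed-size windows and a compressive memory is carried across segments, so earlier tokens influence generation instead of being silently truncated to the last 2048. The memory/norm_term handoff below is illustrative and not the repo's actual interface.

```python
import torch

SEGMENT_LEN = 2048  # window size used during prefill

def prefill_long_prompt(model, input_ids: torch.Tensor, segment_len: int = SEGMENT_LEN):
    """Feed the prompt segment by segment so earlier tokens land in the
    compressive memory instead of being dropped (illustrative interface)."""
    memory, norm_term = None, None
    outputs = None
    for start in range(0, input_ids.size(1), segment_len):
        segment = input_ids[:, start:start + segment_len]
        outputs = model(segment, memory=memory, norm_term=norm_term)
        memory, norm_term = outputs.memory, outputs.norm_term
    return outputs, memory, norm_term
```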

Hi, it's really exciting to see a 10M context window. But I don't have 32 GB of memory. Can I limit the context window to 100k to reduce the required memory so it fits...
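
One low-effort option, sketched here under the assumption that the input is a plain token tensor: cap the prompt before prefill. With Infini-attention the compressive memory itself is fixed-size, so peak memory is dominated by the weights plus one segment's activations rather than by the total context length.

```python
MAX_CONTEXT = 100_000  # illustrative cap, not a setting exposed by the repo

def cap_context(input_ids, max_context: int = MAX_CONTEXT):
    # keep only the most recent max_context tokens before prefill
    return input_ids[:, -max_context:]
```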

Hi, can this be fine-tuned with LoRA without any additional script? Also, during fine-tuning, if we use a sequence length of 512 or 1k, will it affect inference for higher...
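
A minimal sketch of attaching LoRA adapters with the peft library; whether this custom Infini-attention model is directly compatible with peft's wrappers is an open question, and the target_modules below are typical Gemma projection names, assumed rather than verified against this repo.

```python
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed names
    task_type="CAUSAL_LM",
)

# `model` is the already-loaded gemma-2B-10M model (loading not shown here)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```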

The model code provided in this repository seems to be a copy of the repository linked below: https://github.com/Beomi/InfiniTransformer/blob/main/infini_gemma/modeling_infini_gemma.py Specifically, GemmaInfiniAttention and GemmaModel seem to be a direct...

For local hardware usage, 24 GB is an interesting number (NVIDIA 3090, etc.). Can you give some idea of what folks might expect from this code on such hardware?
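
Some back-of-the-envelope sizing, stated as an assumption rather than a measured benchmark:

```python
# ~2B parameters in fp16/bf16 is roughly 3.7 GiB of weights.
params = 2e9
bytes_per_param = 2  # fp16 / bf16
weights_gib = params * bytes_per_param / 1024**3
print(f"fp16/bf16 weights: ~{weights_gib:.1f} GiB")  # ~3.7 GiB

# Unlike a standard KV cache, Infini-attention's compressive memory is fixed-size per
# layer/head and does not grow with context length, so the rest of a 24 GiB card's
# budget goes to activations for one segment plus framework overhead.
```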