Nikhil Pavan Kanaka
I'm trying to train mamba2 130m from scratch.

```
config = Mamba2Config(
    vocab_size=len(tokenizer.vocab),
    n_positions=10,
    n_embd=768,
    n_layer=12,
    n_head=12,
    n_inner=3072,
)
model = Mamba2ForCausalLM(config)
```

```
training_args = TrainingArguments(
    output_dir=args.output_dir,
    logging_dir='./logs',
    gradient_accumulation_steps=1,...
```