Nikhil Pavan Kanaka

Results: 1 issue by Nikhil Pavan Kanaka

I'm trying to train mamba2 130m from scratch.

```
config = Mamba2Config(
    vocab_size=len(tokenizer.vocab),
    n_positions=10,
    n_embd=768,
    n_layer=12,
    n_head=12,
    n_inner=3072,
)
model = Mamba2ForCausalLM(config)
```

```
training_args = TrainingArguments(
    output_dir=args.output_dir,
    logging_dir='./logs',
    gradient_accumulation_steps=1,...
```