
46 GaLore issues

In single-GPU mode, I successfully ran training on an RTX 3090, but it took too long. In DDP mode, we get an OOM at `LlamaForCausalLM = torch.nn.parallel.DistributedDataParallel(model, device_ids=[local_rank], output_device=local_rank, broadcast_buffers=False)`.
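For context, a minimal sketch of the DDP setup described in this issue, assuming a `torchrun` launch that sets `LOCAL_RANK` (the checkpoint name is illustrative; the issue does not say which LLaMA variant is used):

```python
import os

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from transformers import LlamaForCausalLM

# Assumes launch via `torchrun`, which sets LOCAL_RANK and the rendezvous env vars.
local_rank = int(os.environ["LOCAL_RANK"])
dist.init_process_group(backend="nccl")
torch.cuda.set_device(local_rank)

# Illustrative checkpoint name.
model = LlamaForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf").to(local_rank)
model = DDP(
    model,
    device_ids=[local_rank],
    output_device=local_rank,
    broadcast_buffers=False,
)
```

DDP keeps per-rank gradient buckets in addition to the model's own parameters and gradients, so peak memory per GPU is higher than in single-GPU training; on a 24 GB RTX 3090 that extra headroom can be enough to trigger the OOM.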

Can GaLore support the LLaVA model?

Hi, thanks for the good work. I'm trying to integrate this into Colossal-AI (https://github.com/hpcaitech/ColossalAI), making it compatible with tensor parallelism and ZeRO. However, I had trouble loading the dataset; it seems they updated the...
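If the dataset trouble here is the relocation of C4 on the Hugging Face Hub (a guess from the truncated snippet), a sketch of the usual workaround:

```python
from datasets import load_dataset

# The original "c4" loading script was removed from the Hub; the data now
# lives under the allenai/ namespace, so older load_dataset("c4", ...) calls
# fail. Streaming keeps the download lazy, as in the repo's pre-training setup.
train_data = load_dataset("allenai/c4", "en", split="train", streaming=True)
val_data = load_dataset("allenai/c4", "en", split="validation", streaming=True)
```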

Hi, thanks very much for sharing your impressive work! Would it be possible to release the trained model (e.g., using the script below)? It would greatly facilitate reproducibility efforts. Thank...

[Jamba](https://huggingface.co/ai21labs/Jamba-v0.1) is a very interesting new model and I'd love to add GaLore support for finetuning it. It's an MoE+Transformer+Mamba hybrid, so I'm not sure how that would work...
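One way GaLore could be scoped to the Transformer parts of such a hybrid, following the param-group pattern shown in the GaLore README. `model` is assumed to be an already-loaded Jamba instance, and the module-name filters and hyperparameter values below are illustrative, not Jamba's actual naming or tuned settings:

```python
from galore_torch import GaLoreAdamW

# Idea: apply the low-rank projection only to 2D weights of attention/MLP
# linears, and leave Mamba/SSM and MoE-router parameters in the regular group.
# `model` is assumed to be an already-loaded Jamba instance.
galore_params, regular_params = [], []
for name, param in model.named_parameters():
    if param.dim() == 2 and ("attn" in name or "mlp" in name):
        galore_params.append(param)
    else:
        regular_params.append(param)

# Param-group keys follow the usage shown in the GaLore README;
# the values here are illustrative.
param_groups = [
    {"params": regular_params},
    {"params": galore_params, "rank": 128, "update_proj_gap": 200,
     "scale": 0.25, "proj_type": "std"},
]
optimizer = GaLoreAdamW(param_groups, lr=1e-2)
```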

Hi, thank you for generously open-sourcing your excellent work. During our experiments, we noticed that there doesn't seem to be a resume/reload path for the optimizer state when using `args.continue_from`. Is our...
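A sketch of what such a resume path could look like, assuming checkpoints are directories holding the HF model files plus a separate optimizer file (the file name and the `update_step` key are hypothetical, not the repo's actual convention):

```python
import os

import torch

def save_checkpoint(path, model, optimizer, scheduler, update_step):
    # Hypothetical layout: HF model files plus an optimizer.pt side file.
    os.makedirs(path, exist_ok=True)
    model.save_pretrained(path)
    torch.save(
        {
            "optimizer": optimizer.state_dict(),
            "scheduler": scheduler.state_dict(),
            "update_step": update_step,
        },
        os.path.join(path, "optimizer.pt"),
    )

def resume_from(path, optimizer, scheduler):
    # Restore optimizer/scheduler state and return the step to resume at.
    state = torch.load(os.path.join(path, "optimizer.pt"), map_location="cpu")
    optimizer.load_state_dict(state["optimizer"])
    scheduler.load_state_dict(state["scheduler"])
    return state["update_step"]
```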

Attempting to use GaLore to finetune a Phi model yields "AttributeError: 'PhiConfig' object has no attribute 'rms_norm_eps'", which, having seen that error in other LLM tooling, typically translates to "this...
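A possible defensive fix, assuming the failure comes from code reading `config.rms_norm_eps` unconditionally; Phi uses standard LayerNorm rather than RMSNorm, and its config exposes `layer_norm_eps` instead, so a fallback avoids the AttributeError:

```python
# Assumes the failing code reads config.rms_norm_eps unconditionally.
# PhiConfig exposes layer_norm_eps (Phi uses LayerNorm, not RMSNorm),
# so fall back to it; 1e-5 is a conventional default epsilon.
eps = getattr(config, "rms_norm_eps", None)
if eps is None:
    eps = getattr(config, "layer_norm_eps", 1e-5)
```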

Hi, thanks for releasing this work! It has all been very interesting to read. However, I do have a few questions regarding your results and methodology. 1. For Table 4....

How exactly did you measure perplexity during pre-training with GaLore (e.g., when creating Figure 5 in your paper, https://arxiv.org/pdf/2403.03507.pdf)? Thanks.
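For reference, the standard recipe is to exponentiate the average per-token cross-entropy on a held-out split; whether this matches the paper's exact protocol is an assumption:

```python
import math

import torch

# Assumes batches with input_ids and attention_mask. The token-count
# weighting is approximate; exact weighting depends on padding and the
# loss's ignore_index handling.
@torch.no_grad()
def evaluate_perplexity(model, eval_loader, device):
    model.eval()
    total_loss, total_tokens = 0.0, 0
    for batch in eval_loader:
        batch = {k: v.to(device) for k, v in batch.items()}
        out = model(**batch, labels=batch["input_ids"])
        n_tokens = batch["attention_mask"].sum().item()
        total_loss += out.loss.item() * n_tokens  # out.loss is a per-token mean
        total_tokens += n_tokens
    return math.exp(total_loss / total_tokens)
```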