Jeremy Jahn
Also want this feature. Merging weights is kind of painful (merging requires a lot of disk space), as sketched below.
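For reference, a minimal sketch of the merge step in question, assuming a PEFT-style workflow (the model and adapter names below are just placeholders):

```python
# Hedged sketch of merging a LoRA adapter into the base weights with PEFT's
# merge_and_unload; every adapter merged this way yields another full-size
# checkpoint on disk, which is where the disk-space cost comes from.
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")   # example base model
merged = PeftModel.from_pretrained(base, "my-org/my-lora-adapter").merge_and_unload()

# Writing the merged model back to disk duplicates the full base checkpoint per adapter.
merged.save_pretrained("llama-2-7b-merged-adapter")
```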
Just wanted to share the S-LoRA paper from Stanford, and found that @iiLaurens has already shared it! > Just noticed a paper discussing an efficient implementation of multi-LoRA serving called S-LoRA. [Link...
> @danthe3rd I also need ALiBi support. For now, I pass `bias = LowerTriangularMaskWithTensorBias(alibi_bias)` to `xops.memory_efficient_attention(..., attn_bias=bias)`. The forward pass alone is OK, but it fails at the backward pass in training mode....
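A minimal sketch of that setup (not the author's exact code): the ALiBi slope formula, tensor shapes, and the slope/bias construction below are assumptions, only the `LowerTriangularMaskWithTensorBias` wrapper and `memory_efficient_attention` call mirror the comment.

```python
# Sketch: passing an ALiBi tensor bias to xFormers memory-efficient attention.
import torch
import xformers.ops as xops
from xformers.ops.fmha.attn_bias import LowerTriangularMaskWithTensorBias

B, H, S, D = 2, 8, 128, 64  # batch, heads, sequence length, head dim (example sizes)

# ALiBi slopes for H heads (simple power-of-two variant, assumed here).
slopes = torch.tensor([2 ** (-8 * (h + 1) / H) for h in range(H)])

# Relative-position bias: bias[h, i, j] = slope_h * (j - i), i.e. a penalty on distant past tokens.
pos = torch.arange(S)
rel = pos[None, :] - pos[:, None]                      # (S, S)
alibi = slopes.view(H, 1, 1) * rel.view(1, S, S)       # (H, S, S)
alibi_bias = alibi.unsqueeze(0).expand(B, H, S, S).to(torch.float16).cuda().contiguous()

q = torch.randn(B, S, H, D, dtype=torch.float16, device="cuda", requires_grad=True)
k = torch.randn_like(q)
v = torch.randn_like(q)

# The causal (lower-triangular) masking is handled by the wrapper; the tensor carries the ALiBi bias.
bias = LowerTriangularMaskWithTensorBias(alibi_bias)
out = xops.memory_efficient_attention(q, k, v, attn_bias=bias)
out.sum().backward()  # the backward step is where the reported failure occurs
```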
@wkcn Will DeepSpeed ZeRO-3 be supported in the future? I saw that FSDP will be supported.
Looking forward to the support in vLLM!
Same question. I saw `dist` in the [script](https://github.com/VITA-Group/Q-GaLore/blob/8200795c687c6ac3b6c69d595275dac0589b7f2b/q_galore_torch/q_galore_adamw8bit.py#L50), but it is not imported in the original [GaLore AdamW8bit](https://github.com/jiaweizzhao/GaLore/blob/master/galore_torch/adamw8bit.py).
If I remember correctly, `tp_plan` is for the vLLM integration, where it is used for inference-time sharding. AutoTP was originally built for DeepSpeed's inference engine, which plays a similar role to vLLM. Then recently @inkcherry...
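For illustration, a hedged sketch of how `tp_plan` drives inference-time sharding on the transformers side (the checkpoint name is just an example; launching with `torchrun` across the target number of GPUs is assumed):

```python
# Sketch: loading a model with tp_plan="auto" so each rank holds a tensor-parallel shard.
# Run with e.g.: torchrun --nproc-per-node 4 this_script.py
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-3.1-8B-Instruct"  # example checkpoint
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    tp_plan="auto",  # use the model's built-in tensor-parallel plan for inference-time sharding
)
tok = AutoTokenizer.from_pretrained(model_id)

inputs = tok("Tensor parallelism shards each linear layer across ranks.", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=32)
print(tok.decode(out[0], skip_special_tokens=True))
```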
It seems that the compile wrapper was removed in DeepSpeed 0.14.4 (ref: https://github.com/microsoft/DeepSpeed/pull/5581). Is there any revamp of this PR going on?
@oraluben Thanks for opening another PR. I thought that this old PR had been merged, and I was searching for the line diffs that delete the ds_config-related code. It turns out that...