Srinivas Billa issues

Results: 35 issues of Srinivas Billa

Mixtral MoE support?

# Prerequisites Please answer the following questions for yourself before submitting an issue. - [x] I am running the latest code. Development is very rapid so there are no tagged...

enhancement

[Feature Request] Optimized quantised kernels

https://github.com/IST-DASLab/marlin

Not able to run AWQ Mixtral on 4xA10

Hi, I'm trying to run the AWQ version of Mixtral on 4xA10s. However, I'm getting this error. I've also tried with `--mem-frac 0.7` and still got the same error. Model...

bug

Cogvlm reference in blog.

Hi, just wanted to check: isn't the CogVLM model actually 17B params, not 30? Thanks

Rlhf data collection feature

Is it possible to add a way to generate multiple drafts for a given input and then, based on what the user picks, save that data so that it can...

enhancement

front

back

How to apply LoRA for TTS models

Hi, I've been trying to apply LoRA to the VITS model (hence the pull request for the conv1d). It turns out that just using LoRA for the text encoder transformer isn't enough,...

Galore + Lora?

Hi, sorry if this is a stupid question, but is it possible to use the 8-bit GaLore optimiser in combination with LoRA adapters? Thanks

[Feature Request] DDP

I wanted to open an issue for this instead of constantly asking on Discord. I saw the other ticket for multi-GPU fp16 training, which is also nice, but DDP would let...

currently fixing

PowerInfer : using a combination of cpu and gpu for faster Inference

Splitting hot and cold neurons across CPU and GPU allows faster inference when using larger models/higher quantisations. The demo shows an 11x speedup over llama.cpp when using a 40B on a single...

feature request

Packing without cross contamination

Related to #1194: using packing deteriorated performance, as the samples in my dataset are not independent, and the correlation might have caused the issue. However, packing did help my training...

enhancement

help wanted

good second issue