yury-tokpanov
Is it still under review? I don't see it listed in the documentation: https://www.backblaze.com/docs/cloud-storage-command-line-tools#usage But it's there if I do `b2 --help`.
Thanks! You should probably put a link to that doc on your website as well.
@tlrmchlsmth thank you very much for your work! If I may ask, do you have any updates since your last post?
@tlrmchlsmth Thanks for the update! I work at Zyphra, and we are interested in incorporating our Zamba2 model into vLLM (#9382). I'm using your PR as a starting point, since...
@tlrmchlsmth @fabianlim thanks for all your work! I have our internal implementation of Zamba2 based on a previous version of this PR. I'm going to rebase it. Would you recommend using...
I am unable to reproduce eval results for our Zamba2 model with lm_eval, both for some loglikelihood tasks (winogrande, arc tasks) and for generation tasks (like gsm8k), while some loglikelihood tasks...
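For context, this is roughly the eval setup I'm running, via lm_eval's Python API; the model id and task list below are illustrative placeholders, not my exact configuration:

```python
# Minimal sketch of the eval run; model id and tasks are illustrative.
import lm_eval

results = lm_eval.simple_evaluate(
    model="vllm",
    model_args="pretrained=Zyphra/Zamba2-7B,tensor_parallel_size=1",
    tasks=["winogrande", "arc_challenge", "gsm8k"],  # loglikelihood + generation
)
print(results["results"])
```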
The computation of gated RMS norm depends on the number of Mamba2 groups: https://github.com/state-spaces/mamba/blob/0cce0fa645f100f00620ddf2333c2b7712abfdec/mamba_ssm/ops/triton/layernorm_gated.py#L32 . Our 7B model has 2 groups, so this definitely affects our results. A sketch of the grouped computation is below. I'm still chasing other...
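For anyone hitting the same thing, here's a minimal PyTorch reference of what the grouped gated RMS norm computes (assuming the Mamba2 convention of gating before normalizing, i.e. `norm_before_gate=False`); the function name is mine, and this is a plain-eager sketch, not the Triton kernel linked above:

```python
import torch
import torch.nn.functional as F

def gated_rmsnorm_ref(x, z, weight, ngroups=1, eps=1e-5):
    # Gate before normalizing (norm_before_gate=False, as in Mamba2).
    x = x * F.silu(z)
    *batch, d = x.shape
    assert d % ngroups == 0, "hidden dim must be divisible by ngroups"
    xg = x.view(*batch, ngroups, d // ngroups)
    # RMS statistics are computed per group; with ngroups=1 this reduces
    # to ordinary RMSNorm, which is why a 2-group model diverges if the
    # grouping is ignored.
    rstd = torch.rsqrt(xg.pow(2).mean(dim=-1, keepdim=True) + eps)
    return (xg * rstd).reshape(*batch, d) * weight
```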
After fixing gated RMS norm, I was able to match gsm8k results for our 7B model. I still see numbers for some tasks coming in lower for some reason, so going to...
> @yury-tokpanov could you share what you did to fix gated rms norm? I don't see n_groups being handled in zamba here https://github.com/huggingface/transformers/blob/main/src/transformers/models/zamba2/modeling_zamba2.py#L64-L79

We have a new PR in transformers...
UPDATE: this is no longer an issue when using upstream vLLM. ~~I rebased using the latest version of this PR, and now I'm getting this error from `torch.ops._vllm_fa2_C.varlen_fwd()` in `vllm/vllm_flash_attn/flash_attn_interface.py:173` even though...