Yu Chin Fabian Lim comments

Results: 24 comments of Yu Chin Fabian Lim

[Model] Support Mamba2 (Codestral Mamba)

@yury-tokpanov No, I have not yet tried reproducing the benchmarks on vLLM. I will have to try it myself.

[Model] Support Mamba2 (Codestral Mamba)

@yury-tokpanov With @tlrmchlsmth's help, we have also verified the `gsm8k` number for Bamba against the published [benchmark](https://huggingface.co/blog/bamba). HF: `0.3662`, vLLM: `0.3700`.
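
As a quick sanity check on the two `gsm8k` scores quoted above, the gap can be computed directly. This is only an illustrative sketch: the variable names and the `TOLERANCE` value are assumptions for the example, not something stated in the thread.

```python
# gsm8k scores for Bamba as quoted in the comment above.
hf_score = 0.3662
vllm_score = 0.3700

# Hypothetical acceptance threshold; the thread does not state one.
TOLERANCE = 0.01

abs_diff = abs(vllm_score - hf_score)
rel_diff = abs_diff / hf_score
scores_match = abs_diff <= TOLERANCE

print(f"absolute difference: {abs_diff:.4f}")
print(f"relative difference: {rel_diff:.2%}")
print(f"within tolerance: {scores_match}")
```

A fixed absolute tolerance is the simplest choice here; a stricter comparison could instead use the standard error that the evaluation tool reports alongside each score.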

[Model] Support Mamba2 (Codestral Mamba)

@yury-tokpanov I can reproduce the `arc-challenge` results on bamba HF ``` 2025-01-29:12:20:32,138 INFO [evaluation_tracker.py:206] Saving results aggregated 2025-01-29:12:20:32,166 INFO [evaluation_tracker.py:287] Saving per-sample results for: arc_challenge hf (pretrained=ibm-fms/Bamba-9B,dtype=float16,trust_remote_code=True), gen_kwargs: (None), limit:...

[PP + EP][Master Thread] Enable Pipeline Parallelism (PP) and Expert Parallelism (EP)

I have some examples with a [MoE kernel](https://github.com/fabianlim/mamba/pull/1) and [EP](https://github.com/fabianlim/mamba/pull/2).