guyueh1 comments

Results: 29 comments of guyueh1

`faiss` fails to import on python==3.12 because of deprecated `numpy.distutils`

I have the same issue when trying to use faiss on an ARM Linux platform with numpy==1.26.0. I think the problematic import `import numpy.distutils.cpuinfo` only happens on this platform (aarch64,...
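For illustration, here is a minimal sketch of how one could check whether `numpy.distutils` is importable before relying on it. This is not the fix adopted by faiss; the helper name `has_module` is hypothetical, and it uses only the standard library.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if `name` can be located without importing the leaf module.

    Hypothetical helper: for a dotted name, find_spec imports the parent
    package first, so a missing parent raises ImportError, handled here.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


if __name__ == "__main__":
    # On Python 3.12, where numpy.distutils has been removed, this is
    # expected to report False; the result depends on the local install.
    print("numpy.distutils available:", has_module("numpy.distutils"))
```

A caller could branch on this check and fall back to a different CPU-feature probe when the module is missing.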

chore: Bump vllm to 0.11.2, torch to 2.9, transformers to 4.57.1

@yfw we need to update the torch version in `tools/build-custom-vllm.sh` as well

70B GRPO training is slower than reported

@butsugiri Hi, sorry for the delay. I built the container from the Dockerfile you provided and ran the test again; I can still reproduce our results in the blog....

Sequence parallelism is not supported for NemotronHForCausalLM

@joyang-nv can you confirm if this is a current limitation of dtensor?

system oom with qwen 235b

@ZhiyuLi-Nvidia I think this error is caused by a CPU memory leak, not a GPU memory issue. The leak seems to recur every ~300 steps, and it is hard to debug with limited...

feat: KV cache quantization support in fp8 rollout in GRPO

@zpqiu sorry for the long delay, I have left some comments; could you first merge in main and then address them? I think this solution can be further optimized in...

feat: KV cache quantization support in fp8 rollout in GRPO

@zpqiu can you fix the functional test failure? Also, I think the L1 functionality is run on Ampere GPUs, so you may need to conditionally skip it for CUDA architectures before sm_90.
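A rough sketch of what such a conditional skip could look like, assuming the sm_90 threshold mentioned in the comment above. The helper names are hypothetical and not taken from the repository. `torch.cuda.get_device_capability()` is the standard PyTorch call that returns `(major, minor)`.

```python
from typing import Optional, Tuple

# Assumed threshold, taken from the comment: skip on anything before sm_90.
MIN_CAPABILITY: Tuple[int, int] = (9, 0)


def is_before_sm90(capability: Tuple[int, int]) -> bool:
    """Return True if (major, minor) is older than sm_90."""
    return tuple(capability) < MIN_CAPABILITY


def current_capability() -> Optional[Tuple[int, int]]:
    """Return the compute capability of the current GPU, or None if unavailable."""
    try:
        import torch  # imported lazily so this module loads without PyTorch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    major, minor = torch.cuda.get_device_capability()
    return (major, minor)


def should_skip_fp8_test() -> bool:
    """Hypothetical skip condition: no GPU, or a GPU older than sm_90."""
    cap = current_capability()
    return cap is None or is_before_sm90(cap)
```

With pytest, the condition could be attached to a test as `@pytest.mark.skipif(should_skip_fp8_test(), reason="requires sm_90 or newer")`.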

feat: KV cache quantization support in fp8 rollout in GRPO

> > @zpqiu can you fix the functional test failure? > > Also I think the L1 functionality is run on Ampere GPUs, maybe you need to conditionally skip for...

feat: KV cache quantization support in fp8 rollout in GRPO

@terrykong please review

feat: KV cache quantization support in fp8 rollout in GRPO

@terrykong this is the last FP8 functionality we want to merge before v0.5. After this, I want to refactor the code to make it cleaner and more structured....