xwuShirley

Results 10 comments of xwuShirley

> KeyError: 'class_name'

Hey, @millermuttu did you solve your problem? Could you share it here? I have the same problem for Omniglot.

Oops, never mind. @ChenjieSun Thanks for the solution

It seems the current update https://github.com/pytorch/examples/blob/main/imagenet/main.py#L424 :( falls back to the wrong version. I was trying to fix this bug and fortunately found this. It would be great if it were updated...

It looks like the authors have not yet released the distillation scripts, right?

@wojtek11530 have you figured it out?

@pavanimajety I think after loading the model, it gives random output. For example, something like `okyrokiryLISTramaä»— bioactiveHelpers Lunilan`

@pavanimajety Thanks so much for the effort. I have tried the PR. I think I am still getting weird output. This time it is just "" (empty space). On the other hand, it...

> @xwuShirley Could you share your run commands?
>
> This one works for me (I agree it is limited in support) -

Thanks! Let me try your script. I...

```
trtllm-serve nvidia/DeepSeek-R1-FP4 \
  --max_batch_size 256 --max_num_tokens 32768 \
  --max_seq_len 32768 --kv_cache_free_gpu_memory_fraction 0.95 \
  --host 0.0.0.0 --port 30001 --trust_remote_code --backend pytorch --tp_size 8 --ep_size 8
```

It seems for B200,...

@Edwardf0t1 You uploaded the model weights at https://huggingface.co/nvidia/DeepSeek-R1-FP4/tree/main. Do you happen to know the deployment configuration for trtllm-serve? Thank you :)

It has to do with nvfp4moe gemm