isky-cd

Results: 7 comments by isky-cd

`do_sample`, `top_p`, and `top_k` are fixed in this [PR](https://github.com/hpcaitech/ColossalAI/pull/5202).

+1, sglang==0.4.3 encountered the same issue with DeepSeek-R1 on 2×8 H200. Node 1: `python3 -m sglang.launch_server --model-path /data/deepseek-ai/DeepSeek-R1 --tp 16 --dist-init-addr :20000 --nnodes 2 --node-rank 0`...

This is an occasional error. After the worker node hangs for a while and then resumes normal operation, if it executes smoothly from there on, the services on the two...

> Hi [@isky-cd](https://github.com/isky-cd), [#3424](https://github.com/sgl-project/sglang/issues/3424) seems to be fixed by PR [#3709](https://github.com/sgl-project/sglang/pull/3709). Could you please pull the latest branch and see whether this bug can be solved?

Okay, I'll try...

We have not yet adapted ChatGLM, but we will adapt these general models in the future.

> Liger Kernel is a collection of Triton kernels designed specifically for LLM training. It can effectively increase multi-GPU training throughput by 20% and reduce memory usage by 60%. We...