Hejon Zhang
May I ask how you finally solved this error?
I also want to generate context logits efficiently. Can I turn off the KV cache and set `kv_cache_free_gpu_mem_fraction` to a minimal value?
[Sign-up]: 36-37, 40-42, 48-49, 52-53, 55-56, 65-66
You can solve this problem with symlinks like the following: `ln -s /usr/include/nccl.h /usr/local/cuda/include/nccl.h` `ln -s /usr/lib/x86_64-linux-gnu/libnccl.so.2.19.3 /usr/local/cuda/lib64/libnccl.so.2.19.3` `ln -s /usr/lib/x86_64-linux-gnu/libnccl.so.2 /usr/local/cuda/lib64/libnccl.so.2`
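If the links may already exist, or the NCCL files sit somewhere else on your machine, a small guard avoids `ln` errors. This is only a sketch: the helper name `link_if_missing` is hypothetical, and the paths and the `2.19.3` version come from the commands above and have to match what is actually installed.

```shell
# Hypothetical helper (not part of NCCL or CUDA): create a symlink only when
# the source file exists and nothing is already present at the destination.
link_if_missing() {
    src="$1"
    dst="$2"
    if [ ! -e "$src" ]; then
        # Source is missing: report it and signal failure to the caller.
        echo "skip: $src not found" >&2
        return 1
    fi
    if [ -e "$dst" ] || [ -L "$dst" ]; then
        # Destination (or a dangling link) is already there: leave it alone.
        echo "skip: $dst already exists" >&2
        return 0
    fi
    ln -s "$src" "$dst"
}
```

For example, `link_if_missing /usr/include/nccl.h /usr/local/cuda/include/nccl.h`, and likewise for the two `libnccl.so` files. Writing under `/usr/local/cuda` usually requires root.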
> > attachment_ep_statistics.zip > > [@fzyzcjy](https://github.com/fzyzcjy) May I know where to download attachment_ep_statistics.zip? I would like to have a quick try. Besides, do these ep statistics suit different nodes...
> [@fzyzcjy](https://github.com/fzyzcjy) Could you please help me check what the problem is? Thanks in advance. > > I successfully completed the first 3 steps: **step1 on node1...
How did you finally solve this problem?