nanmi
### System Info NVIDIA A100 ### Who can help? @seanprime7 @Superjomn @aaronp24 @lukeyeager @aflat ### Information - [X] The official example scripts - [ ] My own modified scripts ###...
Exciting work, and I am very interested. Since my coding ability is weak, could you provide CUDA code for DCA? It would be greatly appreciated.
The ONNX model is very small, but Netron still warns that the model is too large when opening it, and it opens slowly.
### System Info GPU: A100 Mem: 1007G TRTLLM: v0.9.0 G++\NVCC When I build TensorRT-LLM, I get this warning: Function too large, generated debug information may not be accurate.
Question:
```shell
/usr/local/cuda/include/cub/agent/agent_batch_memcpy.cuh(896): error: expected an identifier
32
^
```
Code location:
```c++
constexpr uint32_t WARPS_PER_BLOCK = BLOCK_THREADS / CUB_PTX_WARP_THREADS;
```
I think the problem may be caused by the...