duan

Results: 6 comments by duan

As of 2024, the docs still have not been updated...

Thank you for your response. I have confirmed the presence of a straggler issue. As illustrated in the attached images, the first GPU remains idle, waiting for the second GPU...
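As a rough illustration of the straggler effect described above, the sketch below computes how long each rank would sit idle at a synchronization point when one rank is slower. The function name `idle_times` and the timings are hypothetical, chosen for illustration; they are not measurements from this thread.

```python
def idle_times(step_ms):
    """Given each rank's compute time (ms) for one synchronized step,
    return how long each rank waits for the slowest rank."""
    slowest = max(step_ms)
    return [slowest - t for t in step_ms]


# Hypothetical per-rank compute times in milliseconds (not measured values).
per_rank_ms = [10, 14]

waits = idle_times(per_rank_ms)
# The straggler is the rank with the largest compute time.
straggler = max(range(len(per_rank_ms)), key=per_rank_ms.__getitem__)

print(f"straggler rank: {straggler}, idle time per rank (ms): {waits}")
```

With these assumed numbers, rank 0 finishes first and waits for rank 1, which matches the pattern of one GPU idling while the other catches up.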

> @duanzhaol I don't think you're using compilation, are you?

Yes, I haven't used compile in my process. Is compile a necessary step for tensor parallel? I think it should...

I opted not to use compilation because my objective is to use tensor parallelism on a serverless platform. The initial compilation process is significantly time-consuming, which becomes impractical in our...

> @duanzhaol Out of curiosity, what level of overhead is acceptable?

Maybe less than a second? In serverless, if the function is purely stateless, every request needs to recompile the...
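To make the recompilation concern concrete, here is a minimal break-even sketch. It assumes a one-time compile cost per function instance and a fixed time saved per request once compiled. The function name `compile_pays_off` and all numbers are hypothetical, not figures from this discussion.

```python
def compile_pays_off(compile_s, saved_per_request_s, requests_per_instance):
    """Return True if the total time saved across all requests served by
    one instance exceeds the one-time compile cost paid by that instance."""
    return requests_per_instance * saved_per_request_s > compile_s


# Hypothetical numbers: 60 s to compile, 50 ms saved per request.
# A purely stateless function serves one request per instance.
print(compile_pays_off(60.0, 0.05, requests_per_instance=1))
# A long-lived instance amortizes the same cost over many requests.
print(compile_pays_off(60.0, 0.05, requests_per_instance=2000))
```

Under these assumptions, compilation only pays off when an instance lives long enough to amortize the cost, which a purely stateless function never does.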

Thank you for the detailed response. After modifying the warm-up code, the load time has significantly improved. My machine configuration connects every two GPUs on a single PCIe link, and...