SongChenchuan
Will you add a pluggable function for sorting queue priority? In my case, I need to check whether the queue has an emergency task that must be scheduled first.
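A minimal sketch of what such a hook could look like, assuming a heap-backed queue. `PluggableQueue`, `emergency_first`, and the `emergency` flag are hypothetical names for illustration, not part of the project's API:

```python
import heapq
import itertools
from typing import Any, Callable


class PluggableQueue:
    """Hypothetical queue whose ordering is decided by a user-supplied function."""

    def __init__(self, priority_fn: Callable[[Any], int]):
        self._priority_fn = priority_fn    # lower value = scheduled earlier
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps FIFO order within a priority

    def push(self, task: Any) -> None:
        # The counter is unique, so the task itself is never compared.
        entry = (self._priority_fn(task), next(self._counter), task)
        heapq.heappush(self._heap, entry)

    def pop(self) -> Any:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


def emergency_first(task: dict) -> int:
    """Assumed policy: tasks flagged as emergency sort ahead of everything else."""
    return 0 if task.get("emergency") else 1


queue = PluggableQueue(emergency_first)
queue.push({"name": "a"})
queue.push({"name": "b", "emergency": True})
queue.push({"name": "c"})

# Drain the queue; the emergency task comes out first, the rest in insertion order.
order = [queue.pop()["name"] for _ in range(len(queue))]
```

Passing the priority function in at construction time would let each user define their own ordering without changing the queue itself.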
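There is no error at all. It just hangs while downloading the song, and the up/down/left/right keys and Ctrl+C do not respond either.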
> I wanted to share that we've been running a build where we replaced the RateLimitingInterface with the default workqueue. We've seen a significant reduction in reconciliation delay, even without...
I have patched and tested this PR, but the "Watchdog caught collective operation timeout" problem still occurs. Here is the log output: ``` [rank4]:[E223 11:14:34.991762039 ProcessGroupNCCL.cpp:616] [Rank 4] Watchdog caught collective...
> @yizhang2077 @zhyncs I went back to the first commit and registered the pynccl allgather as a PyTorch custom op as you suggested. Ideally, it would be nice to get...
I have installed `datasets` and the issue still exists, so it does not seem to be a dependency problem.
> This is the issue in the latest Docker build. How can I enable subprocess logging or watch the subprocess log? Any tips are welcome.