Yinghai Lu

38 comments by Yinghai Lu

Pushed a change to keep the behavior unchanged on Windows.

If it's Windows, then I don't know how to solve it, but the fix may follow the same line of thinking.

How does `run_rpc` work if we want to broadcast this to each engine and run it exactly once? How do we guarantee that each engine core runs it in lock step...

There isn't a lot of work in apiserver that needs multiprocessing, right? It's mostly async_llm, more specifically MM data handling, that needs to scale out?

Trying to understand the nature of the issue: it looks like the new transformers version already has the Qwen2VL image processors.

```
ValueError: '' is already used by a...
```

I had the same issue. Client:

```
KINETO_USE_DAEMON=1 KINETO_DAEMON_INIT_DELAY_S=3 python scripts/pytorch/linear_model_example.py
INFO:2025-01-29 01:45:27 43221:43222 init.cpp:135] Registering daemon config loader, cpuOnly = 0
ERROR: External init callback must run in same...
```

```
Initiating on-demand GPU profiling for job ID 0, pids [0]
```

This seems pretty weird?

More info. When I start the script without `KINETO_DAEMON_INIT_DELAY_S=3`, on the dyno_log server side I can see that the script PID is registered:

```
I20250129 04:42:10.183240 43204 LibkinetoConfigManager.cpp:194] Registered process...
```

On the client side, when I don't use `KINETO_DAEMON_INIT_DELAY_S=3`, I see a bunch of kineto logs being printed out:

```
INFO:2025-01-29 05:13:33 44126:44126 init.cpp:135] Registering daemon config loader, cpuOnly =...
```

Update: I was able to build pytorch/kineto and play around. So far I think

```
ERROR: External init callback must run in same thread as registerClient (1258501696 != 107225088)
```

...
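The kind of thread mismatch that error message describes can be illustrated with a minimal sketch. This is not Kineto's code: the `Client` class, its method names, and the order of the two IDs in the message are all assumptions made for illustration. It only shows how a callback pinned to the registering thread fails when invoked from a different one.

```python
import threading


class Client:
    """Hypothetical stand-in for a client whose init callback is pinned to one thread."""

    def __init__(self):
        self._owner = None

    def register_client(self):
        # Remember which thread performed the registration.
        self._owner = threading.get_ident()

    def external_init_callback(self):
        # Reject calls that arrive on any thread other than the registering one.
        caller = threading.get_ident()
        if caller != self._owner:
            raise RuntimeError(
                "External init callback must run in same thread as "
                f"registerClient ({caller} != {self._owner})"
            )
        return "initialized"


def run_in_thread(fn):
    """Run fn in a new thread and return (result, error)."""
    outcome = {}

    def target():
        try:
            outcome["result"] = fn()
        except RuntimeError as exc:
            outcome["error"] = exc

    t = threading.Thread(target=target)
    t.start()
    t.join()
    return outcome.get("result"), outcome.get("error")


client = Client()
client.register_client()

# Same thread as registration: the callback succeeds.
same_thread_result = client.external_init_callback()

# Different thread: the identity check fails and the error is captured.
other_result, other_error = run_in_thread(client.external_init_callback)
```

If the real failure follows this pattern, a delayed init (such as one triggered by `KINETO_DAEMON_INIT_DELAY_S`) that runs on a helper thread would trip the check, but that is a guess based only on the message text.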