verl
How to debug Ray parallel execution
I am currently learning to use this library and am not familiar with Ray's parallel strategy. Is there a way to debug this process, for example by single-stepping through the large-model generation on a training batch, or otherwise stepping through the code while debugging?
You can try the Ray Distributed Debugger.
@PeterSH6, would you mind adding a "ray" label to this issue? I am triaging Ray-related issues in veRL. Thanks!
- Add a `breakpoint()` in your code where you want to stop execution.
- Run the code and it will pause at the breakpoint.
- Use `ray debug` in the terminal to enter the interactive PDB debugger.
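The steps above could look roughly like the sketch below. It is only an illustration: `generate_step` and `run_on_ray` are hypothetical names, the doubling logic stands in for a real generation step, and it assumes Ray is installed (`pip install ray`). Depending on the Ray version, the terminal-based `ray debug` workflow may need to be enabled explicitly (recent releases appear to require `RAY_DEBUG=legacy`), so check the debugger docs for your version.

```python
def generate_step(batch):
    """Hypothetical stand-in for one generation step on a training batch."""
    # Execution pauses here when a debugger is active. When this is called
    # as a plain Python function, setting PYTHONBREAKPOINT=0 in the
    # environment turns breakpoint() into a no-op.
    breakpoint()
    return [x * 2 for x in batch]


def run_on_ray(batch):
    """Run generate_step as a Ray task (assumes Ray is installed)."""
    import ray  # imported lazily so generate_step can be tested without Ray

    ray.init(ignore_reinit_error=True)
    remote_step = ray.remote(generate_step)
    # The task is expected to block at breakpoint(); attach to it from
    # another terminal with `ray debug`, then step with the usual PDB
    # commands (n, s, c, p <expr>).
    return ray.get(remote_step.remote(batch))
```

Calling `run_on_ray([1, 2, 3])` from a script and then running `ray debug` in a second terminal should list the active breakpoint and let you step through the task.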
How can I debug interactively with Ray in the terminal? Would `python -m pdb job.py` work?
You can try ray distributed debugger
This does not seem to work for me. Environment: VS Code/PyCharm + Ray debugger + remote SSH (Docker).
Have you found a solution? I have run into the same problem.
I gave up on PyCharm because of a firewall issue with my remote server, and I use VS Code now. PyCharm does not seem well suited to Ray development... But you can follow this question.
Nice reply, thanks. I just found that #1474 also discusses a related question. Maybe it helps.