feat: add streaming output for dp train
During the training step there is no output in the log files; the output only becomes visible after training is completed. Since training normally lasts for days, streaming output is necessary for users to examine its status.
This PR suggests one way to achieve this; however, a better approach would be to implement it directly in pydflow.
Summary by CodeRabbit
- New Features
  - Live streaming of training output, providing real-time visibility in the console and continuous writing to train.log.
- Bug Fixes
  - Eliminated duplicate training log entries by consolidating output handling, resulting in cleaner logs without redundancy.
  - Maintains consistent post-training ("freeze") logging behavior.
📝 Walkthrough
Introduces a new run_command_streaming utility for real-time stdout/stderr streaming and log file writing, and updates the training step in dpgen2/op/run_dp_train.py to use it with train.log. Previous in-memory logging for the train step is removed; post-train “freeze” continues using existing logging.
Changes
| Cohort / File(s) | Summary |
|---|---|
| **Training op: stream logging for train step**<br/>`dpgen2/op/run_dp_train.py` | Replaces `run_command` with `run_command_streaming(..., log_file="train.log")` for the training phase; suppresses explicit stdout/stderr writes to `fplog` for train; retains existing freeze-step logging to `fplog`. |
| **Utilities: new streaming runner**<br/>`dpgen2/utils/run_command.py` | Adds `run_command_streaming(cmd, shell=False, log_file=None)`, which executes a subprocess with concurrent stdout/stderr streaming to the terminal and an optional log file, using threads; returns `(exit_code, stdout, stderr)`. Existing `run_command` is unchanged. |
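A minimal sketch of what such a streaming runner could look like, based only on the signature and behavior described in the walkthrough (`run_command_streaming(cmd, shell=False, log_file=None)` returning `(exit_code, stdout, stderr)`); the actual implementation in `dpgen2/utils/run_command.py` may differ in its details:

```python
import subprocess
import threading

def run_command_streaming(cmd, shell=False, log_file=None):
    """Run cmd, echoing stdout/stderr line by line to the terminal
    (and optionally appending to log_file), while also capturing them.
    Returns (exit_code, stdout, stderr)."""
    proc = subprocess.Popen(
        cmd,
        shell=shell,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
        bufsize=1,  # line-buffered in text mode
    )
    captured = {"stdout": [], "stderr": []}
    log_fp = open(log_file, "a") if log_file else None
    lock = threading.Lock()  # serialize writes to terminal/log

    def reader(stream, key):
        # Drain one pipe line by line; echo and capture concurrently.
        for line in stream:
            captured[key].append(line)
            with lock:
                print(line, end="")
                if log_fp is not None:
                    log_fp.write(line)
                    log_fp.flush()  # make the log readable while running

    threads = [
        threading.Thread(target=reader, args=(proc.stdout, "stdout")),
        threading.Thread(target=reader, args=(proc.stderr, "stderr")),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    code = proc.wait()
    if log_fp is not None:
        log_fp.close()
    return code, "".join(captured["stdout"]), "".join(captured["stderr"])
```

Two reader threads are needed because reading stdout and stderr sequentially from one process can deadlock once either pipe's buffer fills.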
Sequence Diagram(s)
```mermaid
sequenceDiagram
    autonumber
    participant Op as run_dp_train.py
    participant RC as run_command_streaming
    participant Sh as Subprocess
    participant Log as train.log (optional)
    Op->>RC: invoke(cmd, shell=False, log_file="train.log")
    activate RC
    RC->>Sh: Popen(cmd, pipes, line-buffered)
    par Read stdout
        RC->>Sh: spawn stdout reader thread
        loop lines
            Sh-->>RC: stdout line
            RC-->>Op: stream to terminal
            RC-->>Log: append line
        end
    and Read stderr
        RC->>Sh: spawn stderr reader thread
        loop lines
            Sh-->>RC: stderr line
            RC-->>Op: stream to terminal
            RC-->>Log: append line
        end
    end
    Sh-->>RC: exit code
    RC-->>Op: (code, stdout_str, stderr_str)
    deactivate RC
    note over Op: No in-memory fplog write for train step
```
```mermaid
sequenceDiagram
    autonumber
    participant Op as run_dp_train.py
    participant RCs as run_command_streaming (train)
    participant RC as run_command (freeze)
    participant Log as fplog
    Op->>RCs: Train (streamed to train.log)
    note right of RCs: Output handled by streaming<br/>No fplog writes for train
    Op->>RC: Freeze (non-streaming)
    RC-->>Op: stdout/stderr captured
    Op-->>Log: write freeze stdout/stderr to fplog
```
Estimated code review effort
🎯 3 (Moderate) | ⏱️ ~25 minutes
The option `print_oe` of `dflow.utils.run_command` does exactly the same thing by virtue of selectors. You only need to add an argument to dflow's `run_command`. Refer to https://github.com/deepmodeling/dflow/blob/48c24cc4f494acb5a12c8d99293f1156c31342ad/src/dflow/utils.py#L657. Besides, I think we should provide an option rather than change the default behavior (which is silent except for errors).
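To illustrate the selectors-based approach the comment refers to, here is a simplified, POSIX-only sketch (it is not dflow's actual code; only the existence of `print_oe` on `dflow.utils.run_command` is confirmed by the discussion). A single thread multiplexes both pipes instead of spawning reader threads:

```python
import selectors
import subprocess

def stream_with_selectors(cmd):
    """Run cmd, echoing stdout/stderr as they arrive via selectors.
    Returns (exit_code, stdout, stderr). Simplified illustration:
    assumes line-oriented output on a POSIX system."""
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )
    sel = selectors.DefaultSelector()
    sel.register(proc.stdout, selectors.EVENT_READ, "stdout")
    sel.register(proc.stderr, selectors.EVENT_READ, "stderr")
    out, err = [], []
    open_streams = 2
    while open_streams:
        for key, _ in sel.select():
            line = key.fileobj.readline()
            if not line:  # EOF on this stream
                sel.unregister(key.fileobj)
                open_streams -= 1
                continue
            print(line, end="")  # echo while the command runs
            (out if key.data == "stdout" else err).append(line)
    return proc.wait(), "".join(out), "".join(err)
```

This avoids threads entirely, which is why a single `print_oe`-style flag on the existing function can enable streaming without restructuring the runner.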
Thanks. Is there any way to enable this via the dpgen2 configuration JSON?
It is not currently supported. You could add an argument in the config JSON to control the output.
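A hypothetical shape for such a config switch (the key name `stream_output` and its placement are invented here for illustration; no such option exists yet):

```json
{
    "train": {
        "config": {
            "command": "dp",
            "stream_output": true
        }
    }
}
```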