prefect
prefect copied to clipboard
Suppress "PID ..." log lines?
Expectation / Proposal
We're using the ShellOperation
class from this library for all of our Prefect tasks. It works great. However, some of our team would like to suppress the log lines like PID <some pid> stream output:
. Is there any way to suppress those lines in prefect logs? See screenshot below.
Perhaps there’s an env var that can be set? Perhaps it can be achieved via logging config?
If this is the wrong place for this ticket let me know!
Traceback / Example
- [ ] I would like to help contribute a pull request to resolve this!
@sckott do you just mean toggling stream_output
to False
? https://github.com/PrefectHQ/prefect-shell/blob/main/prefect_shell/commands.py#L234
Thanks @madkinsz - That does suppress the "PID ..." lines, but also all logs I think from the task. Here's I think a reproducible example:
test.py
from prefect import flow, task
from prefect_shell import ShellOperation
@task
def my_task(x, stream=True):
return ShellOperation(
commands=[
f"echo the number is {x}",
f"echo I have {x} apples",
],
stream_output=stream
).run()
@flow(name="Test")
def test():
a = my_task(x=5, stream=True)
b = my_task(x=7, stream=False)
return a, b
if __name__ == "__main__":
test()
Running python3.11 -m test
gives me:
08:59:12.165 | INFO | prefect.engine - Created flow run 'ultra-crab' for flow 'Test'
08:59:12.288 | INFO | Flow run 'ultra-crab' - Created task run 'my_task-0' for task 'my_task'
08:59:12.289 | INFO | Flow run 'ultra-crab' - Executing 'my_task-0' immediately...
08:59:12.310 | INFO | Task run 'my_task-0' - PID 92133 triggered with 2 commands running inside the '.' directory.
08:59:12.313 | INFO | Task run 'my_task-0' - PID 92133 stream output:
the number is 5
I have 5 apples
08:59:12.313 | INFO | Task run 'my_task-0' - PID 92133 completed with return code 0.
08:59:12.324 | INFO | Task run 'my_task-0' - Finished in state Completed()
08:59:12.337 | INFO | Flow run 'ultra-crab' - Created task run 'my_task-1' for task 'my_task'
08:59:12.338 | INFO | Flow run 'ultra-crab' - Executing 'my_task-1' immediately...
08:59:12.356 | INFO | Task run 'my_task-1' - PID 92134 triggered with 2 commands running inside the '.' directory.
08:59:12.358 | INFO | Task run 'my_task-1' - PID 92134 completed with return code 0.
08:59:12.368 | INFO | Task run 'my_task-1' - Finished in state Completed()
08:59:12.377 | INFO | Flow run 'ultra-crab' - Finished in state Completed()
The behavior I'm looking for is to see the logs from the echo
statements still, like:
08:59:12.165 | INFO | prefect.engine - Created flow run 'ultra-crab' for flow 'Test'
08:59:12.288 | INFO | Flow run 'ultra-crab' - Created task run 'my_task-0' for task 'my_task'
08:59:12.289 | INFO | Flow run 'ultra-crab' - Executing 'my_task-0' immediately...
08:59:12.310 | INFO | Task run 'my_task-0' - PID 92133 triggered with 2 commands running inside the '.' directory.
08:59:12.313 | INFO | Task run 'my_task-0' - PID 92133 stream output:
the number is 5
I have 5 apples
08:59:12.313 | INFO | Task run 'my_task-0' - PID 92133 completed with return code 0.
08:59:12.324 | INFO | Task run 'my_task-0' - Finished in state Completed()
08:59:12.337 | INFO | Flow run 'ultra-crab' - Created task run 'my_task-1' for task 'my_task'
08:59:12.338 | INFO | Flow run 'ultra-crab' - Executing 'my_task-1' immediately...
08:59:12.356 | INFO | Task run 'my_task-1' - PID 92134 triggered with 2 commands running inside the '.' directory.
the number is 7
I have 7 apples
08:59:12.358 | INFO | Task run 'my_task-1' - PID 92134 completed with return code 0.
08:59:12.368 | INFO | Task run 'my_task-1' - Finished in state Completed()
08:59:12.377 | INFO | Flow run 'ultra-crab' - Finished in state Completed()
I see so you're just interested in dropping the "PID ... stream output:" prefix from the logs?
Yes, just to make the logs less "noisy". Perhaps it can be done via the logging config file?
It's hard-coded at https://github.com/PrefectHQ/prefect-shell/blob/main/prefect_shell/commands.py#L160
I'd review a pull request making that prefix optional. Perhaps stream_output_include_pid: bool = True
. Another option is to get a child logger for each stream output operation that includes the PID e.g.
from logging import getLogger
logger = getLogger("test")
pid = 123
child = logger.getChild(str(pid))
print(child.name)
I think that would be a cleaner solution; this kind of information probably doesn't belong in the message itself?
I don't follow what you mean by the child logger. Is the idea there to make a custom logger which then we can strip the pid line from?
@sckott if we use a child logger with the PID in the name (in the code I linked) we can drop the prefix entirely as the information would be encoded in the logger name.
Are you proposing this as something a user like me can do in my own code, or a change one could make to prefect-shell
?
So whatever change you are suggesting, what would some example logs look like after the change?
A change in prefect-shell
. The logs would just include the output from the command. The pid would be included in the logger name.
So then the PID would not appear in the logs, but would be available via getLogger
?
Perhaps this should be toggleable, though I realize that's more complexity to maintain. I think it's likely to be useful to have the PID in the logs for debugging, though I haven't used it that way yet. My worklow thus far has been finding the docker container or process by the name of the scripts that we call via ShellOperation
, but it would be faster to know the exact PID
The PID would appear in local logs, ~but the logger name is not exposed in the UI at this time~ we're adding logger names in the UI in https://github.com/PrefectHQ/prefect-ui-library/pull/1203
Should go out in the next release @madkinsz @sckott (sooner for Cloud folks)
Hey @sckott! I've implemented using a child logger to track PIDs per @madkinsz's suggestion in PrefectHQ/prefect-shell#77. Can you install that branch and run some flows to see if it resolves your issue?
Thanks @desertaxle !
Can you install that branch and run some flows to see if it resolves your issue?
I assume I'll only be able to control what I see in local logs, yes? Just checking.
There isn't any extra control, but the process PID has been moved to a child logger. You should be able to still see the PID without it cluttering your log output.
Using the example task/flow file above https://github.com/PrefectHQ/prefect/issues/13079
pip install git+https://github.com/PrefectHQ/prefect-shell.git@pid-child-logger
python3.11 -m test2
File "/opt/homebrew/lib/python3.11/site-packages/prefect_shell/commands.py", line 402, in run
pid_logger = self.logger.getChild(str(process.pid))
AttributeError: 'PrefectLogAdapter' object has no attribute 'getChild'
Go back to the version of prefect-shell
on pypi
pip uninstall prefect-shell
pip install prefect-shell
python3.11 -m test2
10:25:56.785 | INFO | prefect.engine - Created flow run 'diamond-asp' for flow 'Test'
10:25:56.907 | INFO | Flow run 'diamond-asp' - Created task run 'my_task-0' for task 'my_task'
10:25:56.908 | INFO | Flow run 'diamond-asp' - Executing 'my_task-0' immediately...
10:25:56.928 | INFO | Task run 'my_task-0' - PID 27982 triggered with 2 commands running inside the '.' directory.
10:25:56.932 | INFO | Task run 'my_task-0' - PID 27982 stream output:
the number is 5
I have 5 apples
10:25:56.933 | INFO | Task run 'my_task-0' - PID 27982 completed with return code 0.
10:25:56.947 | INFO | Task run 'my_task-0' - Finished in state Completed()
10:25:56.959 | INFO | Flow run 'diamond-asp' - Created task run 'my_task-1' for task 'my_task'
10:25:56.959 | INFO | Flow run 'diamond-asp' - Executing 'my_task-1' immediately...
10:25:56.980 | INFO | Task run 'my_task-1' - PID 27983 triggered with 2 commands running inside the '.' directory.
10:25:56.981 | INFO | Task run 'my_task-1' - PID 27983 completed with return code 0.
10:25:56.992 | INFO | Task run 'my_task-1' - Finished in state Completed()
10:25:57.002 | INFO | Flow run 'diamond-asp' - Finished in state Completed()
I'm running prefect v2.8.3