TerminalLoggers.jl icon indicating copy to clipboard operation
TerminalLoggers.jl copied to clipboard

Sticky messages with subprocesses do not work well

Open tkf opened this issue 5 years ago • 6 comments

When writing https://github.com/JuliaCI/PkgBenchmark.jl/pull/104, I realized that sticky messages do not work well with sub-processes. An example:

using Logging
using ProgressLogging
using TerminalLoggers

with_logger(TerminalLogger()) do
    @progress for _ in 1:10
        code = """
        using Logging
        using ProgressLogging
        using TerminalLoggers
        with_logger(TerminalLogger()) do
            @info "From child process" getpid()
            @progress for _ in 1:10
                sleep(0.01)
            end
        end
        """
        run(`julia --startup-file=no -e $code`)
    end
end

But I'm not sure if it's probably impossible to fix this in TerminalLoggers.jl. Maybe implementing some workarounds using an environment variable could be helpful.

Ultimate solution may be to merge log records from subprocesses to the main process using IPC-based logger "bridge" (which is way outside of the scope of TerminalLoggers.jl, I guess, though).

tkf avatar Feb 24 '20 23:02 tkf

Ultimate solution may be to merge log records from subprocesses to the main process using IPC-based logger "bridge" (which is way outside of the scope of TerminalLoggers.jl, I guess, though).

Yes, this is ultimately what's required: the code passed to the subprocess would need to install a logger which forwards messages to the parent process, and the parent process would need to set up an async task to listen for those. It "might be nice" if we could make this work automatically when the child process is Base.julia_cmd(), but for generic subprocesses which "just happen" to call julia it seems difficult.

One thing which certainly seems realistic is to make this work automatically when using Distributed to spawn the child processes.

c42f avatar Feb 25 '20 00:02 c42f

Maybe implementing some workarounds using an environment variable could be helpful.

True, an environment variable could be used to define some "julia log server" context and that could be used to initialize the default global logger to something other than ConsoleLogger.

Hopefully getting the IPC right is not too much of a nightmare. For functionality on localhost (without authentication or encryption) I guess we could just use serialize and a socket. This seems very similar to the work already done in Distributed.LocalManager though, where the IPC side of things might already be sorted out.

So maybe getting it working with Distributed is the best thing to do first.

c42f avatar Feb 25 '20 03:02 c42f

It "might be nice" if we could make this work automatically when the child process is Base.julia_cmd(), but for generic subprocesses which "just happen" to call julia it seems difficult.

Yeah, I think that'd be super cool. I cannot decide if it is too much of a magic to happen automatically, though.

I agree adding a support for processes created via Distributed is the first step.

Maybe implementing some workarounds using an environment variable could be helpful.

True, an environment variable could be used to define some "julia log server" context

The workaround I was thinking was something stupid-simple: just use the "Sys.iswindows() code path" if the current process shares the same TTY with the parent process. But maybe it's worth doing this right from the getgo as the pitfall like the OP would probably be very rare anyway. Compared to that, logging inside Distributed is much more common. In fact, I already created something like this just for progress information in Transducers.jl to support progress logging in distributed reduction. I can throw it away if logging across workers just works.

tkf avatar Feb 25 '20 04:02 tkf

I cannot decide if it is too much of a magic to happen automatically, though.

Haha, I suspect it probably is :-) If log record serialization is done via serialize we preserve full Julia objects which is great. But we get some other tricky problems like making sure all processes have the same modules loaded. So making it just work outside Distributed might be a bit of a can of worms.

c42f avatar Feb 25 '20 05:02 c42f

Ah, yes, serialization is super tricky when two processes can activate different projects.

Hmm... I wonder if it's better to lower everything into (say) JSON-encodable primitive types (or msgpack/BSON/etc.) at each process. Probably there are already several language-agnostic logging frameworks that do this.

tkf avatar Feb 25 '20 06:02 tkf

I don't think there's a one-size-fits-all approach.

I suspect that for Distributed it would be fairly natural to use serialize because Distributed seems to target cases where all processes are part of the same application. Also for HPC workloads where the logs may be serialized to disk per node for performance reasons, but it's still conceptually a "single application".

For more heterogeneous distributed systems I think the JSON-or-similar approach will be needed. There's a whole lot of log servers for the receive side which already do this kind of thing, so we could choose one of the better ones and take some conventions from that.

c42f avatar Feb 25 '20 07:02 c42f