mamba Logging callbacks have too much overhead on libmambapy

Troubleshooting docs

[X] My problem is not solved in the Troubleshooting docs

Anaconda default channels

[X] I do NOT use the Anaconda default channels (pkgs/* etc.)

How did you install Mamba?

Mambaforge or latest Miniforge

Search tried in issue tracker

logging debug

Latest version of Mamba

[X] My problem is not solved with the latest version

Tried in Conda?

Not applicable

Describe your issue

2.x API includes logging callbacks so the information reported by libsolv or libmamba can be routed through a Python callable via libmambapy.solver.libsolv.Database.set_logger(). However, this becomes very slow when the libsolv throughput is too large (e.g. verbosity=DEBUG). It causes a slowdown so large that the solver takes minutes to finish.

Are there any ways to have a less resource consuming way? I don't need fancy formatting or full control, just the ability to decide when to send different levels to stdout based on the verbosity chosen by the user.

mamba info / micromamba info

This is with libmambapy 2.0.0rc3

Logs

N/A

environment.yml

N/A

~/.condarc

N/A

Aug 28 '24 07:08 jaimergp

Thank you for the report.

We have observed a tangential issue for some environments whose resolution via libsolv was hanging due to the logging system. Both issue have the same root cause: all of the possible logs of libsolv are emitted by default as we set log_level to std::numeric_limits<int>::max().

See: https://github.com/openSUSE/libsolv/blob/27aa6a72c7db73d78aa711ae412231768e77c9e0/src/pool.c#L1623-L1637

https://github.com/mamba-org/mamba/blob/08d7b4e9cb9b168d05f3bdc57c0708ee80d5a1c9/libmamba/src/solver/libsolv/database.cpp#L90-L108

https://github.com/mamba-org/mamba/blob/08d7b4e9cb9b168d05f3bdc57c0708ee80d5a1c9/libmamba/src/solver/libsolv/database.cpp#L110-L127

https://github.com/mamba-org/mamba/blob/154012f256d32bee1ae64f5b11d694cb45d75cef/libmamba/src/core/package_database_loader.cpp#L31-L53

An option can be for (release) builds not to emit debug logs unless if explicitly asked by users (with verbosity=DEBUG as you propose).

Aug 28 '24 09:08 jjerphan

#3416 addresses the performance regression. We think that the handling of log levels and of verbosity better be implemented in another PR.

Aug 28 '24 15:08 jjerphan

@jaimergp: can you report if you have observed improvements with 2.0.0rc4?

Sep 10 '24 08:09 jjerphan

@jaimergp: can you report if you have observed improvements with 2.0.0rc4?

In the sense that there are no performance overheads, yes. But I can't seem to be able to invoke obtain the libsolv logs anymore.

Isn't this sufficient?

    libmamba_context = libmambapy.Context(
        libmambapy.ContextOptions(
            enable_signal_handling=False,
            enable_logging=True,
        )
    )
    # Output params
    libmamba_context.output_params.json = context.json
    if libmamba_context.output_params.json:
        libmambapy.cancel_json_output(libmamba_context)
    libmamba_context.output_params.quiet = context.quiet
    libmamba_context.set_log_level(
        {
            4: libmambapy.LogLevel.TRACE,
            3: libmambapy.LogLevel.DEBUG,
            2: libmambapy.LogLevel.INFO,
            1: libmambapy.LogLevel.WARNING,
            0: libmambapy.LogLevel.ERROR,
        }[context.verbosity]
    )

This ^ correctly gives the libmamba logs, but not for libsolv. I thought I needed to call set_logger() on the Database instances:

        params = ChannelResolveParams(
            platforms=set(self.subdirs),
            channel_alias=CondaURL.parse(str(context.channel_alias)),
            custom_channels=ChannelResolveParams.ChannelMap(custom_channels),
            custom_multichannels=ChannelResolveParams.MultiChannelMap(custom_multichannels),
            home_dir=str(Path.home()),
            current_working_dir=os.getcwd(),
        )
        db = Database(params)
        db.set_logger(logger_callback)

with logger_callback():

def logger_callback(level: libmambapy.solver.libsolv.LogLevel, msg: str, logger=_db_log):
    # from libmambapy.solver.libsolv import LogLevel
    # levels = {
    #     LogLevel.Debug: logging.DEBUG, # 0 -> 10
    #     LogLevel.Warning: logging.WARNING, # 1 -> 30
    #     LogLevel.Error: logging.ERROR, # 2 -> 40
    #     LogLevel.Fatal: logging.FATAL, # 3 -> 50
    # }
    if level.value == 0:
        # This incurs a large performance hit!
        logger.debug(msg)
    else:
        logger.log((level.value + 2) * 10, msg)

But no luck?

Sep 21 '24 10:09 jaimergp

Yes, we entirely disabled libsolv's logs for now. We need to think about how to properly handle them as well as alternative solvers'.

Feel free to open a PR to propose a solution.

Sep 21 '24 10:09 jjerphan

Would it be reasonable to let libsolv print to stderr directly in debug level?

Sep 21 '24 10:09 jaimergp

Probably.

Edit: I do not have bandwidth to properly think about it now.

Sep 21 '24 11:09 jjerphan

Not being able to get solver logs from python is quite inconvenient. I had a case this morning where it would have saved me time to have at least some kind of logs.

Dec 19 '24 17:12 JeanChristopheMorinPerso

In the short-term, we could can consider changing those lines:

https://github.com/mamba-org/mamba/blob/c9c45806e214841b391f7349e7caf182c5fac078/libmamba/src/solver/libsolv/database.cpp#L112-L123

to add such an hard-coded logic for now:

const int level = 3;
::pool_setdebuglevel(pool().raw(), level);

Alternatively discussions in #2288 need to be resumed.

Jan 02 '25 16:01 jjerphan

Now that #4016 is merged, there is a way to provide a completely custom implementation of the logging system to libmamba to use, which in theory should help with this issue.

However, we didnt yet provide a way for python code to provide their own log-handler, it's only possible in C++ for now. This is planned but not started yet.

Oct 20 '25 13:10 Klaim