MLFlowLogger gives roughly half the throughput of the wandb logger
🚀 Feature Request
I found that training with MLFlowLogger runs at about half the throughput of training with the wandb logger. I noticed that `import mlflow` appears many times in https://github.com/mosaicml/composer/blob/dev/composer/loggers/mlflow_logger.py. Could that be the root cause? Thanks.
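A quick way to check whether repeated import statements are expensive is to time an import of a module that is already loaded. The sketch below is only an illustration: it uses the standard-library `json` module as a stand-in for `mlflow` (an assumption, so that it runs without MLflow installed), and `repeated_import_cost` is a hypothetical helper name, not part of composer.

```python
import importlib
import sys
import timeit


def repeated_import_cost(module_name: str, repeats: int = 100_000) -> float:
    """Return the mean seconds per import once the module is already loaded."""
    # The first import pays the real load cost and caches the module.
    importlib.import_module(module_name)
    assert module_name in sys.modules

    # Later imports are served from the sys.modules cache.
    total = timeit.timeit(
        lambda: importlib.import_module(module_name), number=repeats
    )
    return total / repeats


if __name__ == "__main__":
    # "json" is a stand-in; substitute "mlflow" if it is installed.
    cost = repeated_import_cost("json")
    print(f"mean cost per repeated import: {cost * 1e9:.0f} ns")
```

Python caches loaded modules in `sys.modules`, so if the per-call figure is tiny compared with the time of a training step, the repeated imports are unlikely to explain a 2x throughput gap on their own.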
@viyjy can you please detail your workload so we can try to reproduce this?
I am training the Mosaic diffusion model by following this yaml. I tried replacing the wandb logger with the mlflow logger, but found that this reduces throughput by about 10x.
I realized that the reason might be that there are too many monitors. I therefore kept only [speed_monitor, lr_monitor] (https://github.com/mosaicml/diffusion/blob/main/yamls/hydra-yamls/SD-2-base-256.yaml#L76C3-L76C3), and found that throughput with the wandb logger is still about 2x higher than with the mlflow logger.
Hi @viyjy, with this PR in, I think your throughput issue should be fixed. You can try it out by running off of dev, or wait for a patch release next week.
This should be fixed in 0.16.1! Feel free to reopen if it's still an issue.