RFC-0026-logging-system

Open kurtamohler opened this issue 2 years ago • 9 comments

RFC for a consistent C++/Python message logging system in PyTorch

Feature was requested in https://github.com/pytorch/pytorch/issues/72948

kurtamohler avatar Jun 22 '22 21:06 kurtamohler

Seems like a good proposal in general, @kurtamohler. I think it'd be helpful to expand this to the next level of detail and show how some of these scenarios would work, and how errors and warnings can originate in both Python and C++.

mruberry avatar Jun 27 '22 15:06 mruberry

Would it be at all helpful if I include a somewhat full description (or links to documentation when possible) for the current message APIs in PyTorch? I am going to write notes on it regardless for my own reference, and I can include it here if it's a good idea.

EDIT: My notes on the current messaging APIs are here: https://github.com/kurtamohler/notes/blob/main/pytorch-messaging-api/current_messaging_api.md

kurtamohler avatar Jul 05 '22 22:07 kurtamohler

> Would it be at all helpful if I include a somewhat full description (or links to documentation when possible) for the current message APIs in PyTorch? I am going to write notes on it regardless for my own reference, and I can include it here if it's a good idea.
>
> EDIT: My notes on the current messaging APIs are here: https://github.com/kurtamohler/notes/blob/main/pytorch-messaging-api/current_messaging_api.md

Yep, that's a great idea

mruberry avatar Jul 11 '22 19:07 mruberry

I've added a lot more detail, including a description of how message logging is currently done in PyTorch, and fixed some of the things brought up in discussion so far.

I haven't described all of the APIs for creating messages in the new system yet--I will do that next

kurtamohler avatar Jul 14 '22 02:07 kurtamohler

> I've added a lot more detail, including a description of how message logging is currently done in PyTorch, and fixed some of the things brought up in discussion so far.
>
> I haven't described all of the APIs for creating messages in the new system yet--I will do that next

Hey @kurtamohler! FYI I'll be on PTO for the next several weeks, and I think @albanD is on PTO currently. So this may take some time to respond to.

mruberry avatar Jul 14 '22 15:07 mruberry

We have torch.monitor (https://github.com/pytorch/rfcs/pull/30), which hasn't seen much usage. Perhaps we can reuse some of it for the new logging system.

edward-io avatar Jul 30 '22 05:07 edward-io

How is one supposed to selectively change the log verbosity of a specific subsystem? And how would a subsystem author produce logs that are properly scoped?

While having functions like torch.log_info is nice, it would be great if we could construct logger instances for particular modules. For example:

# distributed.py

logger = torch.Logger("torch.distributed")

def broadcast():
    logger.log_info("broadcasting stuff")

Finally, is there a reason we can't integrate with Python's logging package, so that users need to do very little to leverage this?
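For reference, here is a minimal sketch of how the per-module logger idea above could look using only Python's standard logging package. The logger names "torch.distributed" and "torch.autograd" and the `ListHandler` helper are assumptions made for illustration; none of this is an existing or proposed PyTorch API.

```python
import logging

# Hypothetical subsystem loggers; the names mirror module paths and are
# not an existing PyTorch API.
distributed_logger = logging.getLogger("torch.distributed")
autograd_logger = logging.getLogger("torch.autograd")


class ListHandler(logging.Handler):
    """Collects formatted records in a list so the result is easy to inspect."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(f"{record.name}: {record.getMessage()}")


handler = ListHandler()
torch_logger = logging.getLogger("torch")
torch_logger.addHandler(handler)
torch_logger.setLevel(logging.WARNING)      # default for everything under "torch"
distributed_logger.setLevel(logging.INFO)   # more verbose for one subsystem only


def broadcast():
    distributed_logger.info("broadcasting stuff")


def backward():
    autograd_logger.info("running backward")


broadcast()  # recorded: "torch.distributed" is set to INFO
backward()   # dropped: "torch.autograd" inherits WARNING from "torch"
```

Because logger names are hierarchical, a user could raise or lower verbosity for one subsystem with a single `setLevel` call while every other subsystem keeps the default inherited from the "torch" logger.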

kumpera avatar Aug 02 '22 20:08 kumpera

@kumpera, good questions.

> How is one supposed to selectively change the log verbosity of a specific subsystem?

At the moment, the general idea is that users will use a filter to silence messages. The specific API for that hasn't been written down yet. In the case of warnings, though, Python's warnings module already does this. The missing detail is how the user would silence info messages, and I imagine we would want something with an interface very similar to the warnings module's filter.
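As a rough illustration of the existing behavior for warnings, the sketch below silences one warning class with the standard warnings module while leaving another visible. The classes `TorchWarning`, `DistributedWarning`, and `AutogradWarning` are hypothetical names invented for this example, not classes defined by PyTorch or by the RFC.

```python
import warnings


# Hypothetical message classes, for illustration only.
class TorchWarning(UserWarning):
    pass


class DistributedWarning(TorchWarning):
    pass


class AutogradWarning(TorchWarning):
    pass


def emit_messages():
    warnings.warn("slow collective", DistributedWarning)
    warnings.warn("non-differentiable op", AutogradWarning)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")                      # show everything by default
    warnings.simplefilter("ignore", DistributedWarning)  # silence one class
    emit_messages()

# Names of the warning classes that made it past the filters.
seen = [w.category.__name__ for w in caught]
```

Filters added later take precedence here because `simplefilter` inserts at the front of the filter list by default, so the "ignore" rule is checked before the catch-all "always" rule. A filter for info messages with a similar interface would presumably need an equivalent ordering rule.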

> And how would a subsystem author produce logs that are properly scoped?

Authors would use the appropriate message class. If an applicable message class doesn't exist yet, they should add it.

> While having functions like torch.log_info is nice, it would be great if we could construct logger instances for particular modules.

That might be a good idea. What do you think @albanD, @mruberry ?

> Finally, is there a reason we can't integrate with Python's logging package, so that users need to do very little to leverage this?

I don't know much about it. I'll read up on it and get back to you.

kurtamohler avatar Aug 04 '22 14:08 kurtamohler

> And how would a subsystem author produce logs that are properly scoped?
>
> Authors would use the appropriate message class. If an applicable message class doesn't exist yet, they should add it.

Sorry for not being specific here. What I meant by scoping is identifying which subsystem is producing a given log message.

For example, if one is troubleshooting an issue in the backward pass of a distributed model, they would enable logging for the "autograd" and "distributed" subsystems.

I guess this boils down to whether log messages would carry a subsystem tag that can be used as part of filtering.
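To make the question concrete, here is a minimal sketch of tag-based filtering built on Python's standard logging package. The `subsystem` attribute, the `SubsystemFilter` class, the `log_info` helper, and the logger name are all assumptions made up for this example, not part of the RFC.

```python
import logging


class SubsystemFilter(logging.Filter):
    """Passes only records whose `subsystem` tag is in the enabled set."""

    def __init__(self, enabled):
        super().__init__()
        self.enabled = set(enabled)

    def filter(self, record):
        return getattr(record, "subsystem", None) in self.enabled


class ListHandler(logging.Handler):
    """Stores (subsystem, message) pairs so the result is easy to inspect."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append((record.subsystem, record.getMessage()))


logger = logging.getLogger("tagged_messages_demo")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.propagate = False

handler = ListHandler()
# Enable only the two subsystems relevant to the troubleshooting scenario.
handler.addFilter(SubsystemFilter({"autograd", "distributed"}))
logger.addHandler(handler)


def log_info(subsystem, message):
    # `extra` attaches the tag to the log record as an attribute.
    logger.info(message, extra={"subsystem": subsystem})


log_info("autograd", "computing gradients")
log_info("distributed", "all-reducing gradients")
log_info("jit", "compiling graph")  # filtered out: "jit" is not enabled
```

With a tag carried on every message, enabling a subsystem becomes a matter of adding its name to the filter's set, independent of which message class or severity the message uses.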

kumpera avatar Aug 06 '22 15:08 kumpera