incubator-uniffle

[Feature] Introduce the taskId in the spark client side log

Open zuston opened this issue 3 years ago • 8 comments

Motivation

When multiple tasks run in the same executor at the same time, it is hard to analyze the RSS logs belonging to a specific task. To solve this, the RSS client code should include the task ID in its log output.

How to do

I think we should use MDC directly to carry the context info.

POC screenshot

image

zuston avatar Sep 09 '22 09:09 zuston

What do you think? @jerqi

zuston avatar Sep 09 '22 09:09 zuston

MDC seems heavy for us.

jerqi avatar Sep 14 '22 08:09 jerqi

Could you share what you are most concerned about? The cost of refactoring, or performance?

The refactoring cost looks low to me. We just need to inject some info into MDC, which is held in internal thread-local variables.
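To make the thread-local idea concrete, here is a minimal sketch of how such a context could work. `TaskLogContext` is a hypothetical, simplified stand-in written for illustration; it is not the real `org.slf4j.MDC` class and not code from the RSS client. With a real logging backend one would presumably call `MDC.put("taskId", ...)` at task start, `MDC.remove("taskId")` at task end, and reference the key in the layout pattern with `%X{taskId}`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for a logging MDC (NOT the real org.slf4j.MDC).
final class TaskLogContext {
    // Each thread gets its own private map, so tasks running concurrently
    // in one executor do not overwrite each other's values.
    private static final ThreadLocal<Map<String, String>> CONTEXT =
        ThreadLocal.withInitial(HashMap::new);

    private TaskLogContext() {}

    static void put(String key, String value) {
        CONTEXT.get().put(key, value);
    }

    static String get(String key) {
        return CONTEXT.get().get(key);
    }

    // Drops the calling thread's map; call this when the task finishes.
    static void clear() {
        CONTEXT.remove();
    }

    // Prefixes a message with the calling thread's task id, roughly what a
    // %X{taskId} layout pattern would do in a real logging backend.
    static String format(String message) {
        String taskId = get("taskId");
        return "[taskId=" + (taskId == null ? "-" : taskId) + "] " + message;
    }
}

class TaskLogContextDemo {
    public static void main(String[] args) throws InterruptedException {
        // Two "tasks" on separate threads, each tagging its own log lines.
        Thread first = new Thread(() -> {
            TaskLogContext.put("taskId", "1");
            System.out.println(TaskLogContext.format("sending shuffle data"));
            TaskLogContext.clear();
        });
        Thread second = new Thread(() -> {
            TaskLogContext.put("taskId", "2");
            System.out.println(TaskLogContext.format("sending shuffle data"));
            TaskLogContext.clear();
        });
        first.start();
        second.start();
        first.join();
        second.join();
    }
}
```

The call sites that log would not need to change in this design; only the task entry and exit points set and clear the value.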

zuston avatar Sep 14 '22 08:09 zuston

Could we adjust the log4j to print thread ID?

jerqi avatar Sep 14 '22 10:09 jerqi

> Could we adjust the log4j to print thread ID?

This approach looks a bit ugly when a thread pool is used, since the thread ID / task ID has to be propagated across multiple threads.
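The thread-pool problem can be sketched as follows. A pooled worker thread has its own ID and no knowledge of the task that submitted the work, so the task ID has to be captured on the submitting thread and restored inside the worker. `TaskIdPropagation`, `TASK_ID`, and `withTaskId` are hypothetical names used only for this illustration, and a plain `ThreadLocal` stands in for the MDC. With SLF4J, the same wrapping could presumably be built on `MDC.getCopyOfContextMap()` and `MDC.setContextMap()`.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class TaskIdPropagation {
    // Hypothetical per-thread holder standing in for the MDC.
    static final ThreadLocal<String> TASK_ID = new ThreadLocal<>();

    private TaskIdPropagation() {}

    // Captures the submitting thread's task id and restores it in the worker.
    static <T> Callable<T> withTaskId(Callable<T> work) {
        final String captured = TASK_ID.get(); // read on the submitting thread
        return () -> {
            String previous = TASK_ID.get();
            TASK_ID.set(captured);
            try {
                return work.call();
            } finally {
                // Restore the old value so a pooled thread does not leak
                // this task id into the next job it runs.
                if (previous == null) {
                    TASK_ID.remove();
                } else {
                    TASK_ID.set(previous);
                }
            }
        };
    }

    // Runs one job on a pool and returns the task id the worker observed.
    static String observedInPool(String taskId, boolean propagate) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            TASK_ID.set(taskId);
            Callable<String> job = () -> String.valueOf(TASK_ID.get());
            Callable<String> toRun = propagate ? withTaskId(job) : job;
            return pool.submit(toRun).get();
        } finally {
            TASK_ID.remove();
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Without wrapping, the worker thread has no task id.
        System.out.println(observedInPool("7", false));
        // With wrapping, the worker sees the submitter's task id.
        System.out.println(observedInPool("7", true));
    }
}
```

Printing only the thread ID in the log pattern would not help in the first case, because the worker's thread ID says nothing about which task submitted the job.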

zuston avatar Sep 14 '22 10:09 zuston

What do you think? @jerqi

zuston avatar Sep 20 '22 02:09 zuston

It's a little abstract for me. Maybe you can raise a draft PR and let me take a look.

jerqi avatar Sep 26 '22 11:09 jerqi

> It's a little abstract for me. Maybe you can raise a draft PR and let me take a look.

Yes.

zuston avatar Sep 27 '22 02:09 zuston