Task logs UI is empty for the new Indexer daemon.
Affected Version
v0.23.0 and v0.22.1
Description
The task logs UI is empty for the new Indexer daemon. This is because all tasks run inside a single Indexer process, so their output goes to the application log via log4j instead of to per-task log files.
Preferably, each task's log would again be written into its dedicated task folder, just like the MiddleManager does, so that the logs can be displayed in the UI.
The default log4j configuration outputs to files.
For the Indexer you need to change the log4j configuration to print logs to the console.
Unfortunately, there are quite a number of classes to whitelist in the log4j2 file.
org.apache.druid.indexing.worker.WorkerTaskManager # important
org.apache.druid.segment.loading.SegmentLocalCacheManager
org.apache.druid.indexing.common.task.AbstractBatchIndexTask
org.apache.druid.discovery.DruidLeaderClient
org.apache.druid.indexing.common.task.batch.parallel.ParallelIndexSupervisorTask
org.apache.druid.indexing.common.task.batch.parallel.TaskMonitor # important
org.apache.druid.indexing.common.task.batch.parallel.ParallelIndexPhaseRunner
org.apache.druid.segment.realtime.appenderator.AppenderatorImpl
org.apache.parquet.hadoop.InternalParquetRecordReader
org.apache.druid.indexing.overlord.ThreadingTaskRunner
So, ThreadingTaskRunner.java already knows the taskDir:
final File taskFile = new File(taskDir, "task.json");
If ThreadingTaskRunner.java could tee the logs from the classes above into:
final File logFile = new File(taskDir, "log");
then the Indexer should be backward compatible with the MiddleManager, yes?
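To make the tee idea concrete, here is a minimal plain-Java sketch. It is not Druid code: `TaskLogTee` and its methods are hypothetical names, and it assumes a task emits its messages on the same thread that called `begin`. Real tasks log through log4j and may spawn their own worker threads, which this sketch would not capture.

```java
import java.io.File;
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical helper, not part of Druid: copies messages emitted on the
// current thread into <taskDir>/log while a task is running on that thread.
final class TaskLogTee {
    // One writer per thread, so concurrent tasks do not share a log file.
    private static final ThreadLocal<PrintWriter> CURRENT = new ThreadLocal<>();

    // Call at the start of a task, on the task's own thread.
    static void begin(File taskDir) throws IOException {
        File logFile = new File(taskDir, "log");
        CURRENT.set(new PrintWriter(
            Files.newBufferedWriter(logFile.toPath(), StandardCharsets.UTF_8)));
    }

    // Always writes to the shared application output; additionally writes to
    // the task's own log file if this thread is inside begin()/end().
    static void log(String message) {
        System.out.println(message);
        PrintWriter out = CURRENT.get();
        if (out != null) {
            out.println(message);
        }
    }

    // Call when the task finishes, on the same thread that called begin().
    static void end() {
        PrintWriter out = CURRENT.get();
        if (out != null) {
            out.close();
            CURRENT.remove();
        }
    }
}
```

Because the writer is held in a `ThreadLocal`, two tasks running at the same time in one JVM each get their own file. The open question for a real implementation is how to hook this into log4j and how to cover threads that a task starts itself.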
Or, another way to do this is for the Indexer to publish taskDir as a system property:
System.setProperty("indexerTaskDir", taskDir);
and then log4j2.xml picks it up using:
appender.rolling.fileName = ${sys:indexerTaskDir}/log
And then whitelist all those class names above inside log4j2.xml
But I don't know whether Log4j constantly monitors changes to system properties, or whether it just loads the properties once during boot.
What do y'all think?
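Ah, the second idea is just not going to work concurrently: a system property is global to the JVM, and the Indexer runs several tasks in the same JVM at once, so they would overwrite each other's indexerTaskDir.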
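A small sketch of why the system-property approach breaks down with more than one task. The class name and the `/tmp/task-*` paths are made up for illustration; only `indexerTaskDir` comes from the proposal above. The two threads are run one after the other so the result is deterministic, but both "tasks" are assumed to still be logging afterwards.

```java
// Hypothetical illustration: System properties are shared by every thread in
// the JVM, so the second task's value replaces the first task's value.
final class SystemPropertyClash {
    static String runTwoTasks() throws InterruptedException {
        Thread first = new Thread(
            () -> System.setProperty("indexerTaskDir", "/tmp/task-1"));
        Thread second = new Thread(
            () -> System.setProperty("indexerTaskDir", "/tmp/task-2"));

        // Start the tasks in a fixed order to keep the outcome predictable.
        first.start();
        first.join();
        second.start();
        second.join();

        // Whatever reads the property now, on any thread, sees one value.
        return System.getProperty("indexerTaskDir");
    }
}
```

After both threads have run, `runTwoTasks()` returns `/tmp/task-2`, so anything resolving `${sys:indexerTaskDir}` on behalf of the first task would be pointed at the second task's directory.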
The ForkingTaskRunner has this defined: final File logFile = new File(taskDir, "log");
What's stopping ThreadingTaskRunner from doing the same thing?
ForkingTaskRunner starts a new OS process and directs that process's stdout and stderr to the logFile. We are relying on the fact that all output for an OS process goes through the same streams and that those streams can be directed anywhere we want. With ThreadingTaskRunner, all tasks share one process, so we don't have a way to direct the logging for individual tasks/threads etc.
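For reference, process-level redirection of this kind can be expressed with the standard `ProcessBuilder` API. This is only a sketch of the general technique, not ForkingTaskRunner's actual code; the class and method names are hypothetical.

```java
import java.io.File;
import java.util.List;

// Hypothetical sketch: configure a child process so that everything it
// writes to stdout or stderr is appended to <taskDir>/log.
final class ForkedTaskLogging {
    // Builds the ProcessBuilder but does not start the process.
    static ProcessBuilder withTaskLog(List<String> command, File taskDir) {
        File logFile = new File(taskDir, "log");
        return new ProcessBuilder(command)
            // Merge stderr into stdout so a single stream carries all output.
            .redirectErrorStream(true)
            // Send that single stream to the task's log file.
            .redirectOutput(ProcessBuilder.Redirect.appendTo(logFile));
    }
}
```

The redirect applies to the whole child process, which is why it needs no cooperation from the logging framework. There is no equivalent switch for a single thread inside a shared JVM.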
This issue has been marked as stale due to 280 days of inactivity. It will be closed in 4 weeks if no further activity occurs. If this issue is still relevant, please simply write any comment. Even if closed, you can still revive the issue at any time or discuss it on the project mailing list. Thank you for your contributions.
This issue has been closed due to lack of activity. If you think that is incorrect, or the issue requires additional review, you can revive the issue at any time.