clouddriver icon indicating copy to clipboard operation
clouddriver copied to clipboard

feat(kubernetes): add visibility to kubectl job executor by adding lo…

Open apoorvmahajan opened this issue 4 years ago • 2 comments

…gs and extending kato task object to include stdout and stderr info

Purpose

  • We have seen a few transient errors spanning across Orca and Clouddriver when the Clouddriver's Kubernetes Kubectl Job Executor attempts to perform a task (mainly the Deploy Manifest task). To gain more visibility into what was the last task action, we felt the need to extend the Kato Task object to include stdOut and stdError information in it as the current logs in the system do not capture that effectively.

This PR is meant to enable us to gain more visibility by adding more logs to the Deploy Manifest operation as well as to the KubectlJobExecutor. It also extends the kato task object to include stdOut and stdError information for all the manifests being processed in such a task.

Changes

  • Add logging statements and update existing logs for various common Kubernetes tasks to include more information about which manifest is being executed by the K8s job executor and for which account
  • Extend Kato task object to include stdout and stderr info for the task in question
  • Add implementation for the updated kato task object across in memory, redis and sql implementations

apoorvmahajan avatar Jan 07 '21 20:01 apoorvmahajan

Task Cleanup output:

-01-27 03:11:53.421 DEBUG 1 --- [ecutionAction-4] c.n.s.c.cache.LoggingInstrumentation     : com.netflix.spinnaker.clouddriver.core.provider.CoreProvider:SqlTaskCleanupAgent starting
2021-01-27 03:11:53.421 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            : Executing query          : select id, task_id from task_states where (state in (?, ?) and created_at <= ?)
2021-01-27 03:11:53.421 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            : -> with bind values      : select id, task_id from task_states where (state in ('COMPLETED', 'FAILED') and created_at <= 1611371513421)
2021-01-27 03:11:53.421 DEBUG 1 --- [ecutionAction-4] o.s.jdbc.datasource.DataSourceUtils      : Fetching JDBC Connection from DataSource
2021-01-27 03:11:53.428 DEBUG 1 --- [ecutionAction-4] com.zaxxer.hikari.pool.PoolBase          : tasks - Reset (isolation) on connection com.mysql.cj.jdbc.ConnectionImpl@4b4b39dd
2021-01-27 03:11:53.428 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            : Fetched result           : +--------------------------+--------------------------+
2021-01-27 03:11:53.428 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            :                          : |id                        |task_id                   |
2021-01-27 03:11:53.428 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            :                          : +--------------------------+--------------------------+
2021-01-27 03:11:53.428 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            :                          : |01EX0WD3MZHSSXKK1PWKAGV4G2|01EX0WD036534XNM9ZPNG2YME4|
2021-01-27 03:11:53.428 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            :                          : +--------------------------+--------------------------+
2021-01-27 03:11:53.428 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            : Fetched row(s)           : 1
2021-01-27 03:11:53.430 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            : Executing query          : select id from task_states where task_id in (?)
2021-01-27 03:11:53.430 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            : -> with bind values      : select id from task_states where task_id in ('01EX0WD036534XNM9ZPNG2YME4')
2021-01-27 03:11:53.430 DEBUG 1 --- [ecutionAction-4] o.s.jdbc.datasource.DataSourceUtils      : Fetching JDBC Connection from DataSource
2021-01-27 03:11:53.433 DEBUG 1 --- [ecutionAction-4] com.zaxxer.hikari.pool.PoolBase          : tasks - Reset (isolation) on connection com.mysql.cj.jdbc.ConnectionImpl@4b4b39dd
2021-01-27 03:11:53.433 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            : Fetched result           : +--------------------------+
2021-01-27 03:11:53.433 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            :                          : |id                        |
2021-01-27 03:11:53.433 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            :                          : +--------------------------+
2021-01-27 03:11:53.433 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            :                          : |01EX0WD3MZHSSXKK1PWKAGV4G2|
2021-01-27 03:11:53.433 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            :                          : |01EX0WD03831ZN7JZ4D0TEN72A|
2021-01-27 03:11:53.433 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            :                          : |01EX0WD08C1PTKNS6SPDKY3JZS|
2021-01-27 03:11:53.433 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            :                          : |01EX0WD09E8TACREZ7Z43YBRSZ|
2021-01-27 03:11:53.433 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            :                          : |01EX0WD0A375A654FYQPMARBMZ|
2021-01-27 03:11:53.433 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            :                          : +--------------------------+
2021-01-27 03:11:53.433 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            :                          : |...record(s) truncated...
2021-01-27 03:11:53.433 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            : Fetched row(s)           : 16
2021-01-27 03:11:53.434 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            : Executing query          : select id from task_results where task_id in (?)
2021-01-27 03:11:53.434 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            : -> with bind values      : select id from task_results where task_id in ('01EX0WD036534XNM9ZPNG2YME4')
2021-01-27 03:11:53.434 DEBUG 1 --- [ecutionAction-4] o.s.jdbc.datasource.DataSourceUtils      : Fetching JDBC Connection from DataSource
2021-01-27 03:11:53.437 DEBUG 1 --- [ecutionAction-4] com.zaxxer.hikari.pool.PoolBase          : tasks - Reset (isolation) on connection com.mysql.cj.jdbc.ConnectionImpl@4b4b39dd
2021-01-27 03:11:53.437 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            : Fetched result           : +--------------------------+
2021-01-27 03:11:53.437 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            :                          : |id                        |
2021-01-27 03:11:53.437 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            :                          : +--------------------------+
2021-01-27 03:11:53.437 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            :                          : |01EX0WD3M0GH7R5H4ZNCX0MYXD|
2021-01-27 03:11:53.437 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            :                          : +--------------------------+
2021-01-27 03:11:53.437 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            : Fetched row(s)           : 1
2021-01-27 03:11:53.437 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            : Executing query          : select id from task_outputs where task_id in (?)
2021-01-27 03:11:53.437 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            : -> with bind values      : select id from task_outputs where task_id in ('01EX0WD036534XNM9ZPNG2YME4')
2021-01-27 03:11:53.437 DEBUG 1 --- [ecutionAction-4] o.s.jdbc.datasource.DataSourceUtils      : Fetching JDBC Connection from DataSource
2021-01-27 03:11:53.440 DEBUG 1 --- [ecutionAction-4] com.zaxxer.hikari.pool.PoolBase          : tasks - Reset (isolation) on connection com.mysql.cj.jdbc.ConnectionImpl@4b4b39dd
2021-01-27 03:11:53.440 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            : Fetched result           : +--------------------------+
2021-01-27 03:11:53.440 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            :                          : |id                        |
2021-01-27 03:11:53.440 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            :                          : +--------------------------+
2021-01-27 03:11:53.440 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            :                          : |01EX0WD3H0WQVQ8GSKBVZC7G1Q|
2021-01-27 03:11:53.440 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            :                          : +--------------------------+
2021-01-27 03:11:53.440 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            : Fetched row(s)           : 1
2021-01-27 03:11:53.441  INFO 1 --- [ecutionAction-4] c.n.s.c.sql.SqlTaskCleanupAgent          : Cleaning up 1 completed tasks (16 states, 1 result, 1 output objects)
2021-01-27 03:11:53.443 DEBUG 1 --- [ecutionAction-4] o.s.jdbc.datasource.DataSourceUtils      : Fetching JDBC Connection from DataSource
2021-01-27 03:11:53.444 DEBUG 1 --- [ecutionAction-4] org.jooq.impl.DefaultConnectionProvider  : setting auto commit      : false
2021-01-27 03:11:53.445 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            : Executing query          : delete from task_results where id in (?)
2021-01-27 03:11:53.445 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            : -> with bind values      : delete from task_results where id in ('01EX0WD3M0GH7R5H4ZNCX0MYXD')
2021-01-27 03:11:53.447 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            : Affected row(s)          : 1
2021-01-27 03:11:53.447 DEBUG 1 --- [ecutionAction-4] org.jooq.impl.DefaultConnectionProvider  : commit
2021-01-27 03:11:53.451 DEBUG 1 --- [ecutionAction-4] org.jooq.impl.DefaultConnectionProvider  : setting auto commit      : true
2021-01-27 03:11:53.452 DEBUG 1 --- [ecutionAction-4] com.zaxxer.hikari.pool.PoolBase          : tasks - Reset (isolation) on connection com.mysql.cj.jdbc.ConnectionImpl@4b4b39dd
2021-01-27 03:11:53.454 DEBUG 1 --- [ecutionAction-4] o.s.jdbc.datasource.DataSourceUtils      : Fetching JDBC Connection from DataSource
2021-01-27 03:11:53.455 DEBUG 1 --- [ecutionAction-4] org.jooq.impl.DefaultConnectionProvider  : setting auto commit      : false
2021-01-27 03:11:53.456 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            : Executing query          : delete from task_states where id in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
2021-01-27 03:11:53.456 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            : -> with bind values      : delete from task_states where id in ('01EX0WD3MZHSSXKK1PWKAGV4G2', '01EX0WD03831ZN7JZ4D0TEN72A', '01EX0WD08C1PTKNS6SPDKY3JZS', '01EX0WD09E8TACREZ7Z43YBRSZ', '01EX0WD0A375A654FYQPMARBMZ', '01EX0WD0AHCNRQ28SADN95AF3H', '01EX0WD0BFBB40HCS4K1YZ555M', '01EX0WD0BXHH0KGZ1SRJJG5DFP', '01EX0WD0CCZHH4X8ZENDTP3GGE', '01EX0WD0CS9YMCTGA8RTKZSD3W', '01EX0WD1E1M1VK3F78SKHQDVQE', '01EX0WD1EYBX9CTEAF2G661KYF', '01EX0WD1FDY44A6BPN7QQHVXRK', '01EX0WD3HMEA9SD8Z2T2E1Z1XX', '01EX0WD3J3M5AAT7TJV2XPZCJT', '01EX0WD3KFC0MHCKND7254DKTK')
2021-01-27 03:11:53.458 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            : Affected row(s)          : 16
2021-01-27 03:11:53.459 DEBUG 1 --- [ecutionAction-4] org.jooq.impl.DefaultConnectionProvider  : commit
2021-01-27 03:11:53.462 DEBUG 1 --- [ecutionAction-4] org.jooq.impl.DefaultConnectionProvider  : setting auto commit      : true
2021-01-27 03:11:53.464 DEBUG 1 --- [ecutionAction-4] com.zaxxer.hikari.pool.PoolBase          : tasks - Reset (isolation) on connection com.mysql.cj.jdbc.ConnectionImpl@4b4b39dd
2021-01-27 03:11:53.465 DEBUG 1 --- [ecutionAction-4] o.s.jdbc.datasource.DataSourceUtils      : Fetching JDBC Connection from DataSource
2021-01-27 03:11:53.466 DEBUG 1 --- [ecutionAction-4] org.jooq.impl.DefaultConnectionProvider  : setting auto commit      : false
2021-01-27 03:11:53.467 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            : Executing query          : delete from task_outputs where id in (?)
2021-01-27 03:11:53.467 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            : -> with bind values      : delete from task_outputs where id in ('01EX0WD3H0WQVQ8GSKBVZC7G1Q')
2021-01-27 03:11:53.469 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            : Affected row(s)          : 1
2021-01-27 03:11:53.469 DEBUG 1 --- [ecutionAction-4] org.jooq.impl.DefaultConnectionProvider  : commit
2021-01-27 03:11:53.475 DEBUG 1 --- [ecutionAction-4] org.jooq.impl.DefaultConnectionProvider  : setting auto commit      : true
2021-01-27 03:11:53.476 DEBUG 1 --- [ecutionAction-4] com.zaxxer.hikari.pool.PoolBase          : tasks - Reset (isolation) on connection com.mysql.cj.jdbc.ConnectionImpl@4b4b39dd
2021-01-27 03:11:53.479 DEBUG 1 --- [ecutionAction-4] o.s.jdbc.datasource.DataSourceUtils      : Fetching JDBC Connection from DataSource
2021-01-27 03:11:53.480 DEBUG 1 --- [ecutionAction-4] org.jooq.impl.DefaultConnectionProvider  : setting auto commit      : false
2021-01-27 03:11:53.481 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            : Executing query          : delete from tasks where id in (?)
2021-01-27 03:11:53.481 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            : -> with bind values      : delete from tasks where id in ('01EX0WD036534XNM9ZPNG2YME4')
2021-01-27 03:11:53.483 DEBUG 1 --- [ecutionAction-4] org.jooq.tools.LoggerListener            : Affected row(s)          : 1
2021-01-27 03:11:53.484 DEBUG 1 --- [ecutionAction-4] org.jooq.impl.DefaultConnectionProvider  : commit
2021-01-27 03:11:53.487 DEBUG 1 --- [ecutionAction-4] org.jooq.impl.DefaultConnectionProvider  : setting auto commit      : true

apoorvmahajan avatar Jan 27 '21 03:01 apoorvmahajan

There are additional optimizations that have been discovered as part of running this code in our environments. @apoorvmahajan or myself will be submitting updates in the next couple of weeks sometime.

clanesf avatar Mar 10 '21 19:03 clanesf