solana icon indicating copy to clipboard operation
solana copied to clipboard

Logging format to include thread name if available

Open apfitzge opened this issue 5 months ago • 2 comments

Problem

  • We often have multiple threads performing similar actions with similar logs
  • There's no current way to determine which thread is logging

Summary of Changes

  • Use a custom formatter to log the thread name, if available.

Example:

[2024-01-23T18:51:28.000226046Z INFO solana_faucet::faucet] [main] Faucet started. Listening on: 0.0.0.0:9900
[2024-01-23T18:51:28.000236416Z INFO solana_faucet::faucet] [main] Faucet account address: 4Qjpy9j3FM2K7QdA8dfEujtmzJ9PqLjXb3MDW5keM3vQ
  • Much of our logging comes from submitted metrics, which are all logged from a single thread: solMetricsAgent. This means the added thread-name is not that useful, except for direct logging. Example below:
[2024-01-23T18:43:46.623877094Z INFO solana_metrics::metrics] [solMetricsAgent] datapoint: qos-service-errors id=1i bank_slot=186i retried_txs_per_block_limit_count=0i retried_txs_per_vote_limit_count=0i retried_txs_per_account_limit_count=0i retried_txs_per_account_data_block_limit_count=0i dropped_txs_per_account_data_total_limit_count=0i
[2024-01-23T18:43:46.623901330Z INFO solana_metrics::metrics] [solMetricsAgent] datapoint: optimistic_slot_elapsed average_elapsed_ms=353i
[2024-01-23T18:43:46.623954581Z INFO solana_metrics::metrics] [solMetricsAgent] datapoint: optimistic_slot slot=185i

Fixes #

apfitzge avatar Jan 23 '24 18:01 apfitzge

I have a branch which will log datapoints from the submitting thread so the proper thread-name appears: metrics-logged-from-submitting-thread

example:

[2024-01-23T19:28:46.632093987Z INFO solana_core::replay_stage] [solReplayStage] datapoint: replay-loop-voting-stats generate_vote_us=210i update_commitment_cache_us=6i
[2024-01-23T19:28:46.632112522Z INFO solana_core::replay_stage] [solReplayStage] datapoint: replay-loop-timing-stats total_elapsed_us=1005000i collect_frozen_banks_elapsed=756i compute_bank_stats_elapsed=3487i select_vote_and_reset_forks_elapsed=9i start_leader_elapsed=68
26i reset_bank_elapsed=116i voting_elapsed=2338i select_forks_elapsed=1i compute_slot_stats_elapsed=24i generate_new_bank_forks_elapsed=33311i replay_active_banks_elapsed=22552i process_ancestor_hashes_duplicate_slots_elapsed=2i process_duplicate_confirmed_slots_elapsed=1
71i process_unfrozen_gossip_verified_vote_hashes_elapsed=2i process_popular_pruned_forks_elapsed=2i wait_receive_elapsed=864820i heaviest_fork_failures_elapsed=0i bank_count=3i process_duplicate_slots_elapsed=3i repair_correct_slots_elapsed=5599i retransmit_not_propagated
_elapsed=0i generate_new_bank_forks_read_lock_us=0i generate_new_bank_forks_get_slots_since_us=29935i generate_new_bank_forks_loop_us=1678i generate_new_bank_forks_write_lock_us=0i replay_blockstore_us=0i
[2024-01-23T19:28:46.754692580Z INFO solana_poh::poh_recorder] [solPohTickProd] poh_record: max_tick_height 31488 reached, clearing working_bank 491
[2024-01-23T19:28:46.754712518Z INFO solana_poh::poh_recorder] [solPohTickProd] datapoint: leader-slot-start-to-cleared-elapsed-ms slot=491i elapsed=349i

apfitzge avatar Jan 23 '24 19:01 apfitzge

Codecov Report

Attention: 35 lines in your changes are missing coverage. Please review.

Comparison is base (a5c470d) 81.7% compared to head (35e156b) 81.6%. Report is 130 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff            @@
##           master   #34910     +/-   ##
=========================================
- Coverage    81.7%    81.6%   -0.1%     
=========================================
  Files         826      827      +1     
  Lines      223413   223855    +442     
=========================================
+ Hits       182614   182754    +140     
- Misses      40799    41101    +302     

codecov[bot] avatar Jan 23 '24 20:01 codecov[bot]