go-mysql icon indicating copy to clipboard operation
go-mysql copied to clipboard

Add better support for thread id

Open winmutt opened this issue 6 years ago • 10 comments

Thread_id was inadvertently (?) picked up in NumberMetrics but not being parsed out of UserHost. This change will parse it out of UserHost and add it to NumberMetrics without significantly impacting the expected output.

winmutt avatar Sep 12 '18 21:09 winmutt

Codecov Report

Merging #38 into master will increase coverage by 0.08%. The diff coverage is 79.31%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #38      +/-   ##
==========================================
+ Coverage   67.02%   67.11%   +0.08%     
==========================================
  Files           7        7              
  Lines        1389     1411      +22     
==========================================
+ Hits          931      947      +16     
- Misses        348      352       +4     
- Partials      110      112       +2
Impacted Files Coverage Δ
log/log.go 100% <ø> (ø) :arrow_up:
event/class.go 85.71% <100%> (+0.93%) :arrow_up:
log/slow/parser.go 71.79% <100%> (+0.74%) :arrow_up:
event/aggregator.go 82.69% <33.33%> (-10.34%) :arrow_down:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update f5cfaf6...b6be54d. Read the comment docs.

codecov[bot] avatar Sep 12 '18 21:09 codecov[bot]

Hi @winmutt, thank you for your contribution. The only issue is that Thread_id is not really a NumberMetric. NumberMetrics can be aggregated e.g. Rows_examined = 2 and Rows_examined = 6 produce average metric 4. Aggregation that produces averages shouldn't be however applied to Thread_id. The aggregate of Thread_ids can be a list of those ids, not average. I would suggest adding ThreadID as separate field in Event struct for now. What do you think?

p.s. May I ask in which project/for what you are using this library?

arvenil avatar Sep 18 '18 14:09 arvenil

@arvenil , thanks for the feedback. This had been my initial intention (I even did the work as such and had to roll it back), but was concerned with changing the interface, eg the data is already being caught in NumberMetrics. I'll happily redo that work and move it back out to Event. Do you think the existing behavior should remain in order to ensure backwards compatibility?

We have a need to parse and aggregate slow query logs by thread id, rather than checksum/fingerprint.

winmutt avatar Sep 18 '18 17:09 winmutt

@winmutt Hey.

We have a need to parse and aggregate slow query logs by thread id, rather than checksum/fingerprint.

I would say, just finish your implementation and I will accept this PR (or version with Event.ThreadID) when it will be ready. Do you plan to use Aggregator too?

arvenil avatar Sep 18 '18 18:09 arvenil

I have not looked at aggregator yet, but it sounds like I will.

winmutt avatar Sep 19 '18 14:09 winmutt

So this doesn't get closed quite yet, I'll be starting work on it again shortly, been side tracked by a few things.

winmutt avatar Oct 01 '18 19:10 winmutt

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
3 out of 4 committers have signed the CLA.

:white_check_mark: winmutt
:white_check_mark: arvenil
:white_check_mark: gywndi
:x: Rolf Martin-Hoster


Rolf Martin-Hoster seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

CLAassistant avatar Oct 16 '18 22:10 CLAassistant

@arvenil Is that what you envisioned with thread_id? Sorry my PR got a little sloppy, having to go back and fix the author email for the match the CLA I managed to pick up some stray old commits.

Also, here is the related project I am working on https://github.com/winmutt/slow_log_parse. So far it was just a POC but now I am focused on replacing our use of pt-query-digest with it (but not in entirety at all). In testing it was ~5x faster and used ~4x less memory than pt-query-digest which we use heavily sampling random 10min every hour across ~1200 db instances.

winmutt avatar Oct 16 '18 23:10 winmutt

I've been looking more closely at the aggregator output and realized I need to add ThreadID there as well, not just the raw events. Incoming commit.

winmutt avatar Oct 18 '18 14:10 winmutt

@arvenil thanks so much for getting back and may your future adventures be grand! I did not consider a list of ID's was shooting for best effort.

@rnovikovP would be happy to help answer any questions you have around this PR.

winmutt avatar Nov 16 '18 00:11 winmutt