Versions chart broken if load balancer is used?

Open larsxschneider opened this issue 6 years ago • 2 comments

The versions chart uses the haproxy.log to determine the Git client versions in use. This has two disadvantages:

  • only HTTPS Git connections are monitored
  • if a load balancer is used then the source IP is 127.0.0.1 in the haproxy log and therefore we cannot distinguish different machines

larsxschneider, Jun 05 '19 10:06

This snippet should get the versions from the audit log:

$ zcat -f /var/log/github-audit.log.1* | perl -ne 'print if s/.*agent=git\/(\d+(?:\.\d+){0,2}).*"user_id":(\d+).*/\2\t\1/' | sort | uniq | perl -lape 's/\d+ *//' | sort -r -V | uniq -ic

This should also include SSH connections. However, I wonder whether we should take any Git client into consideration with a regex like this (note: I use the user ID instead of the IP here too):

$ zcat -f /var/log/github-audit.log.1* | perl -ne 'print if s/.*agent=git\/(\d+(?:\.\d+){0,2}).*"user_id":(\d+).*/\2\t\1/' | sort | uniq | perl -lape 's/\d+ *//' | sort -r -V | uniq -ic

Even further we could look for other agents:

$ zcat -f /var/log/github-audit.log.1* | perl -ne 'print if s/.*agent=([^\/" ]+\/\d+(?:\.\d+){0,2}).*"user_id":(\d+).*/\2\t\1/' | sort | uniq | perl -lape 's/\d+ *//' | sort | uniq -ic

This would also record JGit clients and the output would look like this:

      1 	git/2.9.0
      2 	git/2.9.2
      3 	git/2.9.3
      4 	JGit/4.5.0
      5 	JGit/4.6.1
      6 	JGit/4.8.0

@pluehne Could that easily be made compatible with your versions check logic? Could everything that is not git/* be considered "unknown"?
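
To make the "unknown" idea concrete, here is a rough sketch of how the counts above could be folded so that only git/* entries keep their version. The function name `fold_unknown_agents` is made up for illustration and is not part of hubble; the input is assumed to be the `uniq -c` style output shown above.

```shell
# Hypothetical helper (not part of hubble): collapse every agent that is
# not git/* into a single "unknown" row.
# Input on stdin: "count<whitespace>agent/version" lines, as printed by uniq -c.
fold_unknown_agents() {
  awk '
    $2 ~ /^git\// { print $1 "\t" $2; next }   # keep git/* rows as they are
    { unknown += $1 }                          # sum the counts of all other agents
    END { if (unknown > 0) print unknown "\tunknown" }
  '
}
```

It could be appended to the last pipeline, for example `... | sort | uniq -ic | fold_unknown_agents`. Whether the dashboard should show one "unknown" bucket or one per agent is an open design choice.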

larsxschneider, Jun 05 '19 10:06

@larsxschneider: I just realized that you addressed most of this issue in #185, which we already merged.

However, I like the idea of recording the client agents in git-versions.tsv. We could extend the chart to support a mixed data format (with pure versions or versions prefixed by the client agent). Alternatively, we could write a data migration that prefixes all older entries with git/ to make them compatible (this would allow for a cleaner implementation on the dashboard side). What do you think?
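
For the migration option, a minimal sketch could look like the following. It rests on assumptions that would need checking against the real file: that git-versions.tsv is tab-separated, has a header row, and keeps the version in one known column (column 2 by default here). `prefix_git_versions` is a hypothetical name.

```shell
# Hypothetical migration sketch: prefix bare versions with "git/".
# Assumptions: tab-separated rows, a header in the first row, and the
# version in the column given as $1 (default 2). The real layout may differ.
prefix_git_versions() {
  awk -F '\t' -v OFS='\t' -v col="${1:-2}" '
    NR == 1 { print; next }                                   # pass the header through untouched
    $col != "" && $col !~ /\// { $col = "git/" $col }         # only prefix entries without an agent
    { print }
  '
}
```

Usage would be along the lines of `prefix_git_versions 2 < git-versions.tsv > git-versions.migrated.tsv`. Entries that already carry an agent prefix (anything containing a slash) are left alone, so running it twice should not double-prefix anything.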

pluehne, Jun 24 '19 13:06