Versions chart broken if load balancer is used?
The versions chart uses the haproxy.log to determine the Git client versions in use. This has two disadvantages:
- only HTTPS Git connections are monitored
- if a load balancer is used then the source IP is 127.0.0.1 in the haproxy log and therefore we cannot distinguish different machines
This snippet should get the versions from the audit log:
$ zcat -f /var/log/github-audit.log.1* | perl -ne 'print if s/.*agent=git\/(\d+(?:\.\d+){0,2}).*"user_id":(\d+).*/\2\t\1/' | sort | uniq | perl -lape 's/\d+ *//' | sort -r -V | uniq -ic
This should also include SSH connections. However, I wonder whether we should take any Git client into consideration with a regex like this (note: I use the user instead of the IP here too):
$ zcat -f /var/log/github-audit.log.1* | perl -ne 'print if s/.*agent=git\/(\d+(?:\.\d+){0,2}).*"user_id":(\d+).*/\2\t\1/' | sort | uniq | perl -lape 's/\d+ *//' | sort -r -V | uniq -ic
Going even further, we could look for other agents:
$ zcat -f /var/log/github-audit.log.1* | perl -ne 'print if s/.*agent=([^\/" ]+\/\d+(?:\.\d+){0,2}).*"user_id":(\d+).*/\2\t\1/' | sort | uniq | perl -lape 's/\d+ *//' | sort | uniq -ic
This would also record JGit clients and the output would look like this:
1 git/2.9.0
2 git/2.9.2
3 git/2.9.3
4 JGit/4.5.0
5 JGit/4.6.1
6 JGit/4.8.0
@pluehne Could that easily be made compatible with your versions check logic? Could everything that is not git/* be considered "unknown"?
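One way to read the "unknown" idea, sketched as a shell function. The name `classify_agent` is hypothetical, and how the versions check actually consumes the data is not shown in this thread, so treat this purely as an illustration of the rule "git/* keeps its version, everything else becomes unknown".

```shell
# classify_agent (hypothetical helper): map an "<agent>/<version>" string
# to a plain Git version, or to "unknown" for any non-git agent.
classify_agent() {
  case "$1" in
    git/*) printf '%s\n' "${1#git/}" ;;  # strip the "git/" prefix
    *)     printf 'unknown\n' ;;          # JGit and all other agents
  esac
}
```

With this rule, `classify_agent git/2.9.3` prints `2.9.3` and `classify_agent JGit/4.5.0` prints `unknown`.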
@larsxschneider: I just realized that you addressed most of this issue in #185, which we already merged.
However, I like the idea of recording the client agents in git-versions.tsv. We could extend the chart to support a mixed data format (with pure versions or versions prefixed by the client agent). Alternatively, we could also write a data migration that prefixes all older entries with git/ to make them compatible (would allow for a cleaner implementation on the dashboard side). What do you think?
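A minimal sketch of what such a migration could look like. The column layout of git-versions.tsv is not described in this thread, so the version column is passed in as an argument; the function name `prefix_git_versions` is made up for illustration. Only fields that start with a digit are touched, which leaves a header row and already prefixed entries alone and makes the step safe to run twice.

```shell
# prefix_git_versions (hypothetical helper): read a TSV on stdin and prefix
# bare versions in the given column with "git/".
# $1: 1-based index of the version column (an assumption; check the real file).
prefix_git_versions() {
  awk -F'\t' -v OFS='\t' -v col="$1" '
    $col ~ /^[0-9]/ { $col = "git/" $col }  # bare version such as 2.9.0
    { print }                                # all other rows pass through
  '
}
```

Assuming the version lived in the second column, it could be run as `prefix_git_versions 2 < git-versions.tsv > git-versions.migrated.tsv`, where the output file name is also just an example.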