Juleps
Pkg3: telemetry
I'd like to see Pkg3 support some kind of anonymized opt-out telemetry.
For example, it could keep track of statistics like how many times Pkg.add has been called for a given package, with a heuristic to not count CI testing. This would give us a more reliable guide to package popularity than GitHub stars.
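One possible shape for the CI heuristic, as a rough sketch in Python for illustration only. The function name and the list of variables are assumptions, not anything Pkg3 defines. It relies on the fact that many CI services set an environment variable such as CI, so a client could skip reporting when one is present:

```python
import os

# Environment variables that common CI services are known to set.
# The list is illustrative, not exhaustive.
CI_ENV_VARS = (
    "CI",
    "CONTINUOUS_INTEGRATION",
    "TRAVIS",
    "APPVEYOR",
    "GITLAB_CI",
    "JENKINS_URL",
)


def looks_like_ci(env=None):
    """Return True if the environment suggests an automated CI run."""
    env = os.environ if env is None else env
    for name in CI_ENV_VARS:
        value = env.get(name, "").strip().lower()
        # Treat empty, "0" and "false" as "not set".
        if value not in ("", "0", "false"):
            return True
    return False
```

A heuristic like this will miss self-hosted or unusual CI setups, so the counts would still be approximate.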
In terms of actual implementation, inspiration could be taken from Homebrew, which has opt-out telemetry and is MIT licensed (so there's no need to worry about derived works).
A few thoughts on this:
- Things such as "number of Pkg.add() operations on a given package" should be tracked, but should also be tied to some kind of anonymized identifier, like sha256(submitting_ip), so that if someone decides to mess with us and submit 1e9 installations of something, we can at least separate that out from other things. It would also be potentially interesting to look at the pairings of packages; e.g. 90% of the people that have installed Plots.jl also have PyPlot.jl installed, or something.
- I'd like to know basic installation metadata; not just what version of Julia is running, but on what architecture/OS. Do we, unbeknownst to us, actually have thousands of tiny armv7l machines running Julia? We don't know, and we won't know until something like this exists.
- We kicked around a lot of ideas about the right user interface for this in the original issue. The basic consensus was that (a) this needs to degrade VERY gracefully when an internet connection is not available or the remote server collecting these stats is broken, and (b) the whole thing should default to not collecting data, but if the session is somehow "interactive" (such as having a TTY), the first Pkg operation for that install would prompt the user, asking if they want to opt in. Whatever answer they gave would be saved to a file, and the user could run Pkg.phone_home(true|false) to change it later if they wished.
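To make the anonymized identifier and the opt-in flow from the points above a bit more concrete, here is a minimal sketch, again in Python rather than Julia and purely for illustration. All names (anonymized_id, telemetry_enabled, set_telemetry, the JSON file layout) are hypothetical; set_telemetry only plays the role that Pkg.phone_home(true|false) would have. The optional salt is an extra assumption: the IPv4 address space is small enough that an unsalted hash could be reversed by brute force.

```python
import hashlib
import json
import os
import sys


def anonymized_id(submitting_ip, salt=""):
    """SHA-256 hex digest of the submitting IP (optionally salted)."""
    return hashlib.sha256((salt + submitting_ip).encode("utf-8")).hexdigest()


def set_telemetry(config_path, enabled):
    """Persist the user's choice so it can be changed later."""
    with open(config_path, "w") as f:
        json.dump({"telemetry": bool(enabled)}, f)


def telemetry_enabled(config_path, interactive=None, ask=input):
    """Decide whether to report, defaulting to no data collection."""
    # A saved answer always wins; a missing or corrupt file counts as "no answer".
    try:
        with open(config_path) as f:
            saved = json.load(f).get("telemetry")
        if isinstance(saved, bool):
            return saved
    except (OSError, ValueError, AttributeError):
        pass

    if interactive is None:
        interactive = sys.stdin.isatty()
    if not interactive:
        # Never prompt in scripts or batch jobs; the default is off.
        return False

    # First interactive use: ask once and remember the answer.
    answer = ask("Share anonymous usage statistics? [y/N] ").strip().lower()
    enabled = answer in ("y", "yes")
    set_telemetry(config_path, enabled)
    return enabled
```

Non-interactive sessions never write the file, so a later interactive session would still get the prompt.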
Note that if Pkg3 downloads tarballs rather than git clones we might also be able to get download counts via GitHub releases.
@simonbyrne You can already get git clone counts from GitHub as it is, but only for repos you own. E.g. I can see them for NamedColors.jl at: https://github.com/JuliaGraphics/NamedColors.jl/graphs/traffic
http://stackoverflow.com/a/38217108/179081
However, these are only counters, not information about the OS, Julia version, etc. I assume the same would be true for releases.
Very happy to see this discussion.
I just want to chime in to say the following: this kind of telemetry could be very useful for developers who are considering making breaking changes. It would also be a good indication of whether you need deprecations or even an announcement.